Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation

Abstract

Reverse Chain-of-Thought Generation (RCG) synthesizes reasoning traces from query-answer pairs, but it risks producing post-hoc rationalizations: when models can see the answer during generation, the answer serves as a cognitive anchor that shapes the entire explanation. We formalize this phenomenon through a three-level measurement hierarchy: lexical, entropic, and probabilistic anchoring, which capture surface artifacts, entropy dynamics, and latent answer dependence, respectively. We analyze semantic suppression, the intuitive mitigation strategy that instructs models to ignore the answer, and find that it is counterproductive: while it reduces lexical overlap, it paradoxically increases entropic and probabilistic anchoring. We attribute this failure to active monitoring of the forbidden answer, which inadvertently deepens dependence on it. To break this cycle, we propose Structural Skeleton-guided Reasoning (SSR), whose core contribution is to replace answer suppression with structural decoupling: SSR first generates a response-abstracted functional skeleton designed to limit direct answer encoding and then uses it as a structural target for full trace generation. Experiments across open-ended reasoning benchmarks show that SSR consistently mitigates anchoring, and that Distilled SSR (SSR-D), a distillation variant that internalizes skeleton-guided reasoning from teacher-generated traces, achieves up to 10% improvement over suppression baselines while mitigating out-of-distribution (OOD) degradation.
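
To make the lexical level of the hierarchy concrete, the following is a minimal sketch of one way surface overlap between a reasoning trace and the answer could be scored. The abstract does not define the metric, so the function names `tokens` and `lexical_overlap`, the tokenization rule, and the choice of "fraction of answer tokens found in the trace" are all illustrative assumptions rather than the paper's definition.

```python
import re


def tokens(text: str) -> list[str]:
    # Assumed tokenization: lowercase runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_overlap(trace: str, answer: str) -> float:
    """Fraction of distinct answer tokens that also appear in the trace.

    Hypothetical proxy for lexical anchoring, not the paper's metric.
    """
    answer_tokens = set(tokens(answer))
    if not answer_tokens:
        # An empty answer has nothing to leak into the trace.
        return 0.0
    trace_tokens = set(tokens(trace))
    return len(answer_tokens & trace_tokens) / len(answer_tokens)
```

A score like this captures only surface artifacts. As the abstract notes, suppression can lower lexical overlap while the entropic and probabilistic measures of anchoring get worse, so a low value here would not by itself indicate an unanchored trace.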
