Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering

Abstract

Large language models frequently commit unrecoverable reasoning errors mid-generation: once a wrong step is taken, subsequent tokens compound the mistake rather than correct it. We introduce Latent Phase-Shift Rollback (LPSR): at each generation step, we monitor the residual stream at a critical layer lcrit, detect abrupt directional reversals (phase shifts) via a cosine-similarity + entropy dual gate, and respond by rolling back the KV-cache and injecting a pre-computed steering vector. No fine-tuning, gradient computation, or additional forward passes are required. LPSR achieves 44.0\% on MATH-500 with an 8B model versus 28.8\% for standard AR (+15.2 pp; McNemar 2 = 66.96, p < 10-15). Critically, prompted self-correction, the most natural inference-time baseline, scores only 19.8\%, below standard AR; LPSR exceeds it by +24.2 pp (2 = 89.4, p ≈ 0). LPSR also outperforms Best-of-16 (+7.8 pp) at 5.4× lower token cost, and surpasses a standard 70B model (35.2\%) with 8.75× fewer parameters at 3× the token budget. A 32-layer sweep reveals a novel detection-correction dissociation: error-detection AUC peaks at layer~14 (0.718) but task accuracy peaks at layer~16 (44.0\% vs.\ 29.2\%), demonstrating that optimal monitoring depth differs for detection and correction.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…