Overcoming State Inertia: Minimally Invasive Temporal Alignment for Evolving Contexts

Abstract

Long-context dialogue systems suffer from state inertia, where models over-attend to history and fail to adapt to evolving intents. We demonstrate that standard alignment methods like DPO and even recent long-context optimization techniques struggle to resolve this without incurring a severe contextual alignment tax--a substantial perplexity surge caused by disrupting pre-trained priors. To address this, we propose DZ-TiDPO, a minimally invasive framework that synergizes conflict-aware optimization (during training) with a structural temporal attention bias. This design effectively decouples state updating from general linguistic modeling. Experiments on Multi-Session Chat and our new Inertia Challenge (IC-Bench) show DZ-TiDPO preserves structural coherence while resolving inter-turn conflicts. Crucially, our framework supports dual inference strategies: a negligible-latency static mode for general robustness and a precision-focused dynamic mode for micro-semantic conflicts. Furthermore, our scaling analysis reveals a capacity-stability trade-off, confirming that highly capable mid-sized models (7B) can efficiently internalize temporal alignment. Code and data are available at: https://github.com/lyj20071013/DZ-TiDPO.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…