BackPlay: Head-Only Look-Back Self-Correction for Diffusion Language Models

Abstract

Diffusion Language Models (DLMs) decode multiple tokens in parallel, but aggressive multi-token decoding amplifies cross-token dependency errors and can sharply degrade generation quality. We propose BackPlay, a frozen-backbone self-correction framework that trains only a lightweight correction head on a finetuned DLM without updating any backbone or adapter parameters. Because the head is trained on errors produced by the same frozen generator used at inference time, its training distribution aligns with the error patterns of the deployed model. We further introduce Look-back Correction, a training mechanism that injects predictions from earlier, more corrupted denoising states into later, richer contexts, enabling the head to leverage later context to detect mistakes made in earlier generation steps. During inference, BackPlay periodically revisits previously generated tokens through selective remasking and regeneration to limit error accumulation. Across mathematical reasoning and code generation benchmarks, BackPlay improves the speed--quality trade-off of the underlying DLM under multi-token decoding.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…