Again-Pose: Anchor-Guided Adaptive Inter-Frame Motion Cues Propagating for High-quality Human Pose Reconstruction
Abstract
Reconstructing continuous 3D human poses from unconstrained videos is challenging, especially in extreme motion scenarios involving severe motion blur and occlusion. Current state-of-the-art methods typically rely on implicit temporal attention to aggregate features across frames. Under severe visual degradation, however, input features often collapse and become indistinguishable from noise. In such cases, implicit aggregation cannot separate valid signals from noise, leading to catastrophic reconstruction errors. To address this robustness gap, we propose a simple yet effective framework, Anchor-guided adaptive inter-frame motion cues propagating (Again-Pose), which reformulates pose estimation in degraded frames as a motion-guided recovery task. Instead of blindly smoothing features, we explicitly identify high-quality Anchor Frames based on feature saliency and propagate reliable kinematic cues from them to "inpaint" the poses of degraded intermediate frames. Specifically, a Dual-path Motion-aware Module captures fine-grained inter-frame dynamics, while a Difference-weighted Fusion Module adaptively propagates these cues to suppress drift. Extensive experiments on standard benchmarks (Human3.6M, 3DPW, PoseTrack) and the challenging FineDiving dataset demonstrate that Again-Pose significantly outperforms state-of-the-art methods in robustness and stability, recovering plausible poses where other methods fail.
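The anchor-then-propagate idea can be illustrated with a toy sketch. This is not the paper's method: the abstract describes learned modules (a Dual-path Motion-aware Module and a Difference-weighted Fusion Module), whereas the sketch below assumes a hand-made saliency score (mean absolute activation) and substitutes plain linear interpolation between anchors for the learned propagation. All function names, the threshold, and the data layout are hypothetical.

```python
from typing import List, Sequence

Pose = List[float]  # flattened joint coordinates for one frame (assumed layout)


def saliency(feature: Sequence[float]) -> float:
    """Hypothetical saliency score: mean absolute activation of a frame feature."""
    return sum(abs(v) for v in feature) / len(feature)


def select_anchors(features: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames whose saliency score reaches the threshold."""
    return [i for i, f in enumerate(features) if saliency(f) >= threshold]


def propagate_from_anchors(poses: Sequence[Pose], anchors: Sequence[int]) -> List[Pose]:
    """Overwrite non-anchor poses using their neighbouring anchor poses.

    Stand-in for learned propagation: frames between two anchors are linearly
    interpolated; frames outside the anchor range copy the nearest anchor.
    """
    if not anchors:
        raise ValueError("at least one anchor frame is required")
    ordered = sorted(anchors)
    anchor_set = set(ordered)
    out = [list(p) for p in poses]
    for t in range(len(poses)):
        if t in anchor_set:
            continue  # anchor poses are trusted and kept unchanged
        prev = max((a for a in ordered if a < t), default=None)
        nxt = min((a for a in ordered if a > t), default=None)
        if prev is None:
            out[t] = list(poses[nxt])
        elif nxt is None:
            out[t] = list(poses[prev])
        else:
            w = (t - prev) / (nxt - prev)  # relative position between the anchors
            out[t] = [(1 - w) * a + w * b for a, b in zip(poses[prev], poses[nxt])]
    return out
```

In this sketch, a frame whose features have collapsed toward zero falls below the threshold, so its own (unreliable) pose estimate is discarded and rebuilt from the surrounding anchors rather than averaged in with them.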