Dynamic Weight-based Temporal Aggregation for Low-light Video Enhancement Under Extreme Noise

Abstract

Low-light video enhancement (LLVE) is challenging due to noise, low contrast, and color degradation. While learning-based methods enable fast inference, they often fail under heavy real-world noise because they do not sufficiently exploit long-term temporal cues. We propose DWTA-Net, a novel recurrent deep-learning framework for LLVE. DWTA-Net adopts an integrated two-stage architecture. Stage I restores local structure and color through Mamba-based enhancement, using multi-frame alignment for temporal consistency. Stage II performs recurrent refinement using a novel dynamic weight-based temporal aggregation guided by optical flow, functioning as a recurrent denoiser that adapts to motion. We further introduce a texture-adaptive loss that preserves fine details in textured regions while suppressing noise in homogeneous areas. Experiments on real-world low-light footage show that DWTA-Net achieves stronger noise suppression and fewer artifacts than state-of-the-art methods, delivering superior visual quality.
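To make the idea of a recurrent denoiser that adapts to motion concrete, the following is a minimal sketch of per-pixel temporal aggregation with a dynamic weight. It is not DWTA-Net's implementation: the abstract does not specify how the weights are computed, so the exponential weighting rule, the function names `aggregate` and `denoise_sequence`, and the parameters `sigma` and `max_weight` are all assumptions chosen for illustration. The optical-flow warp is also replaced by an identity placeholder, which is only reasonable for a static scene.

```python
import math


def aggregate(current, warped_prev, sigma=0.1, max_weight=0.9):
    """Blend the current frame with the (already warped) previous output.

    Assumption: the weight on history decays exponentially with the
    per-pixel disagreement. This is an illustrative rule, not the paper's.
    """
    fused = []
    for cur, prev in zip(current, warped_prev):
        # A large disagreement suggests motion or a flow error,
        # so the history gets less weight at that pixel.
        weight = max_weight * math.exp(-abs(cur - prev) / sigma)
        fused.append(weight * prev + (1.0 - weight) * cur)
    return fused


def denoise_sequence(frames, sigma=0.1, max_weight=0.9):
    """Run the recurrence over frames, each a flat list of intensities."""
    state = list(frames[0])
    outputs = [state]
    for frame in frames[1:]:
        # Identity warp (assumption): a real pipeline would first warp
        # `state` toward the current frame using optical flow.
        state = aggregate(frame, state, sigma, max_weight)
        outputs.append(state)
    return outputs
```

Under this rule, pixels that agree with the history are averaged over time, which reduces noise, while pixels that disagree strongly fall back to the current frame, which limits ghosting. A smaller `sigma` makes the fallback more aggressive.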
