Policy-DRIFT: Dynamic Reward-Informed Flow Trajectory Steering
Abstract
Skin-friction drag induced by wall-bounded turbulent flows accounts for a substantial fraction of energy consumption across commercial aerospace, wind energy, and marine transport, and its active reduction is one of the highest-value targets in engineering fluid dynamics. Deep reinforcement learning (DRL) has emerged as the leading approach for real-time flow control, yet its performance ceiling is set not by algorithmic capability but by reward structure: the naive scalar objective does not optimally reflect the underlying physics. Policy-DRIFT bypasses this ceiling by relocating reward information from policy gradients to generative model inference. A conditional flow matching (CFM) model constructs a physically grounded manifold of realisable flow states spanning multiple control regimes; Terminal Reward Guidance (TRG) steers samples toward reward-maximising targets at inference; and a lightweight DRL policy, structurally decoupled from reward quality, tracks these full-field targets via root-mean-squared error (RMSE) minimisation. The test case is turbulent channel flow simulated using direct numerical simulation (DNS) at a friction Reynolds number of Re_τ = 180, the canonical benchmark for wall-bounded turbulence. Policy-DRIFT achieves 49% drag reduction, approaching the theoretical upper bound and ≈ 16% higher than the DRL benchmark, while consuming 37× less actuation energy. Our approach combines generative methods with active flow control, marking a paradigm shift towards controlling complex physical systems efficiently.
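The tracking objective mentioned above, in which the policy is rewarded for matching a full-field target rather than for a scalar drag signal, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names `rmse` and `tracking_reward` are hypothetical, the fields are represented as plain nested lists, and taking the reward as the negative RMSE is one plausible choice rather than the authors' stated formulation.

```python
import math


def rmse(field, target):
    """Root-mean-squared error between two equally shaped 2D fields.

    Both arguments are lists of rows (hypothetical stand-ins for a
    discretised flow field and a generated target field).
    """
    same_rows = len(field) == len(target)
    if not same_rows or any(len(a) != len(b) for a, b in zip(field, target)):
        raise ValueError("field and target must have the same shape")
    n = sum(len(row) for row in field)
    if n == 0:
        raise ValueError("fields must not be empty")
    squared_error = sum(
        (a - b) ** 2
        for field_row, target_row in zip(field, target)
        for a, b in zip(field_row, target_row)
    )
    return math.sqrt(squared_error / n)


def tracking_reward(field, target):
    """Assumed reward: negative RMSE, so a closer match gives a higher reward."""
    return -rmse(field, target)


if __name__ == "__main__":
    current = [[0.0, 0.0], [0.0, 0.0]]
    target = [[1.0, 1.0], [1.0, 1.0]]
    print(tracking_reward(current, target))
```

Under this assumed form, the reward depends only on the distance to the target field, which is one way to read the claim that the policy is decoupled from reward quality: the choice of which target to track is made upstream, at generative inference.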