Turbulence generation and data assimilation in wall-bounded flows with a latent diffusion model
Abstract
Wall-bounded turbulent flows are chaotic and multiscale, rendering real-time prediction at high Reynolds numbers computationally prohibitive in applications such as wind farms. Classical data assimilation methods are based on repeated solution of the governing equations and thus inherit this cost. Generative models instead learn the probability distribution of flow states, enabling scalable probabilistic reconstruction. Using plane Couette flow, we develop a generative framework that couples a β-VAE with a transformer-based diffusion model to generate four-dimensional spatiotemporal samples. Bayesian conditioning enables data assimilation without retraining and allows statistical constraints to be imposed through sampling. The framework is applied to a subdomain of turbulent plane Couette flow at Re_h = 1300, where the DNS resolution in this region requires O(10^6) spatial degrees of freedom. The diffusion model reproduces two-point correlations, energy spectra, and single-point statistics up to fourth order using O(10) latent spatial degrees of freedom, yielding a compression ratio of O(10^5), one to two orders of magnitude above prior reports. Two assimilation scenarios demonstrate that, when observations are statistically consistent with the prior, conditional diffusion models with the proposed sampling strategy preserve complex turbulent statistics in the posterior. However, enforcing these constraints while preserving physical fidelity and sample diversity introduces an inherent trade-off. Excessive conditioning can distort the learned diffusion prior, paralleling limitations of classical ensemble-based data assimilation. These results highlight both the promise of diffusion models as probabilistic surrogates for turbulent wall-bounded flows and the challenges of conditioning such models, establishing a foundation for real-time reconstruction from operational data.
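The claim that Bayesian conditioning needs no retraining can be illustrated with the standard score decomposition used in conditional diffusion sampling. The abstract does not state the exact formulation, so this is a sketch under assumptions: the symbols z_t (noisy latent state at diffusion time t), y (observations), and p_t (noised density) are introduced here for illustration only.

```latex
% Bayes' rule at diffusion time t: p_t(z_t | y) \propto p_t(z_t) \, p_t(y | z_t).
% Taking the gradient of the logarithm with respect to z_t removes the
% normalising constant p(y), which does not depend on z_t:
\nabla_{z_t} \log p_t(z_t \mid y)
  = \underbrace{\nabla_{z_t} \log p_t(z_t)}_{\text{prior score (unconditional model)}}
  + \underbrace{\nabla_{z_t} \log p_t(y \mid z_t)}_{\text{likelihood score (observation model)}}
```

In this form the first term is supplied by the diffusion model trained once on unconditional data, and only the second term changes with the observations, so new measurements alter the sampler rather than the network weights. The same form also suggests the trade-off noted in the abstract: if the likelihood term dominates, samples can be pulled away from the learned prior.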