Panel Flow Matching: A Generative Approach to Learning Distributions of Longitudinal Data
Abstract
Learning distributions of longitudinal data is central to tasks such as visualization, completion, classification, and synthetic data generation, but it remains statistically challenging because longitudinal observations are often irregular, sparse, and collected from only a limited number of subjects. To address this, we develop a novel generative framework, termed panel flow matching (PFM), for learning longitudinal distributions by pooling information across time via a continuous panel flow model. PFM combines a forward flow-matching step with a backward kernel-fitting step, yielding a flexible and data-adaptive approach for capturing complex distributional structures. We apply PFM to estimate panel densities, namely the cross-sectional densities of longitudinal data, and establish statistical guarantees under irregular and sparse sampling designs. Within this framework, PFM naturally supports tasks including longitudinal completion, synthetic data generation, and classification, without requiring a preliminary dimension-reduction step to handle data irregularity. Extensive simulations demonstrate that PFM outperforms existing methods across these tasks. We further apply PFM to a longitudinal vaginal microbiome dataset from 188 pregnancies labeled as term or preterm, where it improves classification accuracy and reveals time-varying distributional differences between the two groups.