When Marginals Match but Structure Fails: Covariance Fidelity in Generative Models

Abstract

Generative models are increasingly deployed as substitutes for real data in downstream scientific workflows, yet standard evaluation criteria remain focused on marginal distribution matching. We argue that this represents a fundamental gap: downstream inference is rarely a marginal operation, and a model that passes every univariate diagnostic can still produce structurally unreliable synthetic data. We introduce covariance-level dependence fidelity, measured by DSigma(P,Q) = ||SigmaP - SigmaQ||F, as a principled, computable criterion for evaluating whether a generative model preserves the joint structure of data beyond its univariate marginals. Three results formalise this criterion. First, marginal fidelity provides no constraint on dependence structure: DSigma can be made arbitrarily large while all univariate marginals match exactly. Second, covariance divergence induces quantifiable downstream instability, including sign reversals in population regression coefficients. Third, bounding DSigma provides positive stability guarantees for dependence-sensitive procedures such as PCA via Davis-Kahan-type bounds. Empirical validation across three domains, image data (Fashion-MNIST VAE, n = 60,000), bulk RNA-seq (TCGA-BRCA, n = 1,111), and a small-sample stress test (Alzheimer's gene expression, n = 113), shows that DSigma/delta consistently distinguishes structure-discarding from structure-preserving generators in cases where standard marginal diagnostics show little separation, confirming that covariance-level fidelity provides information orthogonal to existing evaluation metrics across domains and sample sizes.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…