Breaking the Curse of Dimensionality: Diffusion Models Efficiently Learn Low-Dimensional Distributions
Abstract
Despite their empirical success across a wide range of generative tasks, the fundamental principles underlying the ability of diffusion models to learn data distributions are poorly understood. In this work, we develop a new mathematical framework that explains how diffusion models can effectively learn low-dimensional distributions from a finite number of training samples without suffering from the curse of dimensionality. Specifically, motivated by the intrinsic low-dimensional structure of image data, we theoretically analyze a setting in which the data distribution is modeled as a mixture of low-rank Gaussians. Under suitable network parameterization, we show that optimizing the training objective of diffusion models is equivalent to solving the canonical subspace clustering problem over the training samples, where each subspace basis corresponds to the low-rank covariance of a Gaussian component. This equivalence allows us to show that the sample complexity for learning the underlying distribution scales linearly with the intrinsic dimension of the data, rather than exponentially with the ambient dimension. Our theoretical findings are further supported by empirical evidence that demonstrates phase transition phenomena in generalization on both synthetic and real-world image datasets. Moreover, we establish a correspondence between the learned subspace bases and semantic attributes of image data, providing a principled foundation for controllable image generation.