High-dimensional ridge regression with random features for non-identically distributed data with a variance profile

Abstract

Random feature ridge regression is often analyzed in the high-dimensional regime under the homogeneous sampling model $x_i = \Sigma^{1/2} x_i'$, where the vectors $x_i'$ have i.i.d. entries and the same covariance matrix $\Sigma$ is shared by all samples. In this paper, we move beyond this setting and study non-identically distributed data through a variance-profile model in which the training and test covariates have row-dependent diagonal covariance matrices of the form $\Sigma_i = \operatorname{diag}(\gamma_{i1}^2, \dots, \gamma_{ip}^2)$, with a separate variance profile for the training set and for the test set. Our main contribution is the derivation of asymptotic equivalents for the training and test risks of ridge regression with random features when n, p, and m grow proportionally. The first set of equivalents is obtained by combining the linear-plus-chaos approximation with traffic-probability arguments, whereas the second set is deterministic and follows from operator-valued free probability through an amalgamation-over-the-diagonal argument. These equivalents are sharp in numerical experiments. They also reveal how heterogeneous variance profiles, including mixture-type profiles inspired by MNIST, can modify generalization and exhibit double-descent behavior when the ridge parameter is small.
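
As an illustration of the setting described in the abstract, the following minimal sketch simulates covariates with a variance profile and fits ridge regression on random features, returning empirical training and test risks. It is not the paper's experimental setup: the Gaussian entries, the two-class mixture profile, the tanh activation, the linear teacher with additive noise, the scalings, and every function and parameter name are assumptions made purely for illustration.

```python
import numpy as np


def sample_variance_profile(gamma, rng):
    """Draw X with independent centered entries and Var(X[i, j]) = gamma[i, j]**2.

    Assumption: Gaussian entries; the abstract only specifies the variance profile.
    """
    return gamma * rng.standard_normal(gamma.shape)


def random_feature_ridge_risks(n=300, n_test=300, p=200, m=400, lam=1e-1, seed=0):
    """Empirical train/test mean squared error of random-feature ridge regression."""
    rng = np.random.default_rng(seed)

    # Hypothetical mixture-type profile: each row picks one of two variance vectors,
    # so row i has covariance diag(gamma[i, 0]**2, ..., gamma[i, p-1]**2).
    profiles = np.stack([np.linspace(0.5, 1.5, p), np.linspace(1.5, 0.5, p)])
    gamma_train = profiles[rng.integers(0, 2, size=n)]
    gamma_test = profiles[rng.integers(0, 2, size=n_test)]
    X = sample_variance_profile(gamma_train, rng)
    X_test = sample_variance_profile(gamma_test, rng)

    # Assumed data-generating model: linear teacher plus small Gaussian noise.
    beta = rng.standard_normal(p) / np.sqrt(p)
    y = X @ beta + 0.1 * rng.standard_normal(n)
    y_test = X_test @ beta + 0.1 * rng.standard_normal(n_test)

    # Random features: fixed random weights W followed by an assumed tanh activation.
    W = rng.standard_normal((p, m)) / np.sqrt(p)
    F = np.tanh(X @ W)
    F_test = np.tanh(X_test @ W)

    # Ridge estimator: theta = (F^T F / n + lam * I)^{-1} F^T y / n.
    theta = np.linalg.solve(F.T @ F / n + lam * np.eye(m), F.T @ y / n)

    train_risk = np.mean((F @ theta - y) ** 2)
    test_risk = np.mean((F_test @ theta - y_test) ** 2)
    return train_risk, test_risk
```

Calling `random_feature_ridge_risks` over a grid of `m` values with a small `lam` is one way to explore numerically how the test risk varies with the number of random features under a chosen profile; the shape of the resulting curve depends on the assumptions listed above.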
