Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling

Abstract

We compute the asymptotic eigenvalue distribution of the neural tangent kernel of a two-layer neural network under a specific scaling of dimension. Namely, if $X\in\mathbb{R}^{n\times d}$ is an i.i.d. random matrix, $W\in\mathbb{R}^{d\times p}$ is an i.i.d. $\mathcal{N}(0,1)$ matrix and $D\in\mathbb{R}^{p\times p}$ is a diagonal matrix with i.i.d. bounded entries, we consider the matrix \[ \mathrm{NTK} = \frac{1}{d}XX^\top \odot \frac{1}{p}\,\sigma'\!\left(\frac{1}{\sqrt{d}}XW\right) D^2\, \sigma'\!\left(\frac{1}{\sqrt{d}}XW\right)^\top \] where $\sigma'$ is a pseudo-Lipschitz function applied entrywise, under the scaling $\frac{n}{dp}\to\gamma_1$ and $\frac{p}{d}\to\gamma_2$. We describe the asymptotic distribution as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.
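
As a rough numerical illustration, the matrix above can be sampled at small size and its empirical eigenvalues inspected. The sketch below is not taken from the paper. It assumes the kernel is the entrywise (Hadamard) product of the Gram matrix $XX^\top/d$ with the feature matrix built from $\sigma'(XW/\sqrt{d})$, and it makes several illustrative choices: standard Gaussian entries for $X$, $\sigma=\tanh$ (so $\sigma' = 1-\tanh^2$), diagonal entries of $D$ uniform on $[0.5, 1.5]$, and the sizes $d=p=20$, $n=dp/2$ (so $\gamma_1 = 0.5$, $\gamma_2 = 1$). The function name `sample_ntk` is hypothetical.

```python
import numpy as np


def sample_ntk(n, d, p, rng):
    """Draw one sample of the n x n kernel matrix (illustrative assumptions only)."""
    # Assumption: X has i.i.d. standard Gaussian entries.
    X = rng.standard_normal((n, d))
    # W has i.i.d. N(0, 1) entries, as in the abstract.
    W = rng.standard_normal((d, p))
    # Assumption: diagonal of D is uniform on [0.5, 1.5] (i.i.d. and bounded).
    d_squared = rng.uniform(0.5, 1.5, size=p) ** 2
    # Assumption: sigma = tanh, so sigma'(x) = 1 - tanh(x)^2, applied entrywise.
    S = 1.0 - np.tanh(X @ W / np.sqrt(d)) ** 2
    gram = X @ X.T / d                   # (1/d) X X^T, shape (n, n)
    feat = (S * d_squared) @ S.T / p     # (1/p) S D^2 S^T, shape (n, n)
    return gram * feat                   # entrywise (Hadamard) product


rng = np.random.default_rng(0)
d, p = 20, 20
n = d * p // 2                           # n / (d p) = 0.5, p / d = 1
ntk = sample_ntk(n, d, p, rng)
eigs = np.linalg.eigvalsh(ntk)           # real eigenvalues in ascending order
```

Both factors are positive semidefinite, so by the Schur product theorem their Hadamard product is too, and `eigs` should be nonnegative up to rounding. A histogram of `eigs` at larger `d` and `p` (with `n` scaled as `d * p`) gives an empirical spectrum to compare against a limiting law; at sizes this small, finite-size effects are likely to be visible.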
