Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling

Elliot Paquette

Abstract

We compute the asymptotic eigenvalue distribution of the neural tangent kernel of a two-layer neural network under a specific scaling of dimension. Namely, if $X\in\mathbb{R}^{n\times d}$ is an i.i.d. random matrix, $W\in\mathbb{R}^{d\times p}$ is an i.i.d. $\mathcal{N}(0,1)$ matrix and $D\in\mathbb{R}^{p\times p}$ is a diagonal matrix with i.i.d. bounded entries, we consider the matrix \[ \mathrm{NTK} = \frac{1}{d}XX^\top \odot \frac{1}{p}\,\sigma'\!\left( \frac{1}{\sqrt{d}}XW \right) D^2\, \sigma'\!\left( \frac{1}{\sqrt{d}}XW \right)^{\top} \] where $\sigma'$ is a pseudo-Lipschitz function applied entrywise, under the scaling $\frac{n}{dp}\to\gamma_1$ and $\frac{p}{d}\to\gamma_2$. We describe the asymptotic distribution as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.
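
As a rough numerical illustration, the matrix in the abstract can be sampled at finite size and its eigenvalues inspected. The sketch below reads the kernel as the entrywise (Hadamard) product of the Gram matrix $XX^\top/d$ with the feature matrix built from $\sigma'(XW/\sqrt{d})$ and $D^2$. Everything beyond the shapes is an assumption made for illustration: the helper name `ntk_matrix`, Gaussian entries for $X$, uniform entries on $[-1, 1]$ for the diagonal of $D$, and $\sigma = \tanh$ (so $\sigma' = 1 - \tanh^2$). It is not the authors' code and proves nothing about the limit.

```python
import numpy as np

def ntk_matrix(n, d, p, sigma_prime=None, seed=0):
    """Sample one finite-size instance of the n x n kernel (illustrative only)."""
    if sigma_prime is None:
        # Assumption: sigma = tanh, so sigma'(z) = 1 - tanh(z)^2.
        sigma_prime = lambda z: 1.0 - np.tanh(z) ** 2
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))             # assumption: Gaussian i.i.d. data
    W = rng.standard_normal((d, p))             # i.i.d. N(0, 1) weights
    D_diag = rng.uniform(-1.0, 1.0, size=p)     # assumption: bounded i.i.d. diagonal of D
    S = sigma_prime(X @ W / np.sqrt(d))         # n x p, sigma' applied entrywise
    gram = X @ X.T / d                          # (1/d) X X^T
    feat = (S * D_diag ** 2) @ S.T / p          # (1/p) S D^2 S^T
    return gram * feat                          # entrywise (Hadamard) product

# Dimensions chosen so that n/(d*p) = 1 and p/d = 2.
K = ntk_matrix(n=200, d=10, p=20)
eigs = np.linalg.eigvalsh(K)                    # real eigenvalues, ascending order
```

A histogram of `eigs`, with $n$, $d$ and $p$ increased along fixed ratios $n/(dp)$ and $p/d$, gives an empirical spectrum to compare against the limiting distribution described above. Both factors are positive semidefinite, so by the Schur product theorem the sampled kernel is too, and its eigenvalues should be nonnegative up to rounding error.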
