Some Theoretical Results on Layerwise Effective Dimension Oscillations in Finite Width ReLU Networks

Darshan Makwana

Abstract

We analyze the layerwise effective dimension (rank of the feature matrix) in fully-connected ReLU networks of finite width. Specifically, for a fixed batch of $m$ inputs and random Gaussian weights, we derive closed-form expressions for the expected rank of the $m \times n$ hidden activation matrices. Our main result shows that $\mathbb{E}[\mathrm{EDim}(\ell)] = m\,[1 - (1 - 2/\pi)^{\ell}] + O(e^{-cm})$, so that the rank deficit decays geometrically with ratio $\rho = 1 - 2/\pi \approx 0.3634$. We also prove a sub-Gaussian concentration bound, and identify the "revival" depths at which the expected rank attains local maxima. In particular, these peaks occur at depths $\ell_k^* \approx (k + 1/2)\pi / \log(1/\rho)$ with height $\approx (1 - e^{-\pi/2})\,m \approx 0.79m$. We further show that this oscillatory rank behavior is a finite-width phenomenon: under orthogonal weight initialization or strong negative-slope leaky-ReLU, the rank remains (nearly) full. These results provide a precise characterization of how random ReLU layers alternately collapse and partially revive the subspace of input variations, adding nuance to prior work on expressivity of deep networks.
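
As a rough numerical illustration of the leading-order claim, the sketch below evaluates m(1 - r^depth) with r = 1 - 2/π and checks that the rank deficit shrinks by the factor r from one layer to the next. It ignores the exponentially small correction term, does not simulate any network, and the function names and the batch size m = 100 are hypothetical choices made only for this example.

```python
import math

# Geometric ratio quoted in the abstract: 1 - 2/pi, about 0.3634.
RATIO = 1.0 - 2.0 / math.pi


def expected_edim_leading(m: int, depth: int) -> float:
    """Leading term m * (1 - RATIO**depth); the O(e^{-cm}) correction is dropped."""
    return m * (1.0 - RATIO ** depth)


def rank_deficit(m: int, depth: int) -> float:
    """Gap between full rank m and the leading-order expected rank."""
    return m - expected_edim_leading(m, depth)


# Fraction of m quoted as the peak height in the abstract: 1 - e^{-pi/2}.
peak_height_fraction = 1.0 - math.exp(-math.pi / 2.0)

m = 100  # hypothetical batch size, chosen only for illustration
for depth in range(1, 6):
    edim = expected_edim_leading(m, depth)
    deficit = rank_deficit(m, depth)
    print(f"depth={depth}  leading-order EDim={edim:.3f}  deficit={deficit:.4f}")
print(f"peak height fraction = {peak_height_fraction:.4f}")
```

Under this leading-order reading, the ratio of successive deficits equals r exactly, which is what "decays geometrically" means here; any oscillation or revival would have to come from terms this sketch leaves out.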
