Near-optimal estimates for the p-Lipschitz constants of deep random ReLU neural networks

Abstract

This paper studies the p-Lipschitz constants of ReLU neural networks mapping ℝ^d to ℝ with random parameters for p ∈ [1,∞]. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high-probability upper and lower bounds for wide networks that differ at most by a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the p-Lipschitz constant varies significantly between the regimes p ∈ [1,2) and p ∈ [2,∞]. For p ∈ [2,∞], the p-Lipschitz constant behaves similarly to ‖g‖_{p'}, where g ∈ ℝ^d is a d-dimensional standard Gaussian vector and 1/p + 1/p' = 1. In contrast, for p ∈ [1,2), the p-Lipschitz constant aligns more closely with ‖g‖_2.
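The quantities in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it samples a shallow ReLU network and computes an empirical lower bound on its p-Lipschitz constant, using the fact that for a piecewise linear function the p'-norm of the gradient at any point of differentiability is a lower bound on the Lipschitz constant with respect to the p-norm. The function names, the standard He scaling (variance 2 / fan_in) and the standard normal biases are assumptions made for illustration; the paper's variant of the He initialization may differ.

```python
import numpy as np


def dual_exponent(p):
    """Return p' with 1/p + 1/p' = 1 (p = 1 gives inf, p = inf gives 1)."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)


def sample_shallow_relu(d, width, rng):
    """Sample x -> v @ relu(W @ x + b) with illustrative random parameters.

    Assumption: standard He scaling (variance 2 / fan_in) and standard
    normal biases; this is not necessarily the variant used in the paper.
    """
    W = rng.standard_normal((width, d)) * np.sqrt(2.0 / d)
    b = rng.standard_normal(width)
    v = rng.standard_normal(width) * np.sqrt(2.0 / width)
    return W, b, v


def lipschitz_lower_bound(W, b, v, p, n_samples, rng):
    """Empirical lower bound: max over sampled x of the p'-norm of the gradient."""
    d = W.shape[1]
    X = rng.standard_normal((n_samples, d))
    # Activation pattern of each hidden neuron at each sample point.
    active = (X @ W.T + b) > 0
    # Gradient at x is W^T (v * 1[W x + b > 0]); one row per sample.
    grads = (active * v) @ W
    return np.linalg.norm(grads, ord=dual_exponent(p), axis=1).max()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, width = 20, 512
    W, b, v = sample_shallow_relu(d, width, rng)
    g = rng.standard_normal(d)
    for p in (1, 1.5, 2, 4, np.inf):
        q = dual_exponent(p)
        est = lipschitz_lower_bound(W, b, v, p, 2000, rng)
        print(f"p={p}: lower bound {est:.3f}, "
              f"||g||_p' {np.linalg.norm(g, ord=q):.3f}, "
              f"||g||_2 {np.linalg.norm(g):.3f}")
```

Random sampling only bounds the Lipschitz constant from below, so the printed values indicate the trend across p rather than the exact constants the paper bounds.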
