Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization

Abstract

In this paper, we extend the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output. We consider two popular regularization terms, the $\ell_1$ and $\ell_2$ norms of the parameter vector $w$, each added to the square loss function with coefficient $\lambda/2$. We prove that when $\lambda$ is small, the weight vector $w$ converges to the optimal solution (with respect to the new loss function) with probability $\geq (1-\varepsilon)(1-A_d)/2$ under random initialization in a sphere centered at the origin, where $\varepsilon$ is a small value and $A_d$ is a constant. Numerical experiments, including phase diagrams and repeated simulations, verify our theory.
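The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' experiment. It assumes a teacher-student setup with Gaussian inputs, labels generated by a fixed teacher vector, and plain (sub)gradient descent. The penalty forms (λ/2)·‖w‖² and (λ/2)·‖w‖₁, the names `run_gd`, `w_star`, and `w_init`, and all numeric settings are illustrative assumptions that the abstract does not state.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def regularized_loss(w, X, y, lam, reg):
    """Square loss of a single bias-free ReLU unit plus an assumed penalty."""
    residual = relu(X @ w) - y
    data_term = 0.5 * np.mean(residual ** 2)
    if reg == "l2":
        penalty = 0.5 * lam * np.dot(w, w)       # assumed form: (lam/2) * ||w||_2^2
    else:
        penalty = 0.5 * lam * np.sum(np.abs(w))  # assumed form: (lam/2) * ||w||_1
    return data_term + penalty

def gradient(w, X, y, lam, reg):
    """(Sub)gradient of regularized_loss with respect to w."""
    z = X @ w
    residual = relu(z) - y
    g = X.T @ (residual * (z > 0)) / len(y)      # ReLU passes gradient only where z > 0
    if reg == "l2":
        g = g + lam * w
    else:
        g = g + 0.5 * lam * np.sign(w)
    return g

def run_gd(reg="l2", lam=1e-3, dim=5, n_samples=5000, steps=1000, lr=0.2, seed=0):
    """Run gradient descent from a small initial point; return (w, w_star, losses)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, dim))    # assumed Gaussian inputs
    w_star = np.zeros(dim)
    w_star[0] = 1.0                              # hypothetical teacher vector
    y = relu(X @ w_star)                         # noiseless teacher labels

    # Small initial point near the origin with positive overlap with w_star.
    w_init = np.zeros(dim)
    w_init[0] = 0.1 / np.sqrt(2.0)
    w_init[1] = 0.1 / np.sqrt(2.0)

    w = w_init.copy()
    losses = [regularized_loss(w, X, y, lam, reg)]
    for _ in range(steps):
        w = w - lr * gradient(w, X, y, lam, reg)
        losses.append(regularized_loss(w, X, y, lam, reg))
    return w, w_star, losses

if __name__ == "__main__":
    for reg in ("l2", "l1"):
        w, w_star, losses = run_gd(reg=reg)
        print(reg, "distance to teacher:", np.linalg.norm(w - w_star))
```

In this sketch the initial point is chosen deliberately to overlap positively with the teacher vector, so it says nothing about the convergence probability under random initialization. To probe that, one would replace `w_init` with random draws from a small sphere and count how often the run ends near the teacher.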
