Neural Network Approximation: Three Hidden Layers Are Enough

Shijun Zhang

doi:10.1016/j.neunet.2021.04.011

Abstract

A three-hidden-layer neural network with super approximation power is introduced. This network is built with the floor function ($\lfloor x\rfloor$), the exponential function ($2^x$), the step function ($1_{x\geq 0}$), or their compositions as the activation function in each neuron; hence we call such networks Floor-Exponential-Step (FLES) networks. For any width hyper-parameter $N\in\mathbb{N}^+$, it is shown that FLES networks with width $\max\{d,N\}$ and three hidden layers can uniformly approximate a Hölder continuous function $f$ on $[0,1]^d$ with an exponential approximation rate $3\lambda(2\sqrt{d})^{\alpha}2^{-\alpha N}$, where $\alpha\in(0,1]$ and $\lambda>0$ are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $2\omega_f(2\sqrt{d})\,2^{-N}+\omega_f(2\sqrt{d}\,2^{-N})$. Moreover, we extend this result to general bounded continuous functions on a bounded set $E\subseteq\mathbb{R}^d$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $\omega_f(r)$ as $r\to 0$ is moderate (e.g., $\omega_f(r)\lesssim r^{\alpha}$ for Hölder continuous functions), since the dominant term in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ independent of $d$ within the modulus of continuity. Finally, we extend our analysis to derive similar approximation results in the $L^p$-norm for $p\in[1,\infty)$ by replacing the Floor-Exponential-Step activation functions with continuous activation functions.
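To make the rates in the abstract concrete, the following is a minimal Python sketch that evaluates the two stated bounds for a few values of $d$ and $N$. It does not construct an FLES network. The function names (`floor_act`, `exp2_act`, `step_act`, `general_rate`, `holder_rate`) and the sample values of $\lambda$, $\alpha$, $d$, and $N$ are illustrative assumptions, not taken from the paper.

```python
import math

# The three elementary activations named in the abstract (helper names are hypothetical).
def floor_act(x):
    return math.floor(x)            # floor function

def exp2_act(x):
    return 2.0 ** x                 # exponential function 2^x

def step_act(x):
    return 1.0 if x >= 0 else 0.0   # step function 1_{x >= 0}

def general_rate(omega, d, N):
    """Bound 2*omega(2*sqrt(d))*2^(-N) + omega(2*sqrt(d)*2^(-N)) for a modulus of continuity omega."""
    r = 2.0 * math.sqrt(d)
    return 2.0 * omega(r) * 2.0 ** (-N) + omega(r * 2.0 ** (-N))

def holder_rate(lam, alpha, d, N):
    """Bound 3*lam*(2*sqrt(d))^alpha * 2^(-alpha*N) for a Hoelder function of order alpha, constant lam."""
    return 3.0 * lam * (2.0 * math.sqrt(d)) ** alpha * 2.0 ** (-alpha * N)

if __name__ == "__main__":
    lam, alpha = 1.0, 0.5                   # assumed sample Hoelder constant and order
    omega = lambda r: lam * r ** alpha      # modulus of continuity of such a function
    for d in (1, 10, 100):
        for N in (5, 10, 20):
            g = general_rate(omega, d, N)
            h = holder_rate(lam, alpha, d, N)
            print(f"d={d:3d} N={N:2d} general={g:.3e} holder={h:.3e}")
```

Substituting $\omega_f(r)=\lambda r^{\alpha}$ into the general rate gives $2\lambda(2\sqrt{d})^{\alpha}2^{-N}+\lambda(2\sqrt{d})^{\alpha}2^{-\alpha N}$, which is at most the Hölder rate because $2^{-N}\le 2^{-\alpha N}$ for $\alpha\in(0,1]$. The printed values show both bounds decaying exponentially in $N$ while depending on $d$ only through $(2\sqrt{d})^{\alpha}$.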
