Deep Network with Approximation Error Being Reciprocal of Width to Power of Square Root of Depth
Abstract
A new network with super approximation power is introduced. This network is built with Floor ($\lfloor x\rfloor$) or ReLU ($\max\{0,x\}$) activation function in each neuron and hence we call such networks Floor-ReLU networks. For any hyper-parameters $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, it is shown that Floor-ReLU networks with width $\max\{d,\, 5N+13\}$ and depth $64dL+3$ can uniformly approximate a Hölder function $f$ on $[0,1]^d$ with an approximation error $3\lambda d^{\alpha/2}N^{-\alpha\sqrt{L}}$, where $\alpha\in(0,1]$ and $\lambda$ are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $\omega_f(\sqrt{d}\,N^{-\sqrt{L}})+2\omega_f(\sqrt{d})N^{-\sqrt{L}}$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $\omega_f(r)$ as $r\to 0$ is moderate (e.g., $\omega_f(r)\lesssim r^{\alpha}$ for Hölder continuous functions), since the major term to be considered in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ and $L$ independent of $d$ within the modulus of continuity.
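To get a feel for the numbers, the width, depth, and error bounds stated in the abstract can be evaluated directly. The following is a minimal Python sketch, not code from the paper: the function names (`floor_relu_size`, `holder_error_bound`, `general_error_bound`) are hypothetical, and the modulus of continuity is assumed to be supplied by the caller as an ordinary Python function.

```python
import math


def floor_relu_size(d, N, L):
    """Return (width, depth) as stated in the abstract:
    width = max{d, 5N + 13}, depth = 64dL + 3."""
    return max(d, 5 * N + 13), 64 * d * L + 3


def holder_error_bound(d, N, L, alpha=1.0, lam=1.0):
    """Hoelder-case bound: 3 * lam * d^(alpha/2) * N^(-alpha * sqrt(L))."""
    return 3.0 * lam * d ** (alpha / 2.0) * N ** (-alpha * math.sqrt(L))


def general_error_bound(omega, d, N, L):
    """General bound: omega(sqrt(d) * N^(-sqrt(L))) + 2 * omega(sqrt(d)) * N^(-sqrt(L)).

    `omega` is a caller-supplied modulus of continuity (an assumption of
    this sketch), e.g. omega(r) = lam * r**alpha for a Hoelder function.
    """
    rate = N ** (-math.sqrt(L))
    return omega(math.sqrt(d) * rate) + 2.0 * omega(math.sqrt(d)) * rate


if __name__ == "__main__":
    d, N, L = 4, 2, 4
    print(floor_relu_size(d, N, L))
    # Hoelder order 1/2 and constant 1, chosen only for illustration.
    alpha, lam = 0.5, 1.0
    print(holder_error_bound(d, N, L, alpha, lam))
    print(general_error_bound(lambda r: lam * r ** alpha, d, N, L))
```

For a Hölder modulus $\omega_f(r)=\lambda r^{\alpha}$ with $\alpha\in(0,1]$, the general bound never exceeds the Hölder bound, because $N^{-\sqrt{L}}\le N^{-\alpha\sqrt{L}}$ for $N\ge 1$; the two coincide when $\alpha=1$.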