Minimum width for universal approximation using ReLU networks on compact domain

Abstract

It has been shown that deep neural networks of a large enough width are universal approximators, but they are not if the width is too small. There have been several attempts to characterize the minimum width $w$ enabling the universal approximation property; however, only a few of them found the exact values. In this work, we show that the minimum width for $L^p$ approximation of $L^p$ functions from $[0,1]^{d_x}$ to $\mathbb{R}^{d_y}$ is exactly $\max\{d_x, d_y, 2\}$ if the activation function is ReLU-Like (e.g., ReLU, GELU, Softplus). Compared to the known result for ReLU networks, $w = \max\{d_x + 1, d_y\}$ when the domain is $\mathbb{R}^{d_x}$, our result is the first to show that approximation on a compact domain requires smaller width than on $\mathbb{R}^{d_x}$. We next prove a lower bound on $w$ for uniform approximation using general activation functions including ReLU: $w \ge d_y + 1$ if $d_x < d_y \le 2 d_x$. Together with our first result, this shows a dichotomy between $L^p$ and uniform approximations for general activation functions and input/output dimensions.
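The width formulas stated in the abstract can be compared for concrete dimensions with a small sketch. The function names below are hypothetical and introduced only for illustration; the code evaluates the stated expressions and does not construct or verify any network.

```python
def lp_min_width_compact(dx: int, dy: int) -> int:
    """Minimum width for L^p approximation on [0,1]^dx with ReLU-Like
    activations, as stated in the abstract: max{dx, dy, 2}."""
    return max(dx, dy, 2)


def lp_min_width_whole_space(dx: int, dy: int) -> int:
    """Known minimum width for ReLU networks on R^dx, as quoted in the
    abstract: max{dx + 1, dy}."""
    return max(dx + 1, dy)


def uniform_lower_bound(dx: int, dy: int):
    """Lower bound dy + 1 for uniform approximation, stated in the abstract
    only for dx < dy <= 2*dx. Returns None outside that range, where the
    abstract makes no claim."""
    if dx < dy <= 2 * dx:
        return dy + 1
    return None


if __name__ == "__main__":
    # Print the three quantities side by side for a few dimension pairs.
    for dx, dy in [(1, 1), (3, 2), (2, 3), (4, 4)]:
        print(
            f"dx={dx}, dy={dy}: "
            f"compact L^p = {lp_min_width_compact(dx, dy)}, "
            f"R^dx L^p = {lp_min_width_whole_space(dx, dy)}, "
            f"uniform lower bound = {uniform_lower_bound(dx, dy)}"
        )
```

For example, with $d_x = 3$ and $d_y = 2$ the compact-domain value is 3 while the whole-space value is 4. With $d_x = 2$ and $d_y = 3$ the $L^p$ value on the compact domain is 3, while the uniform lower bound is 4, which is the dichotomy the abstract describes.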
