Depth Lower Bounds for ReLU Networks with Binary Inputs
Abstract
We study the role of depth in ReLU networks with discrete (Boolean) inputs and real-valued outputs, complementing two established lines of work. For Boolean inputs, striking depth separation results have been proven for AC0, but with threshold (TC0) or ReLU gates, depth separation is only established for depth two vs. three. On the other hand, for real-valued functions and ReLU networks, Telgarsky (2016) constructed a simple one-variable class of functions which establishes separation at higher depths. In this paper we aim to establish an all-depths depth separation for ReLU networks on $\{0,1\}^n$. We do so by exhibiting an explicit family of functions computable exactly by a ReLU network of depth $n+1$ and constant width, such that any ReLU network of depth $d$ and width $w$ computing the function exactly must satisfy $w^d = \Omega(2^n)$; in particular, no network of depth $d = o(n/\log n)$ can compute it with width polynomial in $n$. We note that our lower bound relies on exact, infinite-accuracy computation, since an exponential-precision truncation of the output is computable by a polynomial-size TC0 circuit.