Upper Bounds for Local Learning Coefficients of Three-Layer Neural Networks

Abstract

Three-layer neural networks are known to form singular learning models, and their Bayesian asymptotic behavior is governed by the learning coefficient, or real log canonical threshold. Although this quantity has been clarified for regular models and for some special singular models, broadly applicable methods for evaluating it in neural networks remain limited. Recently, a formula for the local learning coefficient of semiregular models was proposed, yielding an upper bound on the learning coefficient. However, this formula applies only to nonsingular points in the set of realization parameters and cannot be used at singular points. In particular, for three-layer neural networks, the resulting upper bound has been shown to differ substantially, in some cases, from already known learning coefficient values. In this paper, we derive a formula for an upper bound on local learning coefficients at a class of singular realization parameters in three-layer neural networks. This formula can be interpreted as a counting rule under budget, demand, and supply constraints. In the non-polynomial real-analytic case, the formula applies in general settings, whereas in the polynomial case it applies under the restriction that the true distribution has no hidden units. Our result therefore covers activation functions such as the swish function and, under the above restriction, polynomial activation functions, extending previous results to a broader class of activation functions. We further show that, when the input dimension is one, the value given by the right-hand side of our upper-bound formula agrees with the previously known learning coefficient, which provides a check against known exact results. Our result also provides a systematic perspective on how the weight parameters of three-layer neural networks affect the learning coefficient.
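
For orientation, the quantities named in the abstract can be written out as follows. This is a sketch of standard background from singular learning theory, not the paper's own formula: the parametrization of the network (scalar output, bias terms c_k, H hidden units), the symbols F_n, S_n, m, d, and W_0, and the usual regularity assumptions (realizable true distribution, compact parameter set, positive prior on the realization set) are assumptions introduced here and may differ from the paper's conventions.

```latex
% Assumed three-layer model with H hidden units and activation \sigma
% (illustrative notation; the paper may omit biases or use vector outputs):
f(x; w) = \sum_{k=1}^{H} a_k \, \sigma\!\left(b_k^{\top} x + c_k\right),
\qquad w = (a_k, b_k, c_k)_{k=1}^{H}

% Swish activation, an example of a non-polynomial real-analytic \sigma:
\sigma(t) = \frac{t}{1 + e^{-t}}

% Asymptotic expansion of the Bayesian free energy for sample size n,
% where S_n is the empirical entropy of the true distribution,
% \lambda is the learning coefficient (real log canonical threshold),
% and m is its multiplicity:
F_n = n S_n + \lambda \log n - (m - 1) \log\log n + O_p(1)

% Regular models with d parameters recover the classical value:
\lambda = \frac{d}{2}, \qquad m = 1

% Why a local coefficient gives an upper bound: with W_0 the set of
% realization parameters and \lambda(w^*) the local learning coefficient
% at w^* \in W_0,
\lambda = \min_{w \in W_0} \lambda(w) \le \lambda(w^*)
```

The last line is the reason a bound on the local learning coefficient at any single realization parameter, singular or not, is also a bound on the learning coefficient of the model.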
