Orthogonal Transforms in Neural Networks Amount to Effective Regularization
Abstract
We consider applications of neural networks in nonlinear system identification and formulate a hypothesis that adjusting general network structure by incorporating frequency information or other known orthogonal transform, should result in an efficient neural network retaining its universal properties. We show that such a structure is a universal approximator and that using any orthogonal transform in a proposed way implies regularization during training by adjusting the learning rate of each parameter individually. We empirically show in particular, that such a structure, using the Fourier transform, outperforms equivalent models without orthogonality support.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.