Compressing Neural Networks Using Tensor Networks with Exponentially Fewer Variational Parameters
Abstract
A neural network (NN) designed for a challenging machine learning task is in general a highly nonlinear mapping that contains a massive number of variational parameters. The high complexity of an NN, if unbounded or unconstrained, can unpredictably cause severe issues, including overfitting, loss of generalization power, and prohibitive hardware cost. In this work, we propose a general compression scheme that significantly reduces the number of variational parameters of NNs, regardless of their specific types (linear, convolutional, etc.), by encoding them into deep automatically differentiable tensor networks (ADTNs) that contain exponentially fewer free parameters. The superior compression performance of our scheme is demonstrated on several widely recognized NNs (FC-2, LeNet-5, AlexNet, ZFNet, and VGG-16) and datasets (MNIST, CIFAR-10, and CIFAR-100). For instance, we compress two linear layers in VGG-16 with approximately 10^7 parameters into two ADTNs with just 424 parameters, improving the testing accuracy on CIFAR-10 from 90.17% to 91.74%. We argue that the deep structure of the ADTN is an essential reason for its remarkable compression performance, compared with existing compression schemes that are mainly based on tensor decompositions/factorizations and shallow tensor networks. Our work suggests that deep TNs are an exceptionally efficient mathematical structure for representing the variational parameters of NNs, exhibiting superior compressibility over the commonly used matrices and multi-way arrays.
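To make the "exponentially fewer free parameters" claim concrete, the following is a minimal NumPy sketch of how a deep tensor network can generate a large tensor from a small number of parameters. It is only an illustration under stated assumptions: the brickwork layout, the 2x2x2x2 tensor shape, the fixed starting tensor, and the names `brickwork_tensor`, `num_legs`, and `num_layers` are all hypothetical choices made here, not the authors' ADTN architecture, and no training step is shown.

```python
import numpy as np


def brickwork_tensor(num_legs, num_layers, seed=0):
    """Contract layers of small tensors into one tensor with 2**num_legs entries.

    Hypothetical illustration only: the layout and shapes are assumptions,
    not the ADTN described in the paper.
    """
    rng = np.random.default_rng(seed)

    # Fixed starting tensor with a single nonzero entry; it has no free parameters.
    state = np.zeros((2,) * num_legs)
    state[(0,) * num_legs] = 1.0

    num_params = 0
    for layer in range(num_layers):
        # Alternate between even and odd neighbouring pairs of legs.
        for left in range(layer % 2, num_legs - 1, 2):
            gate = rng.normal(size=(2, 2, 2, 2))  # 16 free parameters
            num_params += gate.size
            # Contract the gate's two input legs with legs (left, left + 1).
            state = np.tensordot(gate, state, axes=([2, 3], [left, left + 1]))
            # tensordot puts the gate's output legs first; move them back.
            state = np.moveaxis(state, [0, 1], [left, left + 1])
    return state, num_params


state, num_params = brickwork_tensor(num_legs=16, num_layers=4)
print("entries in generated tensor:", state.size)
print("free parameters used:       ", num_params)
```

In this sketch the number of entries grows as 2**num_legs, while the parameter count grows only linearly in num_legs and num_layers. In a compression setting, the generated tensor would be reshaped into a layer's weight array and the small tensors optimized by automatic differentiation.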