Hyperflux: Pruning Reveals Importance

Abstract

Network pruning is used to reduce inference latency and power consumption in large neural networks. However, most methods focus on empirical results at the expense of understanding the pruning process. We introduce Hyperflux, a novel L0 method which models pruning as a continuously evolving system determined by flux, the gradient response to a weight's removal, and pressure, a global regularization driving weights toward pruning. By exploiting this model, Hyperflux's pruning behavior becomes understandable at both microscopic (weight regrowth/pruning) and macroscopic (sparsity convergence, etc.) levels. We also introduce a novel pressure scheduler that reliably targets desired sparsities. Hyperflux achieves competitive results with ResNet-50, VGG-19 and DeiT-T/S on CIFAR-10, CIFAR-100 and ImageNet datasets.
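The interplay between flux and pressure can be illustrated with a toy sketch. This is not the Hyperflux algorithm; the abstract does not specify it. The sketch assumes each weight has a learnable sigmoid gate, that the task loss is a simple quadratic penalizing any gate below 1 in proportion to the squared weight, and that pressure is a constant added to every gate's gradient. The function name `prune_with_pressure`, the gate parameterization, and all hyperparameters are hypothetical choices made for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def prune_with_pressure(weights, pressure, steps=2000, lr=1.0, init_gate=2.0):
    """Toy gate-based pruning (illustrative assumption, not the paper's method).

    Each weight w_i has a gate s_i = sigmoid(g_i). The assumed task loss is
    0.5 * sum((s_i * w_i - w_i) ** 2), so closing a gate costs more for
    larger weights. A constant `pressure` pushes every gate toward zero.
    Returns a list of booleans: True where the weight survives (g_i > 0).
    """
    gates = [init_gate] * len(weights)
    for _ in range(steps):
        for i, w in enumerate(weights):
            s = sigmoid(gates[i])
            # Flux-like term: gradient of the toy task loss w.r.t. the gate.
            # It is negative, i.e. it resists closing the gate.
            flux = w * w * (s - 1.0)
            # Pressure is a uniform positive gradient on every gate.
            grad_s = flux + pressure
            # Chain rule through the sigmoid, then a plain gradient step.
            gates[i] -= lr * grad_s * s * (1.0 - s)
    return [g > 0.0 for g in gates]


weights = [2.0, 1.0, 0.3, 0.05]
for p in (0.0, 0.1, 1.0):
    print(p, prune_with_pressure(weights, p))
```

In this toy setting a weight survives only when its flux-like term can balance the pressure, so raising the pressure prunes progressively larger weights. That is the sense in which sparsity here depends on a single global knob.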
