SP3: Enhancing Structured Pruning via PCA Projection

Abstract

Structured pruning is a widely used technique for reducing the size of pre-trained language models (PLMs), but current methods often overlook the potential of compressing the hidden dimension (d) in PLMs, a dimension critical to model size and efficiency. This paper introduces a novel structured pruning approach, Structured Pruning with PCA Projection (SP3), targeting the effective reduction of d by projecting features into a space defined by principal components before masking. Extensive experiments on benchmarks (GLUE and SQuAD) show that SP3 can reduce d by 70%, compress 94% of the BERTbase model, maintain over 96% accuracy, and outperform other methods that compress d by 6% in accuracy at the same compression ratio. SP3 has also proven effective with other models, including OPT and Llama. Our data and code are available at an anonymous repo.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…