Efficient Unbiased Sparsification

Abstract

An unbiased m-sparsification of a vector p ∈ R^n is a random vector Q ∈ R^n with mean p that has at most m < n nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function Div(Q, p) that measures how far Q is from the original p. If Q is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal Q for squared Euclidean distance coincides with our class of optimal Q for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
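
To make the definition concrete, here is a minimal Python sketch of one simple unbiased m-sparsification: keep m coordinates chosen uniformly at random and rescale them by n/m. This is a baseline chosen for illustration only. It is not the efficient construction the paper characterizes, and the function names uniform_sparsify and exact_mean are hypothetical.

```python
import itertools
import random


def uniform_sparsify(p, m, rng=random):
    """Return a random vector with at most m nonzero entries and mean p.

    Assumed baseline scheme (not the paper's optimal one): keep m
    coordinates chosen uniformly at random and scale each by n/m.
    """
    n = len(p)
    if not 0 < m < n:
        raise ValueError("need 0 < m < n")
    kept = set(rng.sample(range(n), m))
    return [p[i] * n / m if i in kept else 0.0 for i in range(n)]


def exact_mean(p, m):
    """Average the sparsified vector over all equally likely size-m subsets."""
    n = len(p)
    subsets = list(itertools.combinations(range(n), m))
    mean = [0.0] * n
    for subset in subsets:
        for i in subset:
            mean[i] += p[i] * n / m / len(subsets)
    return mean


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    print(uniform_sparsify(p, 2))  # one random draw with two nonzero entries
    print(exact_mean(p, 2))        # equals p up to floating-point rounding
```

Each coordinate is kept with probability m/n, so scaling the kept entries by n/m restores the mean. The scheme ignores the magnitudes of the entries of p and makes no attempt to minimize the expected divergence, which is the question the paper addresses.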
