On Sketching Trimmed Statistics
Abstract
We study sketching trimmed statistics of a frequency vector, including the $F_p$ moment of the top-$k$ coordinates and of the trimmed-$k$ vector. Despite their natural role in robust analytics, this is the first time these problems have been studied in any sublinear-space setting. For $p \in [0,2]$, we obtain $\mathrm{poly}(\log n/\varepsilon)$-space algorithms for both tasks when $k$ is moderately large. For general $k$, we identify a sharp structural threshold that characterizes exactly when sublinear space is possible: it is determined by the ratio between $a_k^2$ and $\|x_{-k}\|_2^2/k$. We extend these results to $p > 2$ and present several applications, including algorithms for thresholded $F_p$ estimation and generalized impact indices. Notably, we improve the space bounds of Govindan, Monemizadeh, and Muthukrishnan (PODS 2017) for computing the h-index.
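To make the quantities in the abstract concrete, the following is a minimal sketch that computes them exactly, in linear space, on a small vector. It is not the paper's sketching algorithm. The definitions are assumptions, since the abstract does not spell them out: coordinates are ranked by absolute value, the trimmed-$k$ vector drops the $k$ largest and $k$ smallest magnitudes, $a_k$ is the $k$-th largest magnitude, and $x_{-k}$ is $x$ with its top-$k$ coordinates removed. All function names are hypothetical.

```python
def sorted_magnitudes(x):
    """Absolute values of the coordinates, largest first."""
    return sorted((abs(v) for v in x), reverse=True)


def top_k_fp(x, k, p):
    """F_p moment of the k largest-magnitude coordinates (assumed definition)."""
    mags = sorted_magnitudes(x)
    # Skipping zeros makes p = 0 count nonzero coordinates instead of using 0**0 == 1.
    return sum(m ** p for m in mags[:k] if m > 0)


def trimmed_k_fp(x, k, p):
    """F_p moment after dropping the k largest and k smallest magnitudes (assumed definition)."""
    mags = sorted_magnitudes(x)
    return sum(m ** p for m in mags[k:len(mags) - k] if m > 0)


def threshold_quantities(x, k):
    """Return (a_k^2, ||x_{-k}||_2^2 / k) under the assumed meaning of a_k and x_{-k}; requires 1 <= k <= len(x)."""
    mags = sorted_magnitudes(x)
    a_k_squared = mags[k - 1] ** 2
    tail_energy_per_k = sum(m ** 2 for m in mags[k:]) / k
    return a_k_squared, tail_energy_per_k


def h_index(x):
    """Largest h such that at least h coordinates have magnitude >= h."""
    h = 0
    for rank, m in enumerate(sorted_magnitudes(x), start=1):
        if m >= rank:
            h = rank
        else:
            break
    return h


x = [5, -3, 3, 1, 0, 2]
print(top_k_fp(x, k=2, p=2))
print(trimmed_k_fp(x, k=2, p=2))
print(threshold_quantities(x, k=2))
print(h_index(x))
```

Sorting the whole vector uses space linear in $n$, which is the cost a sketch is meant to avoid; the exact values are useful only as a reference point for checking an approximate estimator on small inputs.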