Multi-Objective Weighted Sampling

Abstract

Multi-objective samples are powerful and versatile summaries of large data sets. For a set of keys x ∈ X and associated values f_x ≥ 0, a weighted sample taken with respect to f allows us to approximate segment-sum statistics Sum(f;H) = ∑_{x∈H} f_x, for any subset H of the keys, with statistically guaranteed quality that depends on the sample size and the relative weight of H. When estimating Sum(g;H) for g ≠ f, however, quality guarantees are lost. A multi-objective sample with respect to a set of functions F provides for each f ∈ F the same statistical guarantees as a dedicated weighted sample while minimizing the summary size. We analyze properties of multi-objective samples and present sampling schemes and meta-algorithms for estimation and optimization while showcasing two important application domains. The first is key-value data sets, where different functions f ∈ F applied to the values correspond to different statistics such as moments, thresholds, capping, and sum. A multi-objective sample allows us to approximate all statistics in F. The second is metric spaces, where keys are points, each f ∈ F is defined by a set of points C with f_x being the service cost of x by C, and Sum(f;X) models the centrality or clustering cost of C. A multi-objective sample allows us to estimate costs for each f ∈ F. In these domains, multi-objective samples are often of small size, are efficient to construct, and enable scalable estimation and optimization. We aim here to facilitate further applications of this powerful technique.
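
To make the segment-sum estimate concrete, the following Python sketch illustrates one possible reading of the abstract. It is not the paper's own scheme. It assumes Poisson probability-proportional-to-size (PPS) sampling, where key x is included independently with probability min(1, k·f_x / Sum(f;X)), and it assumes that a multi-objective sample can be formed by giving each key the maximum of its per-function inclusion probabilities. The function names, the parameter k, and the dictionary representation of f are all hypothetical choices made for illustration.

```python
import random


def pps_probabilities(weights, k):
    """Assumed Poisson PPS scheme: p_x = min(1, k * f_x / Sum(f; X))."""
    total = sum(weights.values())
    if total == 0:
        return {x: 0.0 for x in weights}
    return {x: min(1.0, k * w / total) for x, w in weights.items()}


def multi_objective_probabilities(functions, k):
    """Assumed combination rule: per key, the maximum probability over all f in F."""
    probs = {}
    for weights in functions:
        for x, p in pps_probabilities(weights, k).items():
            probs[x] = max(probs.get(x, 0.0), p)
    return probs


def draw_sample(probs, rng):
    """Include each key independently with its probability; keep p_x for estimation."""
    return {x: p for x, p in probs.items() if p > 0 and rng.random() < p}


def estimate_sum(sample, weights, segment):
    """Inverse-probability estimate of Sum(f; H): sampled keys in H contribute f_x / p_x."""
    return sum(weights.get(x, 0.0) / p for x, p in sample.items() if x in segment)


if __name__ == "__main__":
    # Two hypothetical functions over the same keys.
    f = {"a": 1.0, "b": 3.0, "c": 6.0}
    g = {"a": 6.0, "b": 3.0, "c": 1.0}
    probs = multi_objective_probabilities([f, g], k=2)
    sample = draw_sample(probs, random.Random(0))
    print("estimate of Sum(f; {a, c}):", estimate_sum(sample, f, {"a", "c"}))
    print("estimate of Sum(g; {a, c}):", estimate_sum(sample, g, {"a", "c"}))
```

Under these assumptions, every key is at least as likely to be sampled as it would be in a dedicated sample for any single f ∈ F, and the expected sample size is at most |F|·k, since a maximum never exceeds the corresponding sum.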
