Local Glivenko-Cantelli

Aryeh Kontorovich

Abstract

If $\mu$ is a distribution over the $d$-dimensional Boolean cube $\{0,1\}^d$, our goal is to estimate its mean $p\in[0,1]^d$ based on $n$ iid draws from $\mu$. Specifically, we consider the empirical mean estimator $\hat p_n$ and study the expected maximal deviation $\Delta_n=\mathbb{E}\max_{j\in[d]}|\hat p_n(j)-p(j)|$. In the classical Universal Glivenko-Cantelli setting, one seeks distribution-free (i.e., independent of $\mu$) bounds on $\Delta_n$. This regime is well-understood: for all $\mu$, we have $\Delta_n\lesssim\sqrt{\log(d)/n}$ up to universal constants, and the bound is tight. Our present work seeks to establish dimension-free (i.e., without an explicit dependence on $d$) estimates on $\Delta_n$, including those that hold for $d=\infty$. As such bounds must necessarily depend on $\mu$, we refer to this regime as local Glivenko-Cantelli (also known as $\mu$-GC). We are aware of very few previous bounds of this type, and those are either "abstract" or quite sub-optimal. Already the special case of product measures $\mu$ is rather non-trivial. We give necessary and sufficient conditions on $\mu$ for $\Delta_n\to0$, and calculate sharp rates for this decay. Along the way, we discover a novel sub-gamma-type maximal inequality for shifted Bernoullis, of independent interest.
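To make the quantity $\Delta_n$ concrete, the following is a minimal Monte Carlo sketch in Python for the product-measure case. It is an illustration only, not the paper's method: the function name `expected_max_deviation`, the choices of `d`, `n`, and `trials`, and the decaying mean vector $p(j)=1/(j+2)^2$ are all assumptions introduced here.

```python
import math
import random


def expected_max_deviation(p, n, trials=100, seed=0):
    """Monte Carlo estimate of Delta_n = E max_j |p_hat_n(j) - p(j)|
    for a product measure on {0,1}^d with mean vector p (illustrative helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        worst = 0.0
        for pj in p:
            # Number of ones among n iid Bernoulli(pj) draws for this coordinate.
            count = sum(1 for _ in range(n) if rng.random() < pj)
            # Track the largest absolute deviation of the empirical mean.
            worst = max(worst, abs(count / n - pj))
        total += worst
    return total / trials


d, n = 50, 200

# Every coordinate has mean 1/2 (the highest-variance Bernoulli).
uniform_p = [0.5] * d

# Hypothetical rapidly decaying means, chosen only for illustration.
decaying_p = [1.0 / (j + 2) ** 2 for j in range(d)]

delta_uniform = expected_max_deviation(uniform_p, n)
delta_decaying = expected_max_deviation(decaying_p, n)

# Distribution-free reference scale sqrt(log(d) / n), without constants.
reference = math.sqrt(math.log(d) / n)

print(f"uniform p:  Delta_n ~ {delta_uniform:.4f}")
print(f"decaying p: Delta_n ~ {delta_decaying:.4f}")
print(f"sqrt(log(d)/n) = {reference:.4f}")
```

In this sketch the decaying mean vector yields a noticeably smaller estimate than the uniform one at the same $d$ and $n$, which illustrates why a bound that depends on $\mu$ can improve on the distribution-free rate.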
