Quantitative asymptotics of graphical projection pursuit
Abstract
There is a result of Diaconis and Freedman which says that, in a limiting sense, for large collections of high-dimensional data most one-dimensional projections of the data are approximately Gaussian. This paper gives quantitative versions of that result. For a set of deterministic vectors $\{x_i\}_{i=1}^n$ in $\mathbb{R}^d$ with $n$ and $d$ fixed, let $\theta \in \mathbb{S}^{d-1}$ be a random point of the sphere and let $\mu_n^\theta$ denote the random measure which puts mass $\frac{1}{n}$ at each of the points $\langle x_1, \theta \rangle, \ldots, \langle x_n, \theta \rangle$. For a fixed bounded Lipschitz test function $f$, $Z$ a standard Gaussian random variable and $\sigma^2$ a suitable constant, an explicit bound is derived for the quantity $\mathbb{P}\left[\left|\int f \, d\mu_n^\theta - \mathbb{E} f(\sigma Z)\right| > \epsilon\right]$. A bound is also given for $\mathbb{P}\left[d_{BL}(\mu_n^\theta, N(0, \sigma^2)) > \epsilon\right]$, where $d_{BL}$ denotes the bounded-Lipschitz distance, which yields a lower bound on the waiting time to finding a non-Gaussian projection of the $\{x_i\}$ if directions are tried independently and uniformly on $\mathbb{S}^{d-1}$.
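The objects in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method: it draws directions uniformly on the sphere, forms the projected empirical measure, and estimates by Monte Carlo how often the integral of a test function deviates from its Gaussian value by more than a threshold. Several choices are assumptions made only for illustration: the data are hypothetical (uniform entries), the test function is cos (bounded and Lipschitz, with the closed form E cos(σZ) = exp(-σ²/2)), and σ² is taken to be (1/(nd)) Σ|x_i|², whereas the abstract only says "a suitable constant".

```python
import math
import random

def random_direction(d, rng):
    """Uniform point on the unit sphere S^{d-1}: normalize a standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def projected_mean(f, xs, theta):
    """Integral of f against mu_n^theta: the average of f(<x_i, theta>) over the data."""
    return sum(f(sum(a * b for a, b in zip(x, theta))) for x in xs) / len(xs)

def sigma_squared(xs):
    """Assumed normalization (not stated in the abstract): (1/(n d)) * sum of |x_i|^2."""
    n, d = len(xs), len(xs[0])
    return sum(c * c for x in xs for c in x) / (n * d)

rng = random.Random(0)
n, d = 300, 40

# Hypothetical data: entries uniform on [-sqrt(3), sqrt(3)], so unit variance but not Gaussian.
half_width = math.sqrt(3.0)
xs = [[rng.uniform(-half_width, half_width) for _ in range(d)] for _ in range(n)]

s2 = sigma_squared(xs)
gaussian_value = math.exp(-s2 / 2.0)  # E cos(sigma Z) for Z standard Gaussian

eps = 0.1
trials = 100
deviations = [
    abs(projected_mean(math.cos, xs, random_direction(d, rng)) - gaussian_value)
    for _ in range(trials)
]

# Monte Carlo estimate of P[ |int f d(mu_n^theta) - E f(sigma Z)| > eps ]
frac_exceeding = sum(dev > eps for dev in deviations) / trials
print(f"sigma^2 = {s2:.3f}, fraction of directions exceeding eps: {frac_exceeding:.2f}")
```

On the waiting-time statement: if each independently chosen direction yields a non-Gaussian projection with probability p, the number of directions tried until the first such projection is geometric with mean 1/p, so an upper bound on p translates into a lower bound on the expected waiting time.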