How fast can you find a good hypothesis?

Abstract

In the hypothesis selection problem, we are given sample and query access to a finite set of candidate distributions (hypotheses), H = {H₁, …, Hₙ}, and samples from an unknown distribution P, both over a domain X. The goal is to output a distribution Q whose distance to P is comparable to that of the nearest hypothesis in H. Specifically, if the minimum distance is OPT, we aim to output Q such that, with probability at least 1-δ, its total variation distance to P is at most C · OPT + ε. The optimal approximation factor for proper algorithms (where Q ∈ H) is C=3, and for improper algorithms (where Q is not necessarily in H) it is C=2; both are achieved using Θ(log(n/δ)/ε²) samples from P. In the improper setting, the algorithm achieving C=2 [Bousquet, Braverman, Kol, Efremenko, Moran, FOCS 2021] runs in time that grows polynomially with |X|; in particular, it does not run in finite time for real-valued distributions. A promising path towards improved runtime is to consider improper algorithms that output a mixture Q of the hypotheses, as such a distribution can be represented in n words of memory. We show (1) a lower bound: no algorithm that outputs a mixture can achieve an approximation factor better than C = 3-2/n unless the number of samples is polynomial in |X|, and (2) an algorithm that runs in time poly(n) and achieves the same approximation guarantee. In the proper setting, [Aliakbarpour, Bun, Smith, NeurIPS 2024] provided an algorithm with C=3 running in O(n/(δ³ε³)) time. We improve this time complexity to O(n/(δε²)), significantly reducing the dependence on the confidence and error parameters.
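To make the objective concrete, here is a minimal Python sketch of the quantities the abstract refers to: total variation distance, OPT, a mixture of hypotheses stored as n weights, and the target guarantee C · OPT + ε. It is only an illustration of the problem statement, not any of the algorithms in the paper. It assumes a finite domain with P known exactly (the actual problem only gives samples from P), and the helper names `tv_distance`, `opt`, `mixture`, and `meets_guarantee`, as well as the toy distributions, are hypothetical.

```python
from typing import List, Sequence

# Assumption: a distribution is a probability vector over a finite domain
# X = {0, ..., k-1}. The real problem only provides samples from P.
Dist = Sequence[float]


def tv_distance(p: Dist, q: Dist) -> float:
    """Total variation distance: half the L1 distance between the vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def opt(hypotheses: Sequence[Dist], p: Dist) -> float:
    """Distance from P to the nearest single hypothesis in H."""
    return min(tv_distance(h, p) for h in hypotheses)


def mixture(hypotheses: Sequence[Dist], weights: Sequence[float]) -> List[float]:
    """Mixture Q = sum_i w_i * H_i; Q is fully described by the n weights."""
    k = len(hypotheses[0])
    return [sum(w * h[x] for w, h in zip(weights, hypotheses)) for x in range(k)]


def meets_guarantee(q: Dist, p: Dist, hypotheses: Sequence[Dist],
                    c: float, eps: float) -> bool:
    """Check TV(Q, P) <= C * OPT + eps for a given output Q."""
    return tv_distance(q, p) <= c * opt(hypotheses, p) + eps


# Toy instance (hypothetical numbers): both hypotheses are at distance 0.25
# from P, yet their equal-weight mixture coincides with P.
H = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
P = [0.25, 0.5, 0.25]
Q = mixture(H, [0.5, 0.5])

best_single = opt(H, P)
mixture_dist = tv_distance(Q, P)
```

In this toy instance the mixture is strictly closer to P than every individual hypothesis, which is the kind of slack an improper algorithm can exploit; the lower bound in the abstract says that, without a number of samples polynomial in |X|, this cannot be pushed below C = 3-2/n in the worst case.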
