Feature Selection and Junta Testing are Statistically Equivalent

Abstract

For a function $f \colon \{0,1\}^n \to \{0,1\}$, the junta testing problem asks whether $f$ depends on only $k$ variables. If $f$ depends on only $k$ variables, the feature selection problem asks to find those variables. We prove that these two tasks are statistically equivalent. Specifically, we show that the ``brute-force'' algorithm, which checks for any set of $k$ variables consistent with the sample, is simultaneously sample-optimal for both problems, and the optimal sample size is \[ \Theta\left( \frac{1}{\varepsilon} \left( \sqrt{2^k \log \binom{n}{k}} + \log \binom{n}{k} \right) \right). \]
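
To make the "brute-force" algorithm concrete, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: the sample is a list of labeled pairs (x, f(x)), and a set of k variables counts as "consistent" when no two sample points agree on those k coordinates but carry different labels. The function names (`consistent_subsets`, `brute_force_test`, `brute_force_select`) are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
from itertools import combinations, product


def consistent_subsets(samples, n, k):
    """Yield each size-k tuple of coordinates on which the labels in
    `samples` could be a function of those coordinates alone.

    Assumption: `samples` is a list of (x, y) pairs, where x is a
    length-n tuple of bits and y is the observed label f(x).
    """
    for subset in combinations(range(n), k):
        seen = {}  # restriction of x to `subset` -> label first seen
        ok = True
        for x, y in samples:
            key = tuple(x[i] for i in subset)
            # setdefault stores y on first sight, else returns the old label
            if seen.setdefault(key, y) != y:
                ok = False  # same restricted input, two different labels
                break
        if ok:
            yield subset


def brute_force_test(samples, n, k):
    """Junta testing: accept iff some k-subset is consistent."""
    return next(consistent_subsets(samples, n, k), None) is not None


def brute_force_select(samples, n, k):
    """Feature selection: return one consistent k-subset, or None."""
    return next(consistent_subsets(samples, n, k), None)


# Toy example: f(x) = x[0] XOR x[2] on n = 4 bits, observed on all 16 inputs.
n = 4
samples = [(x, x[0] ^ x[2]) for x in product((0, 1), repeat=n)]
selected = brute_force_select(samples, n, 2)
is_2_junta = brute_force_test(samples, n, 2)
is_1_junta = brute_force_test(samples, n, 1)
```

In this toy run the sample is the full truth table, so the only consistent pair is the true one. The sketch enumerates all $\binom{n}{k}$ subsets, so it illustrates the statistical procedure rather than an efficient one; the abstract's claim concerns sample size, not running time.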
