Power of masking methods for adaptive testing in a multivariate normal means problem

Abstract

Many large-scale testing procedures learn signal structure from the data to boost power. Direct data reuse can inflate Type-I error ("double dipping"), so a common remedy is masking: withholding some information during learning and using it for testing. Sample splitting masks by withholding observations for testing, while null augmentation (e.g., knockoffs or full-conformal outlier detection) masks by appending null samples or variables and withholding their identities until testing. In many settings, little is known about how the power of masking methods compares across mechanisms, across tuning choices, or against more data-efficient non-masking alternatives. We study these questions in a stylized two-groups multivariate normal means model with an unknown signal direction learned from the data. Within this testbed, we develop a transparent, unified set of asymptotic power expressions for three parallel methods differing in masking choices: a sample splitting method, a full-conformal-style null augmentation method, and an oracle in-sample benchmark. Our main findings are: (1) the augmentation method is more powerful than the splitting method with matched tuning; (2) the power-optimal number of null samples for the augmentation method is a vanishing fraction of the number of tests, in which case its power approaches that of the in-sample benchmark; and (3) for a tractable approximation to the augmentation method, the optimal number of null samples scales as the square root of the number of tests, with empirical evidence suggesting a similar scaling for the method itself. These results characterize masking-induced power trade-offs in a tractable model and suggest qualitative lessons for other settings.
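To make the null augmentation idea concrete, the following is a minimal sketch and not the paper's procedure. It rests on several assumptions that the abstract does not specify: the signal direction is estimated as the top eigenvector of the pooled second-moment matrix, each point is scored by the absolute value of its projection onto that direction, p-values take the usual conformal rank form, and the Benjamini-Hochberg step-up rule is used for the final selection. All function names, the signal strength, and the simulation sizes are hypothetical.

```python
import numpy as np


def conformal_pvalues_with_null_augmentation(X_test, X_null):
    """Conformal-style p-values from a direction learned on pooled data.

    Assumption (not from the source): the learner is the top eigenvector of
    the pooled second-moment matrix, and the score is |<direction, x>|.
    """
    m = X_test.shape[0]
    n = X_null.shape[0]
    # Masking: the learning step sees the pooled rows, but it never uses
    # which rows are the appended nulls.
    pooled = np.vstack([X_test, X_null])
    second_moment = pooled.T @ pooled / pooled.shape[0]
    _, eigvecs = np.linalg.eigh(second_moment)  # eigenvalues in ascending order
    direction = eigvecs[:, -1]                  # top eigenvector
    scores = np.abs(pooled @ direction)         # invariant to the eigenvector's sign
    test_scores, null_scores = scores[:m], scores[m:]
    # Null identities are revealed only here, to rank each test score
    # against the appended null scores.
    counts = (null_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1 + counts) / (n + 1)


def benjamini_hochberg(pvals, alpha):
    """Boolean rejection mask from the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index meeting the threshold
        reject[order[: k + 1]] = True
    return reject


# Toy two-groups simulation; every size and the signal strength are illustrative.
rng = np.random.default_rng(0)
m, n, d = 2000, 200, 10                # tests, appended null samples, dimension
theta = np.zeros(d)
theta[0] = 1.0                         # signal direction, unknown to the method
is_signal = rng.random(m) < 0.1        # roughly 10% non-nulls
X_test = rng.standard_normal((m, d)) + 3.0 * is_signal[:, None] * theta
X_null = rng.standard_normal((n, d))   # appended draws from the null N(0, I)

pvals = conformal_pvalues_with_null_augmentation(X_test, X_null)
rejections = benjamini_hochberg(pvals, alpha=0.1)
```

The smallest attainable p-value in this sketch is 1/(n + 1), so the number of appended nulls n limits the resolution of the p-values while also diluting the pooled data used for learning. That tension is the kind of tuning trade-off the abstract refers to when it discusses the power-optimal number of null samples.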
