PRADAS: PRior-Assisted DAta Splitting for False Discovery Rate Control

Abstract

In the literature on false discovery rate (FDR) control, mirror statistics offer a flexible alternative to p-value-based procedures. When prior information is available, however, it is unclear how to incorporate it into mirror statistics in a principled way, and the standard equal split used by data-splitting methods can be inefficient. In this paper, we characterize a broader class of mirror statistics for any fixed splitting scheme and establish asymptotic FDR control under mild weak-dependence conditions using a two-stage procedure inspired by li2021whiteout. Within this class, we derive a Bayes-optimal mirror statistic and demonstrate its power advantage theoretically through analyses in the Rare/Weak signal model. Building upon this Bayes-optimal mirror statistic, we propose PRADAS (PRior-Assisted DAta Splitting), which treats the split ratio as a stopping time and recasts data splitting as an optional stopping problem over a natural filtration; the optimal stopping rule is characterized by the Snell envelope and computed efficiently via a Longstaff--Schwartz regression approximation. Both simulations and real data examples demonstrate the effectiveness of the proposed framework.
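For readers unfamiliar with mirror statistics, the following is a minimal sketch of the generic data-splitting selection step that the abstract takes as its starting point. It is not the Bayes-optimal mirror statistic or the PRADAS stopping rule, whose forms the abstract does not give. The function names, the choice of the mirror function |b1| + |b2|, and the inclusive threshold convention are all assumptions made for illustration; `beta1` and `beta2` stand for per-feature effect estimates computed on the two halves of the data.

```python
def mirror_statistics(beta1, beta2):
    """Illustrative mirror statistic: sign(b1 * b2) * (|b1| + |b2|).

    Under the null, the two independent estimates agree in sign about
    half the time, so the statistic is roughly symmetric around zero.
    """
    stats = []
    for b1, b2 in zip(beta1, beta2):
        prod = b1 * b2
        sign = (prod > 0) - (prod < 0)  # -1, 0, or +1
        stats.append(sign * (abs(b1) + abs(b2)))
    return stats


def select_threshold(stats, q):
    """Smallest candidate t whose estimated false discovery proportion,
    #{M_j <= -t} / max(#{M_j >= t}, 1), is at most q (assumed convention)."""
    candidates = sorted({abs(m) for m in stats if m != 0})
    for t in candidates:
        neg = sum(1 for m in stats if m <= -t)  # proxy for false positives
        pos = sum(1 for m in stats if m >= t)   # selections at threshold t
        if neg / max(pos, 1) <= q:
            return t
    return float("inf")  # no threshold meets the target level


def select_features(beta1, beta2, q=0.1):
    """Indices of features whose mirror statistic clears the threshold."""
    stats = mirror_statistics(beta1, beta2)
    t = select_threshold(stats, q)
    return [j for j, m in enumerate(stats) if m >= t]


# Hypothetical estimates from the two halves of a split.
beta1 = [3.0, 2.5, -2.0, 0.1, -0.2, 0.3]
beta2 = [2.8, 2.2, -1.9, -0.2, 0.1, 0.05]
print(select_features(beta1, beta2, q=0.1))
```

The negative tail of the statistics serves as an estimate of how many null features land in the positive tail, which is what lets the procedure estimate the false discovery proportion without p-values. In this sketch the split is fixed in advance; the abstract's contribution is to choose the statistic and the split ratio using prior information instead.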
