On Extending Type-I Error to Data-Dependent Levels
Abstract
The emerging literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily motivated by a resulting connection to the E-value. Our main contribution is to argue that the extension is 'right', by showing that it emerges from three axioms: within a large class of possible extensions, it is the only extension that nests classical Type-I error validity for data-independent levels, preserves classical validity for data-dependent levels and is monotone in the strength of the rejection claim. As a second contribution, we apply this result to support the common definition of the E-value, by showing that it arises as the 'right' notion of validity for the numerical representation of a generalized hypothesis test that may reject at different data-driven significance levels.