Combining independent p-values in replicability analysis: A comparative study
Abstract
Given a family of null hypotheses H1,…,Hs, we are interested in the hypothesis Hsγ that at most γ-1 of these null hypotheses are false. Assuming that the corresponding p-values are independent, we investigate combined p-values that are valid for testing Hsγ. For various settings in which Hsγ is false, we determine which combined p-value works well. Via simulations, we find that the Stouffer method works well if the null p-values are uniformly distributed and the signal strength is low, and that the Fisher method works better if the null p-values are conservative, i.e., stochastically larger than a uniformly distributed random variable. The minimum method works well if the evidence against Hsγ is concentrated in only a few non-null p-values, especially if the null p-values are conservative. Methods that combine e-values work well if the null hypotheses H1,…,Hs are simple.
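To make the idea of a combined p-value for Hsγ concrete, the following is a minimal sketch. It assumes the common construction in which the s-γ+1 largest p-values are combined; the abstract does not state which construction the paper uses. The function name partial_conjunction_pvalue, the method labels, and the Bonferroni-style form of the minimum method are illustrative assumptions, not the authors' definitions. The p-values are assumed to be independent and to lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def partial_conjunction_pvalue(pvalues, gamma, method="fisher"):
    """Combined p-value for 'at most gamma-1 of the nulls are false'.

    Hypothetical helper: combines the s - gamma + 1 largest p-values,
    which is one standard construction and an assumption here.
    """
    s = len(pvalues)
    if not 1 <= gamma <= s:
        raise ValueError("gamma must satisfy 1 <= gamma <= len(pvalues)")
    if any(not 0.0 < p < 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Drop the gamma - 1 smallest p-values and keep the rest.
    tail = sorted(pvalues)[gamma - 1:]
    k = len(tail)

    if method == "fisher":
        # Fisher statistic is -2 * sum(log p), chi-square with 2k degrees
        # of freedom under uniform nulls. For even degrees of freedom the
        # survival function has the closed form
        # exp(-h) * sum_{i<k} h^i / i!, where h is half the statistic.
        half_stat = -sum(math.log(p) for p in tail)
        term, total = 1.0, 0.0
        for i in range(k):
            total += term
            term *= half_stat / (i + 1)
        return min(1.0, math.exp(-half_stat) * total)

    if method == "stouffer":
        # Sum of standard normal scores, rescaled to unit variance.
        nd = NormalDist()
        z = sum(nd.inv_cdf(1.0 - p) for p in tail) / math.sqrt(k)
        return 1.0 - nd.cdf(z)

    if method == "minimum":
        # Bonferroni-style adjustment of the smallest retained p-value
        # (an assumed form of the minimum method).
        return min(1.0, k * tail[0])

    raise ValueError("method must be 'fisher', 'stouffer', or 'minimum'")


# Example: four studies, testing that at most one null is false (gamma = 2).
example = [0.001, 0.004, 0.03, 0.6]
for name in ("fisher", "stouffer", "minimum"):
    print(name, partial_conjunction_pvalue(example, gamma=2, method=name))
```

With γ = 1 the sketch reduces to the usual global-null combination of all s p-values, and with γ = s every method returns the largest p-value.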