Genuinely Robust Inference for Clustered Data

Abstract

Conventional cluster-robust inference can be invalid when data contain clusters of unignorably large size. We formalize this issue by deriving a necessary and sufficient condition for its validity, and show that this condition is frequently violated in practice: specifications from 77% of empirical research articles in American Economic Review and Econometrica during 2020-2021 appear not to meet it. To address this limitation, we propose a genuinely robust inference procedure based on a new cluster score bootstrap. We establish its validity and size control across broad classes of data-generating processes where conventional methods break down. Simulation studies corroborate our theoretical findings, and empirical applications illustrate that employing the proposed method can substantially alter conventional statistical conclusions.
