Regression adjustment in completely randomized experiments with many covariates
Abstract
This paper investigates estimation and inference for average treatment effects in completely randomized experiments when researchers observe potentially many covariates. Within Neyman's (1923) design-based framework, allowing the number of covariates to grow more slowly than the sample size, we demonstrate that a cross-fitted regression adjustment estimator, adapted from Aronow and Middleton (2013), exhibits more favorable asymptotic properties than existing alternatives, such as Lin's (2013) regression adjustment estimator and the bias-corrected estimator of Lei and Ding (2021). For inference, we derive the first- and second-order terms in the stochastic expansions of regression-adjusted estimators, analyze the higher-order behavior of existing inference procedures, and introduce a modified version of the HC3 standard error. The proposed methods extend naturally to stratified experiments with large strata. Simulation studies show that the cross-fitted estimator, in combination with the modified HC3, provides accurate point estimates and reliable size control across a wide range of data-generating processes.
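For orientation, the baseline that the abstract compares against can be sketched in a few lines. The block below is a minimal illustration of Lin's (2013) interacted regression adjustment paired with the classic HC3 standard error. It is not the cross-fitted estimator or the modified HC3 proposed in the paper, whose details the abstract does not give. The function name `lin_estimator_hc3` and the simulated data (sample size, number of covariates, coefficients) are assumptions made purely for illustration.

```python
import numpy as np


def lin_estimator_hc3(y, t, X):
    """Illustrative sketch: Lin-style regression adjustment with classic HC3.

    y: outcomes (n,), t: 0/1 treatment indicator (n,), X: covariates (n, p).
    Returns the coefficient on t and its HC3 standard error.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    X = np.asarray(X, dtype=float)
    n = y.shape[0]

    # Center covariates at their full-sample means so that the coefficient
    # on t estimates the average treatment effect.
    Xc = X - X.mean(axis=0)

    # Design matrix: intercept, treatment, covariates, treatment x covariates.
    D = np.column_stack([np.ones(n), t, Xc, t[:, None] * Xc])

    DtD_inv = np.linalg.inv(D.T @ D)
    beta = DtD_inv @ D.T @ y
    resid = y - D @ beta

    # Leverage scores h_ii = d_i' (D'D)^{-1} d_i.
    h = np.einsum("ij,jk,ik->i", D, DtD_inv, D)

    # Classic HC3: squared residuals inflated by 1 / (1 - h_ii)^2.
    w = (resid / (1.0 - h)) ** 2
    meat = D.T @ (D * w[:, None])
    cov = DtD_inv @ meat @ DtD_inv

    return beta[1], np.sqrt(cov[1, 1])


# Hypothetical data: a completely randomized experiment with n = 200 units,
# exactly 100 treated, p = 5 covariates, and a true effect of 1.0.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
t = np.zeros(n)
t[rng.permutation(n)[: n // 2]] = 1.0
y = 1.0 * t + X @ np.ones(p) + rng.normal(size=n)

tau_hat, se = lin_estimator_hc3(y, t, X)
print(f"estimate = {tau_hat:.3f}, HC3 standard error = {se:.3f}")
```

With only five covariates and 200 units, this baseline is well behaved. The regime studied in the paper is the one where the number of covariates grows with the sample size, which is where the abstract reports that the cross-fitted estimator and the modified HC3 have more favorable properties.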