Finite- and Large-Sample Inference for Model and Coefficients in High-dimensional Linear Regression with Repro Samples
Abstract
In this paper, we present a novel and effective approach to both finite- and large-sample inference for high-dimensional linear regression models. The approach is developed under the so-called repro samples framework, in which we conduct statistical inference by creating artificial samples that mimic the sampling mechanism of the data and studying their behavior. We construct confidence sets for (a) the true model corresponding to the nonzero coefficients, (b) a single regression coefficient or any collection of regression coefficients, and (c) the model and regression coefficients jointly. To facilitate the construction of these confidence sets and overcome the computational difficulty of searching over all possible models, we use an innovative Fisher inversion technique to construct a model candidate set that includes the true sparse model with probability close to 1, for models with both Gaussian and non-Gaussian errors. The proposed approach fills two major gaps in the high-dimensional regression literature: (1) the lack of effective approaches for addressing model selection uncertainty and providing valid inference for the underlying true model; and (2) the lack of effective inference approaches that guarantee finite-sample performance. We provide both finite-sample and asymptotic results that theoretically guarantee the performance of the proposed methods. In addition, our numerical results demonstrate that the proposed methods are valid and achieve better coverage with smaller confidence sets than current state-of-the-art approaches, such as debiasing and bootstrap approaches.
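The notion of an artificial sample that mimics the sampling mechanism can be illustrated with a minimal sketch, assuming the Gaussian-error linear model y = Xβ + σu with u standard normal. The function name `repro_sample`, the toy design matrix, and the parameter values are hypothetical and not taken from the paper. The sketch only generates artificial responses for one candidate parameter value; it does not implement the paper's candidate-set or confidence-set constructions.

```python
import random


def repro_sample(X, beta, sigma, rng):
    """Return one artificial response vector y* = X beta + sigma * u*.

    u* is a fresh draw of independent standard normal noise, so y* is
    generated by the same mechanism assumed for the observed data
    (Gaussian errors are an assumption of this sketch).
    """
    return [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) + sigma * rng.gauss(0.0, 1.0)
        for row in X
    ]


# Hypothetical toy setup: 4 observations, 3 covariates, sparse coefficients.
rng = random.Random(0)
X = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, -1.0],
    [1.0, 1.0, 0.0],
    [2.0, -1.0, 1.0],
]
beta = [1.5, 0.0, -2.0]  # second coefficient is zero, so the model is sparse
sigma = 0.5

# A collection of artificial samples for this candidate (beta, sigma).
repro = [repro_sample(X, beta, sigma, rng) for _ in range(200)]

# With sigma = 0 the artificial sample reduces to the noiseless mean X beta.
mean_response = repro_sample(X, beta, 0.0, rng)
```

Comparing the observed responses with the behavior of such artificial samples across candidate parameter values is the general idea the framework builds on; the specific comparison used in the paper is not reproduced here.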