Robust Estimation of Regression Models with Potentially Endogenous Outliers via a Modern Optimization Lens
Abstract
This paper addresses the robust estimation of linear regression models in the presence of potentially endogenous outliers. Through Monte Carlo simulations, we demonstrate that existing L1-regularized estimation methods, including the Huber estimator and the least absolute deviation (LAD) estimator, exhibit significant bias when outliers are endogenous. Motivated by this finding, we investigate L0-regularized estimation methods. We propose systematic heuristic algorithms, notably an iterative hard-thresholding algorithm and a local combinatorial search refinement, to solve the combinatorial optimization problem of the \(L0\)-regularized estimation efficiently. Our Monte Carlo simulations yield two key results: (i) The local combinatorial search algorithm substantially improves solution quality compared to the initial projection-based hard-thresholding algorithm while offering greater computational efficiency than directly solving the mixed integer optimization problem. (ii) The L0-regularized estimator demonstrates superior performance in terms of bias reduction, estimation accuracy, and out-of-sample prediction errors compared to L1-regularized alternatives. We illustrate the practical value of our method through an empirical application to stock return forecasting.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.