(Ab)Using Regression for Data Adjustment

Abstract

In various economic applications, people want to compare n units with respect to certain quantities Y₁, Y₂, …, Yₙ measuring their performance. This performance, however, is often influenced by factors beyond the units' control, and one would like to extract an adjusted performance from the data. Specifically, let Xᵢ ∈ 𝒳 summarize the factors of the i-th unit. Then one could think of a model equation Yᵢ = fₒ(Xᵢ) + εᵢ, with a regression function fₒ : 𝒳 → ℝ describing the unavoidable influence of the factors Xᵢ, and εᵢ being the adjusted performance of the i-th unit. A common proposal is to estimate fₒ via regression methods by a function f̂ depending on the current data (Xᵢ, Yᵢ), possibly augmented by additional past data, and to use the residuals ε̂ᵢ := Yᵢ - f̂(Xᵢ) as surrogates for the adjusted performances εᵢ. In the present report we discuss this approach, its potential pitfalls and its (mis)interpretation. In particular, an unavoidable property of the residuals ε̂ᵢ is that they measure only part of the adjusted performance, while the remaining part is hidden in the estimated function f̂. Possible alternatives are mentioned briefly.
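The claim that residuals capture only part of the adjusted performance can be illustrated with a small simulation. The sketch below is not taken from the report. It assumes a hypothetical linear regression function 1 + 2x, Gaussian adjusted performances, and ordinary least squares as the regression method; the names `fit_line` and `simulate` are illustrative. In this setting each true adjusted performance splits exactly into the residual plus the estimation error of the fitted function at that unit, and the residual sum of squares cannot exceed that of the true adjusted performances.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def simulate(n=50, seed=0):
    """Return (eps, resid, hidden) for one simulated data set.

    eps    : true adjusted performances (unknown in practice)
    resid  : residuals of the least squares fit
    hidden : fitted minus true regression function at each unit,
             i.e. the part of eps absorbed by the estimate
    """
    rng = random.Random(seed)

    # Assumption for illustration only: the true regression function is linear.
    def f_true(x):
        return 1.0 + 2.0 * x

    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [f_true(x) + e for x, e in zip(xs, eps)]

    a, b = fit_line(xs, ys)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    hidden = [(a + b * x) - f_true(x) for x in xs]
    return eps, resid, hidden


def sum_sq(values):
    return sum(v * v for v in values)


if __name__ == "__main__":
    eps, resid, hidden = simulate()
    print("sum of squares, true adjusted performances:", sum_sq(eps))
    print("sum of squares, residuals:                 ", sum_sq(resid))
    print("sum of squares, hidden in fitted function: ", sum_sq(hidden))
```

Because the least squares residuals are orthogonal to every function of the form a + b·x, the three printed sums of squares satisfy a Pythagorean identity: the first equals the sum of the other two. Richer regression methods have more room to absorb adjusted performance into the fitted function, although the exact decomposition shown here is specific to linear least squares.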
