The Derivative of Influence Function, Location Breakdown Point, Group Leverage and Regression Residuals' Plots
Abstract
In several linear regression data sets, Y (∈ R) on X (∈ R^p), visual comparisons of L1- and L2-residuals' plots indicate bad leverage cases. The phenomenon is confirmed theoretically by introducing the Location Breakdown Point (LBP) of a functional T: any point where the derivative of T's Influence Function either is infinite or does not exist. Guidelines for the plots' visual comparison as a diagnostic are provided. The new tools used include the E-matrix, and they suggest the influence diagnostic RINFIN, which measures the distance between the derivatives of the L2-residuals at (x,y) under the model F and under the gross-error model F_{ε,x,y}. The larger RINFIN(x,y) is, the larger the influence of (x,y) on the L2-regression residual. RINFIN allows measuring the group influence of k x-neighboring data cases in a sample of size n by treating their average, (x̄_k, ȳ_k), as one case with weight ε=k/n. For high-dimensional simulated data, the misclassification proportion of bad leverage cases in the data's RINFIN-ordering decreases to zero as p increases, thus reconfirming the blessing of high dimensionality in the detection of remote clusters. The visual diagnostic and RINFIN are successful in applications and complement each other.
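The two ingredients named in the abstract, the L1- and L2-residuals to be compared and the replacement of k x-neighboring cases by their average with weight ε=k/n, can be sketched as follows. This is only an illustration under stated assumptions: the function names (`l2_fit`, `l1_fit`, `residuals`, `group_case`) are hypothetical, the L1 fit is approximated by iteratively reweighted least squares rather than by whatever solver the paper uses, the data are simulated, and RINFIN itself is not computed because its formula is not given in the abstract.

```python
import numpy as np


def l2_fit(X, y):
    """Least-squares (L2) coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def l1_fit(X, y, n_iter=200, delta=1e-8):
    """Approximate least-absolute-deviations (L1) coefficients.

    Assumption: IRLS started at the L2 fit; the iterate with the
    smallest sum of absolute residuals seen so far is returned.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta = l2_fit(X, y)
    best, best_obj = beta, np.sum(np.abs(y - A @ beta))
    for _ in range(n_iter):
        # Weights 1/|r| turn the weighted squared loss into an absolute loss.
        w = 1.0 / np.maximum(np.abs(y - A @ beta), delta)
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        obj = np.sum(np.abs(y - A @ beta))
        if obj < best_obj:
            best, best_obj = beta, obj
    return best


def residuals(X, y, beta):
    """Residuals y - (intercept + X @ slopes)."""
    return y - np.column_stack([np.ones(len(y)), X]) @ beta


def group_case(X, y, idx):
    """Average of the cases in idx, with weight eps = k/n."""
    k, n = len(idx), len(y)
    return X[idx].mean(axis=0), y[idx].mean(), k / n


# Simulated data (assumption): 60 cases, p = 3, with 5 planted cases
# that are remote in x and do not follow the model.
rng = np.random.default_rng(0)
n, p = 60, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.ones(p) + 0.5 * rng.normal(size=n)
planted = np.arange(5)
X[planted] += 6.0
y[planted] -= 10.0

r1 = residuals(X, y, l1_fit(X, y))
r2 = residuals(X, y, l2_fit(X, y))
for i in range(8):
    print(f"case {i}: L1 residual {r1[i]: .3f}   L2 residual {r2[i]: .3f}")

x_bar, y_bar, eps = group_case(X, y, planted)
print("group average x:", x_bar, " y:", y_bar, " weight eps:", eps)
```

Plotting `r1` and `r2` against the case index, side by side, gives the pair of residual plots whose visual comparison the abstract refers to; which differences flag a bad leverage case is the subject of the paper's guidelines and is not asserted here.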