Kling-Gupta linear regression

Abstract

Kling-Gupta efficiency (KGE) is a model performance evaluation metric widely used in hydrology, but its properties as a statistical estimator have remained unexplored. We formalize the Kling-Gupta loss L_KG = (1 - KGE)^2 in an extremum estimation framework (minimizing L_KG is equivalent to maximizing KGE) for multiple linear regression. We give explicit formulas showing that Kling-Gupta regression scales the ordinary least squares (OLS) coefficient vector by a variance-inflation factor that depends on sample variances and covariances. Its predictions reproduce the response variance of the training set, whereas OLS predictions have reduced variance; both preserve the response mean and achieve the same sample correlation. We prove that no estimator simultaneously maximizes Nash-Sutcliffe efficiency (NSE) and KGE: OLS maximizes NSE but not KGE, whereas Kling-Gupta regression maximizes KGE at the expense of NSE. We establish almost-sure convergence of the Kling-Gupta estimator to well-defined population limits. For each estimator, the training and test set performance metrics converge asymptotically to the same limits, although these limits differ between OLS and Kling-Gupta regression. In a single-predictor model with fixed intercept, we identify conditions under which a global minimum of L_KG does not exist because of a discontinuity at zero slope. This work establishes a mathematical foundation for KGE-based estimation and clarifies its effects on predictive performance in hydrologic modeling.
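
The properties listed in the abstract can be checked numerically. The sketch below is not the paper's implementation: it assumes the standard KGE definition (correlation, standard deviation ratio, mean ratio) and assumes that the variance-inflation factor equals sd(y) / sd(OLS fitted values), which is the scaling that makes the predictions reproduce the response variance. The function names `fit_ols` and `fit_kg` and the simulated data are illustrative only.

```python
import numpy as np

def kge(obs, sim):
    """Kling-Gupta efficiency (standard form, assumed here)."""
    r = np.corrcoef(obs, sim)[0, 1]        # sample correlation
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

def fit_ols(X, y):
    """Ordinary least squares with intercept; returns (intercept, slopes)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

def fit_kg(X, y):
    """Assumed Kling-Gupta fit: OLS slopes rescaled so that the
    predictions match the response variance, intercept reset so
    that they match the response mean."""
    b0, b = fit_ols(X, y)
    fitted = b0 + X @ b
    scale = np.std(y) / np.std(fitted)     # assumed inflation factor (> 1)
    b_kg = scale * b
    b0_kg = np.mean(y) - np.mean(X, axis=0) @ b_kg
    return b0_kg, b_kg

# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
y = 5.0 + X @ np.array([1.0, -0.5]) + rng.normal(size=n)

a_ols, b_ols = fit_ols(X, y)
a_kg, b_kg = fit_kg(X, y)
pred_ols = a_ols + X @ b_ols
pred_kg = a_kg + X @ b_kg

print("sd(y), sd(OLS), sd(KG):", np.std(y), np.std(pred_ols), np.std(pred_kg))
print("KGE  OLS vs KG:", kge(y, pred_ols), kge(y, pred_kg))
print("NSE  OLS vs KG:", nse(y, pred_ols), nse(y, pred_kg))
```

Under these assumptions, both fits share the same mean and the same correlation with the response, the rescaled fit has the higher KGE, and OLS has the higher NSE, which is the trade-off the abstract describes.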
