Stop using root-mean-square error as a precipitation target!

Abstract

Root-mean-square error (RMSE) remains the default training loss for data-driven precipitation models, despite precipitation being semi-continuous, zero-inflated, strictly non-negative, and heavy-tailed. This Gaussian-implied objective misspecifies the data-generating process because it tolerates negative predictions, underpenalises rare heavy events, and ignores the mass at zero. We propose replacing RMSE with the Tweedie deviance, a likelihood-based and differentiable loss from the exponential-dispersion family with variance function V(μ) = μ^p. For 1 < p < 2 it yields a compound Poisson-Gamma distribution with a point mass at zero and a continuous density for y > 0, matching observed precipitation characteristics. We (i) estimate p from the variance-mean power law and show that precipitation across temporal aggregations is far from Gaussian, with the Tweedie power p increasing with accumulation length towards a Gamma limit; and (ii) demonstrate consistent skill gains when training deep data-driven models with Tweedie deviance in place of RMSE. In diffusion-model downscaling over Beijing, Tweedie loss improves wet-pixel MAE and extreme recall (0.60 vs 0.50 at the 99th percentile). In ConvLSTM nowcasting over Kolkata, Tweedie loss yields improved wet-pixel MAE and dry-pixel hit rates, with improvements that compound autoregressively with lead time (for MAE, 2% at t+1 growing to 16% at t+4). Because the Tweedie deviance is continuous in p, it adapts smoothly across scales, offering a statistically justified, practical replacement for RMSE in precipitation-based learning tasks.
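To make the two ingredients of the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard unit deviance of the Tweedie family for 1 < p < 2 and a simple log-log least-squares fit of the variance-mean power law Var(y) = φ μ^p. The function names `tweedie_deviance` and `estimate_power`, the default p = 1.5, and the clamp `eps` are illustrative choices that do not come from the paper.

```python
import numpy as np


def tweedie_deviance(y, mu, p=1.5, eps=1e-8):
    """Mean Tweedie unit deviance for 1 < p < 2 (compound Poisson-Gamma).

    d(y, mu) = 2 * [ y^(2-p) / ((1-p)(2-p))
                     - y * mu^(1-p) / (1-p)
                     + mu^(2-p) / (2-p) ]

    y  : observed precipitation, must be >= 0 (exact zeros are allowed)
    mu : predicted mean, must be > 0 (clamped to eps as a safeguard)
    """
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    y = np.asarray(y, dtype=float)
    mu = np.maximum(np.asarray(mu, dtype=float), eps)

    term_obs = np.power(y, 2.0 - p) / ((1.0 - p) * (2.0 - p))
    term_cross = y * np.power(mu, 1.0 - p) / (1.0 - p)
    term_pred = np.power(mu, 2.0 - p) / (2.0 - p)
    return float(np.mean(2.0 * (term_obs - term_cross + term_pred)))


def estimate_power(means, variances):
    """Fit log Var = log(phi) + p * log(mean) by least squares.

    means, variances : per-group sample means and variances (all > 0),
                       e.g. one pair per grid cell or station.
    Returns (p, phi).
    """
    log_mean = np.log(np.asarray(means, dtype=float))
    log_var = np.log(np.asarray(variances, dtype=float))
    slope, intercept = np.polyfit(log_mean, log_var, 1)
    return float(slope), float(np.exp(intercept))


if __name__ == "__main__":
    # A dry observation (y = 0) still yields a finite, positive penalty.
    print(tweedie_deviance([0.0], [1.0], p=1.5))

    # Synthetic groups that follow Var = 2 * mean^1.5 exactly.
    group_means = np.array([1.0, 2.0, 4.0, 8.0])
    group_vars = 2.0 * group_means ** 1.5
    print(estimate_power(group_means, group_vars))
```

The deviance is zero when the prediction equals the observation and stays finite at y = 0, which is what lets it handle dry pixels without a separate mask. In a deep model, one common way to keep μ positive is to have the network output log μ and exponentiate it before the loss; that parameterisation is an assumption here, since the abstract does not state how positivity is enforced.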
