A Statistical Interpretation of the Maximum Subarray Problem

Abstract

Maximum subarray is a classical problem in computer science that given an array of numbers aims to find a contiguous subarray with the largest sum. We focus on its use for a noisy statistical problem of localizing an interval with a mean different from background. While a naive application of maximum subarray fails at this task, both a penalized and a constrained version can succeed. We show that the penalized version can be derived for common exponential family distributions, in a manner similar to the change-point detection literature, and we interpret the resulting optimal penalty value. The failure of the naive formulation is then explained by an analysis of the estimated interval boundaries. Experiments further quantify the effect of deviating from the optimal penalty. We also relate the penalized and constrained formulations and show that the solutions to the former lie on the convex hull of the solutions to the latter.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…