Mixtures, envelopes, and hierarchical duality
Abstract
We develop a connection between mixture and envelope representations of objective functions that arise frequently in statistics. We refer to this connection using the term "hierarchical duality." Our results suggest an interesting and previously under-exploited relationship between marginalization and profiling, or equivalently between the Fenchel--Moreau theorem for convex functions and the Bernstein--Widder theorem for Laplace transforms. We give several different sets of conditions under which such a duality result obtains. We then extend existing work on envelope representations in several ways, including novel generalizations to variance-mean models and to multivariate Gaussian location models. This turns out to provide an elegant missing-data interpretation of the proximal gradient method, a widely used algorithm in machine learning. We show several statistical applications in which the proposed framework leads to easily implemented algorithms, including a robust version of the fused lasso, nonlinear quantile regression via trend filtering, and the binomial fused double Pareto model. Code for the examples is available on GitHub at https://github.com/jgscott/hierduals.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.