Sharp MSE Bounds for Proximal Denoising
Abstract
Denoising has to do with estimating a signal $x_0$ from its noisy observations $y = x_0 + z$. In this paper, we focus on the "structured denoising problem", where the signal $x_0$ possesses a certain structure and $z$ has independent normally distributed entries with mean zero and variance $\sigma^2$. We employ a structure-inducing convex function $f(\cdot)$ and solve $\min_x \{\frac{1}{2}\|y-x\|_2^2 + \sigma\lambda f(x)\}$ to estimate $x_0$, for some $\lambda > 0$. Common choices for $f(\cdot)$ include the $\ell_1$ norm for sparse vectors, the $\ell_1$-$\ell_2$ norm for block-sparse signals and the nuclear norm for low-rank matrices. The metric we use to evaluate the performance of an estimate $x^*$ is the normalized mean-squared-error $\mathrm{NMSE}(\sigma) = \frac{\mathbb{E}\|x^* - x_0\|_2^2}{\sigma^2}$. We show that NMSE is maximized as $\sigma \to 0$ and we find the exact worst-case NMSE, which has a simple geometric interpretation: the mean-squared-distance of a standard normal vector to the $\lambda$-scaled subdifferential $\lambda\,\partial f(x_0)$. When $\lambda$ is optimally tuned to minimize the worst-case NMSE, our results can be related to the constrained denoising problem $\min_{f(x) \le f(x_0)} \{\|y-x\|_2\}$. The paper also connects these results to the generalized LASSO problem, in which one solves $\min_{f(x) \le f(x_0)} \{\|y-Ax\|_2\}$ to estimate $x_0$ from noisy linear observations $y = Ax_0 + z$. We show that certain properties of the LASSO problem are closely related to the denoising problem. In particular, we characterize the normalized LASSO cost and show that it exhibits a "phase transition" as a function of the number of observations. Our results are significant in two ways. First, we find a simple formula for the performance of a general convex estimator. Second, we establish a connection between the denoising and linear inverse problems.
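As an illustration of the worst-case claim, the following is a minimal sketch for the special case where f is the ℓ1 norm. It is not taken from the paper. In that case the proximal estimate is entrywise soft thresholding at level σλ, and the mean-squared distance of a standard normal vector to λ times the ℓ1 subdifferential at a k-sparse x0 can be written in closed form using the Gaussian tail. The dimension n, sparsity k, value of λ, signal amplitudes, and all function names are assumptions chosen only for the example.

```python
import math
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def distance_formula_l1(n, k, lam):
    """E[dist(g, lam * subdifferential of the l1 norm at x0)^2], g ~ N(0, I_n).

    x0 is assumed to have k nonzero entries out of n. On the support the
    subdifferential entry is sign(x0_i); off the support it is [-1, 1].
    """
    tail = 0.5 * math.erfc(lam / math.sqrt(2.0))  # P(g > lam)
    density = math.exp(-0.5 * lam * lam) / math.sqrt(2.0 * math.pi)
    # E[(|g| - lam)_+^2] for a scalar standard normal g
    off_support = 2.0 * ((1.0 + lam * lam) * tail - lam * density)
    return k * (1.0 + lam * lam) + (n - k) * off_support


def empirical_nmse(x0, sigma, lam, trials, rng):
    """Monte Carlo estimate of E||x* - x0||^2 / sigma^2."""
    total = 0.0
    for _ in range(trials):
        for xi in x0:
            yi = xi + sigma * rng.gauss(0.0, 1.0)
            total += (soft_threshold(yi, sigma * lam) - xi) ** 2
    return total / (trials * sigma ** 2)


# Illustrative (assumed) problem sizes and tuning.
rng = random.Random(0)
n, k, lam = 200, 10, 2.0
x0 = [1.0] * k + [0.0] * (n - k)

bound = distance_formula_l1(n, k, lam)
nmse_small_sigma = empirical_nmse(x0, 1e-3, lam, 300, rng)
nmse_large_sigma = empirical_nmse(x0, 1.0, lam, 300, rng)
print(bound, nmse_small_sigma, nmse_large_sigma)
```

Under these assumptions, the estimate at small σ should land close to the distance formula, while the estimate at σ = 1 should fall below it, which is consistent with the statement that NMSE is maximized as σ → 0.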