A Variational-Flow Analysis of StoRM under Noise-Power Mismatch

Abstract

Diffusion-based speech enhancement architectures that pair a deterministic predictor with a learned score network exhibit a sharp non-smooth transition (``kink'') in the SI-SDR degradation curve at the training-time noise amplitude. We give a pathwise variational-flow analysis that localizes this non-smoothness to the predictor stage. The central identity is an exact factorization of the parametric sensitivity of the reverse-process output with respect to $M$ as $K(M) \cdot \partial C_M / \partial M$, where $K(M)$ is a continuous matrix-valued functional of the score Jacobian along the reverse trajectory and $C_M = \Pi(y(M))$ is the predictor output. Under three hypotheses on the reverse-process flow (score-Jacobian continuity, conditioning-Jacobian continuity, and non-degeneracy of $K$), the reverse-process output fails to be $C^1$ in $M$ at the training-time amplitude if and only if $M \mapsto \Pi(y(M))$ fails to be $C^1$ there. We extend the localization to the finite-step Euler--Maruyama sampler actually run at inference. The hypotheses translate into a concrete experimental program; this paper specifies that program and presents the variational structure. Empirical validation is deferred to a companion experimental report.
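The localization claim can be illustrated with a scalar toy, offered only as a sketch under stated assumptions and not as the paper's construction. Here `predictor` is a hypothetical stand-in for the predictor output with a kink at an assumed training amplitude `M0`, `downstream` is a hypothetical smooth map standing in for the reverse process, and `K` is its derivative, chosen so that it never vanishes (the scalar analogue of non-degeneracy). The one-sided slopes of the composed output then jump at `M0` by exactly `K` times the jump in the predictor's slopes, and agree wherever the predictor is smooth.

```python
import math

M0 = 1.0  # hypothetical training-time noise amplitude (assumption)

def predictor(M):
    # Toy stand-in for the predictor output: continuous, kinked at M0.
    return abs(M - M0)

def downstream(c):
    # Toy smooth stand-in for the map from predictor output to final output.
    return math.sin(c) + 2.0 * c

def K(c):
    # Derivative of downstream; cos(c) + 2 >= 1, so it is never zero.
    return math.cos(c) + 2.0

def output(M):
    # Composed toy pipeline: predictor followed by the smooth downstream map.
    return downstream(predictor(M))

def one_sided_slopes(f, M, h=1e-6):
    # Backward and forward finite-difference slopes of f at M.
    left = (f(M) - f(M - h)) / h
    right = (f(M + h) - f(M)) / h
    return left, right

# Slope jumps at the kink location M0.
pred_left, pred_right = one_sided_slopes(predictor, M0)
out_left, out_right = one_sided_slopes(output, M0)
pred_jump = pred_right - pred_left
out_jump = out_right - out_left

# Slope jump of the output away from the kink, where the predictor is smooth.
smooth_left, smooth_right = one_sided_slopes(output, M0 + 0.5)
smooth_jump = smooth_right - smooth_left

print("predictor slope jump at M0:", pred_jump)
print("output slope jump at M0:   ", out_jump)
print("K * predictor jump:        ", K(predictor(M0)) * pred_jump)
print("output slope jump off M0:  ", smooth_jump)
```

Because `K` is bounded away from zero, the output's slope jump vanishes only when the predictor's does, which is the scalar shadow of the if-and-only-if statement; the matrix-valued, pathwise case treated in the paper is not captured by this toy.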
