Fokker-Planck Analysis and Invariant Laws for a Continuous-Time Stochastic Model of Adam-Type Dynamics
Abstract
We develop a continuous-time model for the long-term dynamics of adaptive stochastic optimization, focusing on bias-corrected Adam-type methods. Starting from a finite-sum setting, we identify a canonical scaling of learning rates, decay parameters, and gradient noise that yields a coupled, time-inhomogeneous stochastic differential equation for the parameters $x_t$, first-moment tracker $z_t$, and second-moment tracker $y_t$. Bias correction persists via explicit time-dependent coefficients, and the dynamics becomes asymptotically time-homogeneous. We analyze the associated Fokker-Planck equation and, under mild regularity and dissipativity assumptions on $f$, prove existence and uniqueness of invariant measures. Noise propagation is governed by $A(x)=\mathrm{Diag}(\nabla f(x))\,H_f(x)$. Hypoellipticity may fail on $\mathcal{D}_A\times\mathbb{R}^m\times(\mathbb{R}_+)^m$, where \[ \mathcal{D}_A=\{x\in\mathbb{R}^m:\ \exists j,\ e_j^{\top}A(x)=0\}\subset\{x:\ \det A(x)=0\}=\widetilde{\mathcal{D}}_A, \] and critical points of $f$ lie in $\mathcal{D}_A$. We show $\mathcal{D}_A\neq\mathbb{R}^m$ and use this to prove exponential convergence of the Markov semigroup $\mu_0 P_t$ to a unique invariant measure, uniformly in $\mu_0$. The proof uses a Harris-type argument, minorization on Lyapunov sublevel sets, control constructions, and hypoellipticity on $(\mathbb{R}^m\setminus\mathcal{D}_A)\times\mathbb{R}^m\times(\mathbb{R}_+)^m$. This provides a transparent continuous-time view of Adam-type dynamics.
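As a small illustration of the degenerate set, the sketch below evaluates A(x) = Diag(∇f(x)) H_f(x) for a toy objective and tests whether some row of A(x) vanishes. The quadratic f(x) = ½ xᵀQx, the matrix Q, and the helper names are assumptions made for this example and are not taken from the paper. For this choice ∇f(x) = Qx and H_f(x) = Q, so row j of A(x) is the j-th row of Q scaled by the j-th partial derivative of f.

```python
# Toy illustration (assumption, not from the paper): f(x) = 0.5 * x^T Q x
# with a fixed symmetric 2x2 matrix Q, so grad f(x) = Q x and H_f(x) = Q.
Q = [[2.0, 1.0],
     [1.0, 3.0]]


def grad_f(x):
    """Gradient of the toy quadratic: Q x."""
    n = len(x)
    return [sum(Q[i][k] * x[k] for k in range(n)) for i in range(n)]


def noise_matrix(x):
    """A(x) = Diag(grad f(x)) H_f(x): row j of the Hessian times (grad f)_j."""
    g = grad_f(x)
    n = len(x)
    return [[g[j] * Q[j][k] for k in range(n)] for j in range(n)]


def has_zero_row(A, tol=1e-12):
    """True if some row of A vanishes, i.e. e_j^T A = 0 for some j."""
    return any(all(abs(a) <= tol for a in row) for row in A)


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


critical = [0.0, 0.0]     # grad f = 0, so every row of A vanishes
degenerate = [1.0, -2.0]  # first partial derivative is 0, but not a critical point
generic = [1.0, 1.0]      # both partial derivatives are nonzero

for name, x in [("critical", critical), ("degenerate", degenerate), ("generic", generic)]:
    A = noise_matrix(x)
    print(name, A, "zero row:", has_zero_row(A), "det:", det2(A))
```

In this toy case the critical point and the point (1, -2) both produce a zero row, while (1, 1) does not. This matches the statement that critical points lie in the degenerate set without exhausting it. A zero row forces the determinant to vanish, whereas the converse can fail for other choices of f.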