RanSOM: Second-Order Momentum with Randomized Scaling for Constrained and Unconstrained Optimization

El Mahdi Chayti

Abstract

Momentum methods, such as Polyak's Heavy Ball, are the standard for training deep networks but suffer from curvature-induced bias in stochastic settings, limiting convergence to suboptimal O(ε^{-4}) rates. Existing corrections typically require expensive auxiliary sampling or restrictive smoothness assumptions. We propose RanSOM, a unified framework that eliminates this bias by replacing deterministic step sizes with randomized steps drawn from distributions with mean η_t. This modification allows us to leverage Stein-type identities to compute an exact, unbiased estimate of the momentum bias using a single Hessian-vector product computed jointly with the gradient, avoiding auxiliary queries. We instantiate this framework in two algorithms: RanSOM-E for unconstrained optimization (using exponentially distributed steps) and RanSOM-B for constrained optimization (using beta-distributed steps to strictly preserve feasibility). Theoretical analysis confirms that RanSOM recovers the optimal O(ε^{-3}) convergence rate under standard bounded noise, and achieves optimal rates for heavy-tailed noise settings (p ∈ (1, 2]).
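To make the role of the randomized step concrete, the sketch below numerically checks the standard integration-by-parts identity for an exponential random variable τ with mean η: E[g(τ)] − g(0) = η E[g'(τ)]. Applied to g(τ) = ∇f(x + τd), it says the expected gradient change along a direction d equals η times the expected Hessian-vector product at the randomly displaced point. This is only an illustration of the kind of Stein-type identity the abstract refers to, under our own assumptions: the one-dimensional toy objective f(x) = x^4/4, the names `grad`, `hess`, and `check_exponential_identity`, and all parameter values are hypothetical, and the code is not the RanSOM-E algorithm itself.

```python
import random


def grad(x):
    """Gradient of the toy objective f(x) = x**4 / 4 (an assumption for illustration)."""
    return x ** 3


def hess(x):
    """Second derivative of the same toy objective."""
    return 3.0 * x ** 2


def check_exponential_identity(x=1.0, d=-1.0, eta=0.1, n=200_000, seed=0):
    """Monte Carlo estimate of both sides of
    E[grad(x + tau*d)] - grad(x) = eta * E[hess(x + tau*d) * d],
    with tau exponentially distributed with mean eta.
    """
    rng = random.Random(seed)
    lhs_sum = 0.0
    rhs_sum = 0.0
    for _ in range(n):
        # expovariate takes the rate, which is 1 / mean
        tau = rng.expovariate(1.0 / eta)
        y = x + tau * d
        lhs_sum += grad(y) - grad(x)   # gradient change over the random step
        rhs_sum += eta * hess(y) * d   # scaled Hessian-vector product at the random point
    return lhs_sum / n, rhs_sum / n


lhs, rhs = check_exponential_identity()
print(f"gradient difference: {lhs:.4f}, eta * HVP: {rhs:.4f}")
```

For this polynomial both expectations can be computed in closed form from the exponential moments E[τ^k] = k! η^k and equal −0.246, so the two sample averages should agree with each other and with that value up to Monte Carlo error. With a deterministic step of length η the same comparison would leave a curvature-dependent remainder, which is the bias the randomized step is meant to remove.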
