A Split-Client Approach to Second-Order Optimization
Abstract
Second-order optimization methods offer superior convergence rates but are often bottlenecked by the wall-clock cost of Hessian computation and factorization. In the moderate-dimensional regime where the full Hessian fits in memory, factorization at $O(d^3)$ typically dominates gradient evaluation at $O(nd)$, creating a synchronization barrier that negates the per-iteration progress of classical second-order methods. We propose the Split-Client framework, which decouples optimization into parallel gradient and curvature processes. Unlike Lazy Hessian approaches, whose arithmetic-complexity analysis does not charge factorization time and whose optimal reuse frequency requires tuning, our method is fully delay-adaptive: its wall-clock complexity scales with the average delay $\tau$, and it matches the optimally tuned Lazy rate of $O(\epsilon^{-3/2}\sqrt{\tau})$ without any tuning. For persistent curvature error, we provide a noise-adaptive schedule with an $O(T^{-3/4})$ rate (on $\mathbb{E}[\|\nabla f\|]^{3/2}$), recovering the rate that uniform-error analyses such as Kamzolov et al. (2023) achieve via inflated regularization. Under a verifiable subspace-alignment condition, an additional structured analysis based on the secant condition of L-BFGS gives a faster $O(T^{-1})$ rate, with a hybrid theorem interpolating smoothly between the two regimes. We extend the framework to Subsampled Cubic Newton with adaptive batch sizes and an aggregate sampling budget linear in $T$. Experiments on two non-convex problems show wall-clock speedups of up to 800× over Vanilla and 30× over Lazy in the strongly factorization-dominated regime.
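To make the decoupling of gradient and curvature processes concrete, the following is a minimal sequential simulation, not the paper's algorithm. It assumes a hypothetical convex test function $f(x) = \tfrac14\|x\|^4 + \tfrac12 x^\top A x - b^\top x$, and it swaps in a gradient-regularized Newton step in place of a cubic-regularized one to keep the sketch short. The names `make_problem`, `run_delayed_newton`, `delay`, and `M` are illustrative assumptions. The gradient process takes a step every iteration, while the simulated curvature process delivers a Hessian computed at an iterate that is `delay` steps old and then restarts from the current iterate.

```python
import numpy as np

def make_problem(d, seed=0):
    """Hypothetical test problem: A is symmetric with eigenvalues >= 2, b has unit norm."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((d, d))
    A = Q @ Q.T / d + 2.0 * np.eye(d)
    b = rng.standard_normal(d)
    return A, b / np.linalg.norm(b)

def grad(x, A, b):
    # Gradient of f(x) = 0.25*||x||^4 + 0.5*x^T A x - b^T x
    return (x @ x) * x + A @ x - b

def hess(x, A):
    # Hessian of the same f
    return (x @ x) * np.eye(x.size) + 2.0 * np.outer(x, x) + A

def run_delayed_newton(A, b, delay, steps=200, M=10.0):
    """Regularized Newton steps that reuse curvature up to `delay` iterations stale."""
    d = b.size
    x = np.zeros(d)
    H = hess(x, A)                    # assume curvature at the start point is available
    job_x, ready_at = x.copy(), delay # iterate the curvature worker is processing
    for k in range(steps):
        if delay == 0:
            H = hess(x, A)            # no delay: fresh curvature every step
        elif k >= ready_at:
            # The worker delivers curvature for the older iterate job_x
            # (computed here for simplicity), then restarts at the current x.
            H = hess(job_x, A)
            job_x, ready_at = x.copy(), k + delay
        g = grad(x, A, b)
        lam = np.sqrt(M * np.linalg.norm(g))   # gradient-based regularization (assumed choice)
        x = x - np.linalg.solve(H + lam * np.eye(d), g)
    return x, float(np.linalg.norm(grad(x, A, b)))
```

A possible usage is `A, b = make_problem(20)` followed by `run_delayed_newton(A, b, delay=5)`, comparing the returned gradient norm against `delay=0`. In a real implementation the two processes would run concurrently, so the gradient process would never wait on the factorization; the simulation only reproduces the staleness pattern.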