Incorporating Preconditioning into Accelerated Approaches: Theoretical Guarantees and Practical Improvement

Abstract

Machine learning and deep learning are widely researched fields that provide solutions to many modern problems. As datasets grow and problems become more complex, efficient optimization approaches are essential. In optimization theory, the Heavy Ball and Nesterov methods use momentum in their updates of model weights. However, the minimization problems considered may be poorly conditioned, which limits the applicability and effectiveness of these techniques. One remedy is preconditioning, which has already been investigated in approaches such as AdaGrad, RMSProp, Adam and others. Despite this, the combination of momentum acceleration and preconditioning has not been fully explored. We therefore propose the Preconditioned Heavy Ball (PHB) and Preconditioned Nesterov (PN) methods, with theoretical convergence guarantees under a unified assumption on the scaling matrix. Furthermore, we provide numerical experiments that demonstrate superior performance compared to the unscaled techniques in terms of iteration and oracle complexities.
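
To make the idea of combining momentum with preconditioning concrete, the following is a minimal sketch, not the paper's PHB algorithm, whose exact update rule and scaling matrix are not given in the abstract. It assumes a generic heavy-ball step in which the gradient is divided elementwise by a fixed diagonal scaling, applied to a toy ill-conditioned quadratic. The function names, step sizes, momentum value, and the choice of the Hessian diagonal as the preconditioner are all illustrative assumptions.

```python
# Toy objective (assumption): f(x) = 0.5 * sum(a_i * x_i^2), condition number 100.
def grad(x, a):
    return [ai * xi for ai, xi in zip(a, x)]


def heavy_ball(x0, a, lr, beta, steps, precond=None):
    """Heavy-ball iteration with an optional fixed diagonal preconditioner.

    Update: x_next = x - lr * D^{-1} grad f(x) + beta * (x - x_prev),
    where D = diag(precond). With precond=None this is plain heavy ball.
    This is a generic sketch, not the method analysed in the paper.
    """
    x_prev = list(x0)
    x = list(x0)
    for _ in range(steps):
        g = grad(x, a)
        if precond is not None:
            # Scale each gradient coordinate by the inverse of the diagonal entry.
            g = [gi / di for gi, di in zip(g, precond)]
        x_next = [
            xi - lr * gi + beta * (xi - xpi)
            for xi, gi, xpi in zip(x, g, x_prev)
        ]
        x_prev, x = x, x_next
    return x


a = [1.0, 100.0]   # curvatures of the two coordinates
x0 = [1.0, 1.0]

# Unscaled: the step size is limited by the stiff coordinate (a = 100),
# so progress along the flat coordinate (a = 1) is slow.
plain = heavy_ball(x0, a, lr=0.01, beta=0.5, steps=50)

# Scaled: dividing by the Hessian diagonal equalises the curvatures,
# so one larger step size works for both coordinates.
scaled = heavy_ball(x0, a, lr=0.5, beta=0.5, steps=50, precond=a)

print("plain :", plain)
print("scaled:", scaled)
```

On this toy problem the scaled run should end much closer to the minimizer at the origin than the unscaled run after the same number of steps. In practice the scaling matrix is usually estimated from gradients, as in AdaGrad, RMSProp, or Adam, rather than taken from the exact Hessian diagonal as assumed here.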
