Complexity and performance for two classes of noise-tolerant first-order algorithms

Abstract

Two classes of algorithms for optimization in the presence of noise are presented, neither of which requires evaluating the objective function. The first generalizes the well-known Adagrad method, and its complexity is analyzed as a function of its parameters. A second class of algorithms is then derived whose complexity is at least as good as that of the first class. Initial numerical experiments on finite-sum problems arising from deep-learning applications suggest that methods of the second class may outperform those of the first.
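
As background, the textbook diagonal Adagrad update that the abstract mentions can be sketched as below. This is only the standard method, not the generalized class analyzed in the paper. The noisy quadratic test problem, the function names, and the parameter values are illustrative assumptions. Note that the loop uses gradients only and never evaluates the objective itself.

```python
import math
import random

def adagrad(grad, x0, steps=500, lr=0.5, eps=1e-8):
    """Textbook diagonal Adagrad: uses gradient values only, never f(x)."""
    x = list(x0)
    acc = [0.0] * len(x)  # running sum of squared gradient components
    for _ in range(steps):
        g = grad(x)
        for i, gi in enumerate(g):
            acc[i] += gi * gi
            # per-coordinate step, scaled by the accumulated gradient norm
            x[i] -= lr * gi / (math.sqrt(acc[i]) + eps)
    return x

def noisy_quadratic_grad(x, rng, sigma=0.01):
    # Assumed test problem: gradient of 0.5 * ||x||^2 plus Gaussian noise.
    return [xi + rng.gauss(0.0, sigma) for xi in x]

rng = random.Random(0)  # fixed seed so the run is reproducible
x_final = adagrad(lambda x: noisy_quadratic_grad(x, rng), [3.0, -2.0])
```

On this toy problem the iterates should settle near the minimizer at the origin, up to a small residual driven by the gradient noise.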
