Local Convergence Properties of SAGA/Prox-SVRG and Acceleration

Abstract

Over the past ten years, driven by large-scale optimisation problems arising from machine learning, the development of stochastic optimisation methods has witnessed tremendous growth. However, despite their popularity, the theoretical understanding of these methods is quite limited compared with that of deterministic optimisation methods. In this paper, we present a local convergence analysis for a representative class of stochastic optimisation methods, namely proximal variance-reduced stochastic gradient methods, focusing mainly on the SAGA [12] and Prox-SVRG [43] algorithms. Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of the SAGA/Prox-SVRG algorithms: (i) the sequences generated by SAGA/Prox-SVRG identify the smooth manifold in a finite number of iterations; (ii) the sequences then enter a local linear convergence regime. Beyond local convergence analysis, we also discuss various possibilities for accelerating these algorithms, including adapting to better local parameters and applying higher-order deterministic/stochastic optimisation methods, which can achieve super-linear convergence. Concrete examples arising from machine learning are considered to verify the obtained results.
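To make the two-phase behaviour described above concrete, the following is a minimal sketch of proximal SAGA applied to an l1-regularised least-squares (LASSO) problem. It is an illustration under stated assumptions, not the paper's implementation: the problem choice, the step size 1/(3L), and the names `prox_saga_lasso` and `soft_threshold` are all assumptions made here. For the l1 norm, the manifold to be identified is the set of vectors sharing the support of the solution, so the sketch records the support of the iterate after each epoch.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (component-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def prox_saga_lasso(A, b, lam, n_epochs=50, seed=0):
    """Hypothetical helper: proximal SAGA for
    min_x (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 + lam * ||x||_1.

    Returns the final iterate and the support recorded after each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # Largest Lipschitz constant of the per-sample gradients: max_i ||a_i||^2.
    L = np.max(np.sum(A ** 2, axis=1))
    gamma = 1.0 / (3.0 * L)  # assumed step size, a common choice for SAGA

    x = np.zeros(d)
    # Table of stored per-sample gradients a_i * (a_i^T x - b_i), and its mean.
    table = A * (A @ x - b)[:, None]
    avg = table.mean(axis=0)

    supports = []
    for _ in range(n_epochs):
        for _ in range(n):
            i = rng.integers(n)
            g_new = A[i] * (A[i] @ x - b[i])
            # Variance-reduced gradient estimate.
            v = g_new - table[i] + avg
            # Update the running mean and the stored gradient for sample i.
            avg = avg + (g_new - table[i]) / n
            table[i] = g_new
            # Proximal (forward-backward) step on the l1 term.
            x = soft_threshold(x - gamma * v, gamma * lam)
        supports.append(np.flatnonzero(x))
    return x, supports
```

Running this on a small synthetic problem and inspecting `supports` is one way to observe whether the support stops changing after some epoch, which is the finite identification phase of claim (i); plotting the distance to a high-accuracy solution afterwards would show the regime of claim (ii).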
