On stochastic gradient Langevin dynamics with dependent data streams: the fully non-convex case
Abstract
We consider the problem of sampling from a target distribution, which is not necessarily logconcave, in the context of empirical risk minimization and stochastic optimization as presented in Raginsky et al. (2017). Non-asymptotic analysis results are established in the L1-Wasserstein distance for the behaviour of Stochastic Gradient Langevin Dynamics (SGLD) algorithms. We allow the estimation of gradients to be performed even in the presence of dependent data streams. Our convergence estimates are sharper and uniform in the number of iterations, in contrast to those in previous studies.
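For orientation, the standard SGLD recursion can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's setup: the toy non-convex potential U(theta) = theta^2/2 + 2 cos(theta), the AR(1) data stream standing in for "dependent data", and all names and parameter values (`sgld`, `ar1_stream`, `noisy_grad`, `step`, `beta`) are hypothetical choices made here for the example.

```python
import math
import random


def sgld(grad_estimate, data_stream, theta0, step, beta, n_iter, rng):
    """Run n_iter SGLD steps and return the list of iterates.

    Update: theta <- theta - step * H(theta, x) + sqrt(2 * step / beta) * xi,
    where H is a stochastic gradient estimate and xi is standard Gaussian.
    """
    theta = theta0
    path = [theta]
    for _ in range(n_iter):
        x = next(data_stream)        # next (possibly dependent) data point
        xi = rng.gauss(0.0, 1.0)     # injected Gaussian noise
        theta = theta - step * grad_estimate(theta, x) + math.sqrt(2.0 * step / beta) * xi
        path.append(theta)
    return path


def ar1_stream(rho, sigma, rng):
    """Assumed dependent stream: zero-mean AR(1) process x_k = rho * x_{k-1} + sigma * eps_k."""
    x = 0.0
    while True:
        x = rho * x + sigma * rng.gauss(0.0, 1.0)
        yield x


def noisy_grad(theta, x):
    """Gradient of the toy potential theta^2/2 + 2*cos(theta), perturbed by the data point x."""
    return theta - 2.0 * math.sin(theta) + x


rng = random.Random(0)
path = sgld(noisy_grad, ar1_stream(0.5, 0.1, rng),
            theta0=0.0, step=0.01, beta=5.0, n_iter=5000, rng=rng)
```

The toy potential has two wells, so it is not convex, and successive data points are correlated rather than independent. The iterates in `path` are approximate draws from a density proportional to exp(-beta * U(theta)); how close they are in Wasserstein distance is the kind of question the paper's estimates address.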