Path Length Bounds for Gradient Descent and Flow
Abstract
We derive bounds on the path length $\zeta$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. Among other results, we prove that: (a) if the iterates are linearly convergent with factor $(1-c)$, then $\zeta$ is at most $O(1/c)$; (b) under the Polyak-Kurdyka-Lojasiewicz (PKL) condition, $\zeta$ is at most $O(\sqrt{\kappa})$, where $\kappa$ is the condition number, and at least $\widetilde{\Omega}(\sqrt{d} \wedge \kappa^{1/4})$; (c) for quadratics, $\zeta$ is $\Theta(\min\{\sqrt{d}, \sqrt{\log \kappa}\})$ and in some cases can be independent of $\kappa$; (d) assuming just convexity, $\zeta$ can be at most $2^{4d \log d}$; (e) for separable quasiconvex functions, $\zeta$ is $\Theta(\sqrt{d})$. Thus, we advance current understanding of the properties of GD and GF curves beyond rates of convergence. We expect our techniques to facilitate future studies for other algorithms.
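As a rough illustration of result (a), the following sketch measures the path length of GD numerically. It is not taken from the paper: the diagonal quadratic, the step size 1/L, the helper name `gd_path_length`, and the convention of normalizing the path length by the initial distance to the minimizer are all assumptions made for this example. The bound it checks follows from the triangle inequality: if the distance to the minimizer contracts by (1-c) at each step, then the k-th step has length at most (2-c)(1-c)^k times the initial distance, and summing the geometric series gives (2-c)/c, which is O(1/c).

```python
import math

def gd_path_length(grad, x0, step, num_iters):
    """Run gradient descent; return the final iterate and the total path length."""
    x = list(x0)
    length = 0.0
    for _ in range(num_iters):
        g = grad(x)
        x_next = [xi - step * gi for xi, gi in zip(x, g)]
        length += math.dist(x, x_next)  # Euclidean length of this step
        x = x_next
    return x, length

# Hypothetical test problem: f(x) = 0.5 * sum(h_i * x_i^2), minimized at the origin.
h = [1.0, 10.0, 100.0]  # Hessian eigenvalues, so mu = 1 and L = 100
mu, L = min(h), max(h)

def grad(x):
    return [hi * xi for hi, xi in zip(h, x)]

x0 = [1.0, 1.0, 1.0]
step = 1.0 / L
x_final, length = gd_path_length(grad, x0, step, num_iters=5000)

dist0 = math.hypot(*x0)  # distance from x0 to the minimizer at the origin
zeta = length / dist0    # normalized path length (assumed convention)
c = mu / L               # with step 1/L, the distance contracts by (1 - c) per step
bound = (2.0 - c) / c    # triangle-inequality bound, O(1/c)

print(f"normalized path length: {zeta:.4f}, O(1/c) bound: {bound:.1f}")
```

On this instance the measured value stays close to the straight-line distance, far below the worst-case O(1/c) bound, because each coordinate moves monotonically toward zero.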