A Theory of the NEPv Approach for Optimization On the Stiefel Manifold
Abstract
The NEPv approach has been increasingly used lately for optimization on the Stiefel manifold arising from machine learning. Generally speaking, the approach first turns the first-order optimality condition, also known as the KKT condition, into a nonlinear eigenvalue problem with eigenvector dependency (NEPv) or a nonlinear polar decomposition with orthogonal factor dependency (NPDo), and then solves the nonlinear problem via some variation of the self-consistent-field (SCF) iteration. The difficulty, however, lies in designing a proper SCF iteration so that a maximizer is found at the end. Currently, each use of the approach is very much individualized, especially in the convergence analysis that shows whether or not the approach works. In this paper, a unifying framework is established. The framework is built upon some basic assumptions. If the basic assumptions are satisfied, global convergence to a stationary point is guaranteed, and the objective function increases monotonically during the SCF iterative process that leads to the stationary point. Also, a notion of atomic functions is proposed, which includes the commonly used matrix traces of linear and quadratic forms as special cases. It is shown that the basic assumptions are satisfied by atomic functions and by convex compositions of atomic functions. Together they provide a large collection of objectives for which the NEPv/NPDo approach is guaranteed to work.
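To make the SCF idea concrete, here is a minimal sketch in Python with NumPy. It is not taken from the paper: the objective f(P) = (1/2) Σ_i tr(PᵀA_iP)², the function names `objective` and `scf_nepv`, and the random test matrices are all assumptions chosen for illustration. For this objective the KKT condition can be written as H(P)P = PΛ with H(P) = Σ_i tr(PᵀA_iP) A_i, so a plain SCF step replaces P by an orthonormal basis of the eigenspace of H(P) belonging to its k largest eigenvalues. Because t ↦ t²/2 is convex, each such step cannot decrease f, which mirrors the monotonic increase described in the abstract.

```python
import numpy as np

def objective(A_list, P):
    # Illustrative objective (an assumption, not from the paper):
    # f(P) = 0.5 * sum_i tr(P^T A_i P)^2
    return 0.5 * sum(np.trace(P.T @ A @ P) ** 2 for A in A_list)

def scf_nepv(A_list, P0, tol=1e-12, max_iter=500):
    """Plain SCF sketch for max f(P) subject to P^T P = I_k (hypothetical helper)."""
    P = P0
    k = P.shape[1]
    values = [objective(A_list, P)]
    for _ in range(max_iter):
        # Symmetric matrix H(P) appearing in the KKT condition H(P) P = P Lambda.
        H = sum(np.trace(P.T @ A @ P) * A for A in A_list)
        # eigh returns eigenvalues in ascending order, so the last k columns
        # span the eigenspace of the k largest eigenvalues.
        _, V = np.linalg.eigh(H)
        P = V[:, -k:]
        values.append(objective(A_list, P))
        # Stop once the objective no longer increases noticeably.
        if values[-1] - values[-2] <= tol * max(1.0, abs(values[-1])):
            break
    return P, values

# Small demo on random symmetric matrices (made-up data).
rng = np.random.default_rng(0)
n, k = 8, 2
A_list = []
for _ in range(3):
    B = rng.standard_normal((n, n))
    A_list.append(B + B.T)  # symmetrize
P0, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random point on the Stiefel manifold
P, values = scf_nepv(A_list, P0)
print("objective values along the SCF iteration:", values)
```

The returned `values` list records the objective at each iterate, so the monotonic increase can be checked directly. The stopping rule based on the objective change is one simple choice among several; a residual of the NEPv equation would serve as well.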