Generalisations and improvements of New Q-Newton's method Backtracking
Abstract
In this paper, we propose a general framework for the algorithm New Q-Newton's method Backtracking, developed in the author's previous work. For a symmetric, square real matrix $A$, we define $\mathrm{minsp}(A) := \min_{\|e\|=1} \|Ae\|$. Given a $C^2$ cost function $f:\mathbb{R}^m \to \mathbb{R}$ and a real number $0 < \tau$, as well as $m+1$ fixed real numbers $\delta_0, \ldots, \delta_m$, we define for each $x \in \mathbb{R}^m$ with $\nabla f(x) \neq 0$ the following quantities:
$\kappa := \min_{i \neq j} |\delta_i - \delta_j|$;
$A(x) := \nabla^2 f(x) + \delta \|\nabla f(x)\|^{\tau} \mathrm{Id}$, where $\delta$ is the first element in the sequence $\{\delta_0, \ldots, \delta_m\}$ for which $\mathrm{minsp}(A(x)) \geq \kappa \|\nabla f(x)\|^{\tau}$;
$e_1(x), \ldots, e_m(x)$ form an orthonormal basis of $\mathbb{R}^m$, chosen appropriately;
$w(x)$ is the step direction, given by the formula $$w(x) = \sum_{i=1}^{m} \frac{\langle \nabla f(x), e_i(x) \rangle}{\|A(x) e_i(x)\|} e_i(x)$$ (we can also normalise by $w(x)/\max\{1, \|w(x)\|\}$ when needed);
$\gamma(x) > 0$ is the learning rate, chosen by Backtracking line search so that Armijo's condition is satisfied: $$f(x - \gamma(x) w(x)) - f(x) \leq -\frac{1}{3} \gamma(x) \langle \nabla f(x), w(x) \rangle.$$
The update rule for our algorithm is $x \mapsto H(x) = x - \gamma(x) w(x)$.
In New Q-Newton's method Backtracking, the choices are $\tau = 1 + \alpha > 1$ and $e_1(x), \ldots, e_m(x)$ are eigenvectors of $\nabla^2 f(x)$. In this paper, we allow more flexibility and generality: for example, $\tau$ can be chosen to be $< 1$, or $e_1(x), \ldots, e_m(x)$ need not be eigenvectors of $\nabla^2 f(x)$.
New Q-Newton's method Backtracking (as well as Backtracking gradient descent) is a special case, and some versions have flavours of quasi-Newton's methods. Several versions allow good theoretical guarantees. An application to solving systems of polynomial equations is given.
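The update rule in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: it takes the eigenvectors of the Hessian as the orthonormal basis (the New Q-Newton's method Backtracking choice), starts the line search at learning rate 1 and halves it, and, if no δ in the list passes the minsp test (a case the abstract does not discuss), falls back to the δ with the largest minsp. The names `choose_A`, `direction`, `step`, `shrink`, `max_shrinks` and the example cost function are hypothetical.

```python
import numpy as np

def minsp(A):
    """min over unit vectors e of ||A e||; for symmetric A this is the smallest |eigenvalue|."""
    return float(np.min(np.abs(np.linalg.eigvalsh(A))))

def choose_A(H, g_norm, deltas, tau):
    """Return A = H + delta * ||grad||^tau * Id for the first delta passing the minsp test."""
    scale = g_norm ** tau
    kappa = min(abs(a - b) for i, a in enumerate(deltas) for b in deltas[i + 1:])
    candidates = [H + d * scale * np.eye(H.shape[0]) for d in deltas]
    for A in candidates:
        if minsp(A) >= kappa * scale:
            return A
    # Assumption of this sketch (not from the abstract): if no delta passes,
    # use the candidate with the largest minsp so that A stays invertible.
    return max(candidates, key=minsp)

def direction(g, A, E, normalise=True):
    """w = sum_i <g, e_i> / ||A e_i|| * e_i, where the columns of E are the e_i."""
    coeffs = (E.T @ g) / np.linalg.norm(A @ E, axis=0)
    w = E @ coeffs
    if normalise:
        w = w / max(1.0, np.linalg.norm(w))
    return w

def step(f, grad, hess, x, deltas, tau=1.0, shrink=0.5, max_shrinks=60):
    """One update x -> x - gamma * w, with gamma found by backtracking (Armijo, constant 1/3)."""
    g = grad(x)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return x  # the quantities above are only defined where the gradient is nonzero
    H = hess(x)
    A = choose_A(H, g_norm, deltas, tau)
    _, E = np.linalg.eigh(H)  # assumption: basis = eigenvectors of the Hessian
    w = direction(g, A, E)
    gamma, fx, slope = 1.0, f(x), float(g @ w)
    for _ in range(max_shrinks):
        if f(x - gamma * w) - fx <= -gamma * slope / 3.0:
            break
        gamma *= shrink
    return x - gamma * w

# Hypothetical example: f has a saddle at (0, 0) and minima at (+1, 0) and (-1, 0).
def f_example(x):
    return (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2

def grad_example(x):
    return np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])

def hess_example(x):
    return np.array([[12.0 * x[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])

def run_example(n_steps=30):
    x = np.array([0.1, 1.0])
    for _ in range(n_steps):
        x = step(f_example, grad_example, hess_example, x, deltas=[0.0, 1.0, 2.0])
    return x

if __name__ == "__main__":
    print(run_example())
```

With eigenvectors as the basis, $\|A(x) e_i(x)\|$ is the absolute value of the corresponding eigenvalue of $A(x)$, so the direction behaves like a Newton step whose negative-curvature components have their sign flipped. Passing a different orthonormal matrix `E` to `direction` gives one way to experiment with the more general choice of basis mentioned in the abstract.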