Efficient Task Graph Scheduling for Parallel QR Factorization in SLSQP

Abstract

Efficient task scheduling is paramount in parallel programming on multi-core architectures, where tasks are fundamental computational units. QR factorization is a critical sub-routine in Sequential Least Squares Quadratic Programming (SLSQP) for solving non-linear programming (NLP) problems. QR factorization decomposes a matrix into an orthogonal matrix Q and an upper triangular matrix R, which are essential for solving systems of linear equations arising from optimization problems. SLSQP uses an in-place version of QR factorization, which requires storing intermediate results for the next steps of the algorithm. Although DAG-based approaches for QR factorization are prevalent in the literature, they often lack control over the intermediate kernel results, providing only the final output matrices Q and R. This limitation is particularly challenging in SLSQP, where intermediate results of QR factorization are crucial for back-substitution logic at each iteration. Our work introduces novel scheduling techniques using a two-queue approach to execute the QR factorization kernel effectively. This approach, implemented in high-level C++ programming language, facilitates compiler optimizations and allows storing intermediate results required by back-substitution logic. Empirical evaluations demonstrate substantial performance gains, including a 10x improvement over the sequential QR version of the SLSQP algorithm.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…