Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

Abstract

We discuss an approach for solving sparse or dense banded linear systems Ax = b on a Graphics Processing Unit (GPU) card. The matrix A ∈ R^(N×N) is possibly nonsymmetric and moderately large; i.e., 10000 ≤ N ≤ 500000. The split and parallelize (SaP) approach seeks to partition the matrix A into diagonal sub-blocks A_i, i = 1, …, P, which are independently factored in parallel. The solution strategy may either account for or ignore the matrices that couple the diagonal sub-blocks A_i. This approach, along with the Krylov subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fares well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD3 license.
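The decoupled variant of the idea described in the abstract (partition A into P diagonal sub-blocks, solve with each block independently, and ignore the coupling entries) can be sketched on the CPU in plain Python. This is a minimal illustration under stated assumptions, not the SaP::GPU implementation: the function names (`solve_dense`, `apply_decoupled_preconditioner`, `solve_split`), the small tridiagonal test matrix, and the sizes N = 12, P = 3 are all hypothetical. A simple preconditioned Richardson iteration stands in for the Krylov subspace method, and the blocks are processed in a sequential loop rather than in parallel.

```python
def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def matvec(A, x):
    """Dense matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def apply_decoupled_preconditioner(A, r, P):
    """Solve with the P diagonal sub-blocks A_i only, ignoring the coupling.

    Each block solve is independent of the others, which is what makes the
    step parallelizable; here the blocks are simply visited one by one.
    For brevity each block is re-eliminated on every call; a practical
    solver would factor each block once and reuse the factors.
    """
    n = len(r)
    bounds = [n * i // P for i in range(P + 1)]
    z = []
    for lo, hi in zip(bounds, bounds[1:]):
        block = [row[lo:hi] for row in A[lo:hi]]
        z.extend(solve_dense(block, r[lo:hi]))
    return z


def solve_split(A, b, P, tol=1e-10, max_iter=200):
    """Preconditioned Richardson iteration (a stand-in for a Krylov method)."""
    x = [0.0] * len(b)
    for it in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        if max(abs(v) for v in r) < tol:
            return x, it
        z = apply_decoupled_preconditioner(A, r, P)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter


# Hypothetical test problem: nonsymmetric, strictly diagonally dominant,
# tridiagonal (a banded matrix with half-bandwidth 1).
N, P = 12, 3
A = [[0.0] * N for _ in range(N)]
for i in range(N):
    A[i][i] = 4.0
    if i + 1 < N:
        A[i][i + 1] = -1.0
        A[i + 1][i] = -2.0
x_true = [float(i + 1) for i in range(N)]
b = matvec(A, x_true)
x, iterations = solve_split(A, b, P)
```

Dropping the coupling entries is cheap but only approximates A, which is why the block solves act as a preconditioner inside an outer iteration rather than as a direct solve. The test matrix is deliberately diagonally dominant, the regime in which ignoring the coupling is least harmful.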
