Adaptive Sketching Based Construction of H2 Matrices on GPUs
Abstract
We develop a novel linear-complexity, bottom-up, sketching-based algorithm for constructing an H2 matrix and present its high-performance GPU implementation. The construction algorithm requires both a black-box sketching operator and an entry evaluation function. The novelty of our GPU approach centers on the design and implementation of these two operations in batched mode on the GPU, accommodating variable-size data structures within a batch. The batched algorithms minimize the number of kernel launches and maximize GPU throughput. When applied to covariance matrices, volume IE matrices, and H2 update operations, our proposed GPU implementation achieves up to 13× speedup over our CPU implementation and up to 1000× speedup over an existing GPU implementation of the top-down sketching-based algorithm from the H2Opus library. It also achieves a 660× speedup over an existing sketching-based H construction algorithm from the ButterflyPACK library. Our work represents the first GPU implementation of the class of bottom-up sketching-based H2 construction algorithms.
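To make the two required inputs concrete, the following is a minimal CPU-only sketch, not the paper's implementation. It assumes an exponential covariance kernel on 1D points and builds the two operators named above: a sketching operator that returns Y = AΩ for a random block Ω, and an entry evaluation function that returns A(I, J) for index sets I and J. It also illustrates one possible way to handle variable-size blocks in a batch, by packing them into a single flat buffer with an offsets array. All names (`make_kernel_matrix_ops`, `sketch`, `eval_entries`, `batched_eval`), the kernel choice, and the flat-buffer layout are hypothetical and are not taken from the paper, H2Opus, or ButterflyPACK. The Python loop over blocks only stands in for what would be a single batched kernel launch on a GPU.

```python
import math
import random


def make_kernel_matrix_ops(points, length_scale=0.5):
    """Return (sketch, eval_entries) for the covariance matrix
    A[i][j] = exp(-|x_i - x_j| / length_scale). Kernel choice is an assumption."""
    n = len(points)

    def entry(i, j):
        return math.exp(-abs(points[i] - points[j]) / length_scale)

    def sketch(omega):
        # Black-box sketching operator: Y = A @ omega, with omega of size n x d.
        d = len(omega[0])
        return [
            [sum(entry(i, k) * omega[k][c] for k in range(n)) for c in range(d)]
            for i in range(n)
        ]

    def eval_entries(rows, cols):
        # Entry evaluation function: the dense sub-block A(rows, cols).
        return [[entry(i, j) for j in cols] for i in rows]

    return sketch, eval_entries


def batched_eval(eval_entries, blocks):
    """Evaluate blocks of different sizes into one flat row-major buffer.

    offsets[b] marks where block b starts in the buffer, so a single batched
    call can address every block despite their differing shapes.
    """
    offsets = [0]
    for rows, cols in blocks:
        offsets.append(offsets[-1] + len(rows) * len(cols))
    flat = [0.0] * offsets[-1]
    for b, (rows, cols) in enumerate(blocks):
        pos = offsets[b]
        for row in eval_entries(rows, cols):
            for value in row:
                flat[pos] = value
                pos += 1
    return flat, offsets


# Usage: 16 equispaced points, a 4-column Gaussian sketch, and two blocks
# of different shapes (4 x 8 and 2 x 3) evaluated as one batch.
random.seed(0)
points = [i / 15 for i in range(16)]
sketch, eval_entries = make_kernel_matrix_ops(points)
omega = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(16)]
Y = sketch(omega)
blocks = [(range(0, 4), range(8, 16)), (range(4, 6), range(0, 3))]
flat, offsets = batched_eval(eval_entries, blocks)
```

Keeping every block in one buffer with explicit offsets is one common way to avoid a separate launch per block, and the same idea applies to batches of sketch blocks whose ranks differ from node to node.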