A novel algorithm for GPU-accelerated particle-mesh interactions implemented in the QUOKKA code
Abstract
We present a novel, GPU-optimized algorithm for particle-mesh interactions in grid-based hydrodynamics simulations, designed for massively parallel architectures. This approach overcomes the inefficiency of particle neighbour searches or sorts across multiple GPU nodes by using a new "particle-mesh-particle" interaction scheme, which extends the particle-mesh method for self-gravity. The algorithm proceeds in two main stages. In the first stage, quantities exchanged between particles and the mesh (such as mass, energy, and momentum added by stellar feedback or removed by accretion onto a sink) are deposited into a buffer mesh equipped with ghost zones, where multiple contributions per cell are accumulated using atomic additions and then communicated across distributed memory ranks. In the second stage, the buffer states are applied to real mesh states, incorporating cell-wise limiters to enforce physical constraints such as positive density. We implement this scheme in the GPU-native radiation-magnetohydrodynamics code QUOKKA and validate it through a comprehensive suite of tests, including Bondi and Bondi-Hoyle accretion, and single and multiple supernova remnant evolution at varying spatial resolutions. We show that the algorithm achieves ≈ 50% weak-scaling efficiency on up to 8192 GPUs on the Frontier supercomputer. This scheme enables efficient, scalable particle-mesh coupling for GPU-optimized simulations of star formation and feedback in galaxies.
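The two stages described above can be illustrated with a minimal one-dimensional sketch in Python. It is not the QUOKKA implementation: the function names, the three-point deposition kernel, the periodic ghost-zone fold (standing in for communication between ranks), and the simple density floor used as the limiter are all assumptions made for illustration. NumPy's `np.add.at` plays the role of the atomic addition, since it accumulates correctly when several particles write to the same cell.

```python
import numpy as np

def deposit(positions, amounts, n_cells, dx, n_ghost=1):
    """Stage 1a: accumulate particle contributions into a buffer mesh with ghost zones.

    Hypothetical kernel: each particle spreads its contribution over its own
    cell and the two neighbours with weights (0.25, 0.5, 0.25).
    """
    amounts = np.asarray(amounts, dtype=float)
    buf = np.zeros(n_cells + 2 * n_ghost)
    centre = np.floor(np.asarray(positions) / dx).astype(int) + n_ghost
    for offset, weight in zip((-1, 0, 1), (0.25, 0.5, 0.25)):
        # np.add.at is unbuffered, so repeated indices accumulate,
        # which mimics an atomic add on the GPU.
        np.add.at(buf, centre + offset, weight * amounts)
    return buf

def fold_ghosts_periodic(buf, n_ghost=1):
    """Stage 1b: stand-in for the ghost-zone exchange (assumed periodic, single rank)."""
    interior = buf[n_ghost:-n_ghost].copy()
    interior[-n_ghost:] += buf[:n_ghost]   # left ghosts wrap to the right edge
    interior[:n_ghost] += buf[-n_ghost:]   # right ghosts wrap to the left edge
    return interior

def apply_with_limiter(rho, delta_mass, dx, rho_floor=1e-10):
    """Stage 2: apply buffered mass to the real mesh, keeping density positive.

    Hypothetical limiter: a plain floor, which does not conserve removed mass.
    """
    return np.maximum(rho + delta_mass / dx, rho_floor)

# Usage: two feedback particles share a cell; one sink removes more mass than exists.
buf = deposit([0.5, 0.5, 3.5], [1.0, 1.0, -100.0], n_cells=4, dx=1.0)
delta = fold_ghosts_periodic(buf)
rho_new = apply_with_limiter(np.ones(4), delta, dx=1.0)
```

Because every particle writes only to the mesh and every cell update reads only the buffer, neither stage needs a neighbour search or a particle sort, which is the property the scheme relies on.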