On the performance of two-sided MPI, MPI-3 RMA and SHMEM in a Lagrangian particle cluster algorithm
Abstract
In this paper, we compare the parallel performance of three distributed-memory communication models for a cluster algorithm based on nearest neighbour search in N-body simulations. The nearest neighbour is defined by the Euclidean distance in three-dimensional space. The resulting directed nearest neighbour graphs, which are used to define the clusters, are pruned in an iterative procedure where we use either point-to-point message passing interface (MPI), MPI-3 remote memory access (RMA), or SHMEM communication. The original algorithm was developed and implemented as part of the elliptical parcel-in-cell (EPIC) method targeting geophysical fluid flows. The parallel scalability of the algorithm is discussed using an artificial and a standard fluid dynamics test case. Performance measurements were carried out on three different computing systems with InfiniBand FDR, Hewlett Packard Enterprise (HPE) Slingshot 10 or HPE Slingshot 200 interconnect.
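To make the notion of a directed nearest neighbour graph concrete, the following is a minimal serial sketch, not the distributed implementation compared in the paper. It assumes a brute-force O(N²) search over a small list of 3D points; the function name `nearest_neighbours` and the sample coordinates are hypothetical and chosen only for illustration. Each point contributes one directed edge to its closest point under the Euclidean distance.

```python
import math


def nearest_neighbours(points):
    """Return a list nn where nn[i] is the index of the point closest to points[i].

    Brute-force search using the Euclidean distance in 3D (illustrative only).
    """
    nn = []
    for i, p in enumerate(points):
        best, best_d = -1, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue  # a point is not its own neighbour
            d = math.dist(p, q)  # Euclidean distance
            if d < best_d:
                best, best_d = j, d
        nn.append(best)
    return nn


# Hypothetical sample data: four points on the x-axis.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.5, 0.0, 0.0)]

# Directed edges i -> nn[i] of the nearest neighbour graph.
edges = list(enumerate(nearest_neighbours(points)))
print(edges)
```

In this sample the points form two mutually linked pairs. In a distributed setting a point's nearest neighbour may reside on another process, which is where the choice of communication model becomes relevant.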