New features of parallel implementation of N-body problems on GPU

Abstract

This paper focuses on the parallel implementation of a direct N-body method (particle-particle algorithm) and the application of multiple GPUs to galactic dynamics simulations. The use of a hybrid OpenMP-CUDA technology is considered for models with a number of particles $N \sim 10^5 - 10^7$. By means of N-body simulations of gravitationally unstable stellar galactic disks, we have investigated the parallelization efficiency of the algorithm for various Nvidia Tesla graphics processors (K20, K40, K80). Particular attention was paid to the parallel performance of the simulations and to the accuracy of the numerical solution, by comparing single and double floating-point precision (SP and DP). We show that the double-precision simulations are slower than the single-precision runs by a factor of 1.7 on Nvidia Tesla K-Series processors. We also claim that single-precision operations lead to incorrect results in the evolution of non-axisymmetric gravitating N-body systems. In particular, they lead to significant quantitative and even qualitative distortions in the evolution of the galactic disk. For instance, after $10^4$ integration time steps with single-precision numbers, the total energy, momentum, and angular momentum of a system with $N = 2^{20}$ are conserved to an accuracy of $10^{-3}$, $10^{-2}$ and $10^{-3}$, respectively, whereas in the double-precision simulations these values are $10^{-5}$, $10^{-15}$ and $10^{-13}$, respectively. Our estimates favour second-order accuracy schemes with double-precision numbers, since they are more efficient than fourth-order schemes with single-precision numbers.
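
To make the precision comparison concrete, the following is a minimal sketch of a direct particle-particle summation. It is not the authors' OpenMP-CUDA code. It is written in plain Python, uses units with G = 1, and adds a Plummer softening length `eps`, all of which are assumptions of this illustration. The names `to_single`, `direct_accelerations`, `net_force_residual` and `demo` are hypothetical. Single precision is only emulated crudely: the inputs and the running sums are rounded to IEEE single, while intermediate products stay in double, so the sketch likely understates the error of a true single-precision kernel.

```python
import math
import random
import struct


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x):
    return x


def direct_accelerations(pos, mass, eps, rnd=identity):
    """Direct O(N^2) sum: a_i = sum_{j != i} m_j r_ij / (r_ij^2 + eps^2)^(3/2).

    The outer loop over i is the part a GPU would distribute across threads.
    `rnd` is applied to the input coordinates and to the running sums only,
    so rnd=to_single is a crude emulation of single precision (assumption).
    """
    pts = [tuple(rnd(c) for c in p) for p in pos]
    acc = []
    for i, (xi, yi, zi) in enumerate(pts):
        ax = ay = az = 0.0
        for j, (xj, yj, zj) in enumerate(pts):
            if i == j:
                continue
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            w = mass[j] / (r2 * math.sqrt(r2))
            ax = rnd(ax + w * dx)
            ay = rnd(ay + w * dy)
            az = rnd(az + w * dz)
        acc.append((ax, ay, az))
    return acc


def net_force_residual(acc, mass):
    """|sum_i m_i a_i|, which vanishes in exact arithmetic (Newton's third law)."""
    fx = sum(m * a[0] for m, a in zip(mass, acc))
    fy = sum(m * a[1] for m, a in zip(mass, acc))
    fz = sum(m * a[2] for m, a in zip(mass, acc))
    return math.sqrt(fx * fx + fy * fy + fz * fz)


def demo(n=64, seed=1):
    """Return the net-force residual in double and in emulated single precision."""
    rng = random.Random(seed)
    pos = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
           for _ in range(n)]
    mass = [1.0 / n] * n
    dp = net_force_residual(direct_accelerations(pos, mass, 0.05), mass)
    sp = net_force_residual(direct_accelerations(pos, mass, 0.05, to_single), mass)
    return dp, sp


if __name__ == "__main__":
    dp, sp = demo()
    print(f"double: {dp:.3e}   emulated single: {sp:.3e}")
```

The net force on the whole system should be zero, so its residual is a simple proxy for how well total momentum can be conserved over a single force evaluation. The sketch covers only one evaluation on a small random system. The conservation figures quoted in the abstract refer to full simulations over $10^4$ time steps, which this example does not reproduce.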
