cunuSHT: GPU Accelerated Spherical Harmonic Transforms on Arbitrary Pixelizations

Abstract

We present cunusht, a general-purpose Python package that wraps a highly efficient CUDA implementation of the nonuniform spin-0 spherical harmonic transform. The method is applicable to arbitrary pixelization schemes, including schemes constructed from equally spaced iso-latitude rings as well as completely nonuniform ones. The algorithm has an asymptotic scaling of $O(\ell_{\max}^3)$ for maximum multipole $\ell_{\max}$ and achieves machine-precision accuracy. While cunusht is developed with applications in cosmology in mind, it is applicable to various other interpolation problems on the sphere. We outperform the fastest available CPU algorithm by a factor of up to 5 for problems with a nonuniform pixelization and $\ell_{\max} > 4 \cdot 10^3$ when comparing a single modern GPU to a modern 32-core CPU. This performance is achieved by utilizing the double Fourier sphere method in combination with the nonuniform fast Fourier transform and by avoiding transfers between the host and device. For scenarios without GPU availability, cunusht wraps existing CPU libraries. cunusht is publicly available and includes tests, documentation, and demonstrations.
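To illustrate the double Fourier sphere step mentioned above, the following is a minimal NumPy sketch and not the cunusht implementation. It assumes a spin-0 map sampled on an equiangular grid whose colatitude rows include both poles and whose number of longitude samples is even; the function name `dfs_extend` is hypothetical. The idea is to continue the map past the south pole via $\tilde f(\theta, \phi) = f(2\pi - \theta, \phi + \pi)$ for $\theta > \pi$, which makes it periodic in both angles so that ordinary (or nonuniform) FFTs can be applied.

```python
import numpy as np


def dfs_extend(f):
    """Double Fourier sphere extension of a spin-0 map (illustrative sketch).

    f has shape (n_theta, n_phi), with rows at theta_j = pi * j / (n_theta - 1)
    (both poles included) and columns at phi_k = 2 * pi * k / n_phi.
    Returns an array of shape (2 * n_theta - 2, n_phi) covering theta in [0, 2*pi).
    """
    n_theta, n_phi = f.shape
    if n_phi % 2 != 0:
        raise ValueError("n_phi must be even so that a shift by pi is a whole number of samples")
    # Rows strictly between the poles, traversed from south to north,
    # give the values at 2*pi - theta for theta in (pi, 2*pi).
    interior_reversed = f[-2:0:-1, :]
    # Shifting by half the longitude samples implements phi -> phi + pi.
    mirrored = np.roll(interior_reversed, n_phi // 2, axis=1)
    return np.concatenate([f, mirrored], axis=0)


# Example: the Cartesian x coordinate, sin(theta) * cos(phi), on a small grid.
n_theta, n_phi = 9, 16
theta = np.linspace(0.0, np.pi, n_theta)
phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
f = np.sin(theta)[:, None] * np.cos(phi)[None, :]

f_ext = dfs_extend(f)

# The extended map is doubly periodic, so a plain 2D FFT yields its Fourier coefficients.
coeffs = np.fft.fft2(f_ext)
```

For this test function the extension coincides with $\sin\theta\cos\phi$ evaluated on the full torus $\theta \in [0, 2\pi)$, so its 2D Fourier series contains only a handful of modes. In a nonuniform setting, the final evaluation at arbitrary pixel positions would be handled by a nonuniform FFT rather than `np.fft.fft2`.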
