SMC-AI: Scaling Monte Carlo Simulation to Four Trillion Atoms with AI Accelerators

Abstract

Rapid advances in deep learning are steering hardware design toward AI workloads, which poses fundamental challenges for HPC applications such as atomistic simulation. Here we present SMC-AI, a general algorithmic framework that extends the SMC-X method to run canonical Monte Carlo (MC) simulation efficiently on AI accelerators, including GPUs and NPUs, while maintaining extreme scalability. The implementation of SMC-AI on an NPU cluster simulates 4 trillion atoms on 4096 NPU dies. This is the largest ML-accelerated atomistic simulation reported, with 32X the system size and 1.3X the throughput of previous records, on a relatively small computational budget. Both the NPU and GPU implementations show excellent strong and weak scaling efficiency. By decoupling ML models from the simulation, SMC-AI creates an abstraction that facilitates the integration and porting of diverse ML models, laying a foundation for the future development of scalable scientific software.
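To make the idea of decoupling the energy model from the sampler concrete, here is a minimal sketch, not the SMC-AI or SMC-X algorithm. It assumes a hypothetical interface in which the "model" is any callable that maps a configuration to an energy, and it uses a toy 1D Ising chain as a stand-in for an ML potential. The names `EnergyModel`, `ising_chain_energy`, and `metropolis_sweep` are illustrative and do not come from the paper.

```python
import math
import random
from typing import Callable, List

# Hypothetical interface (an assumption, not from the paper): the sampler only
# needs a callable that returns the energy of a configuration.
EnergyModel = Callable[[List[int]], float]


def ising_chain_energy(spins: List[int]) -> float:
    """Toy stand-in for an ML energy model: periodic 1D Ising chain with J = 1."""
    n = len(spins)
    return float(-sum(spins[i] * spins[(i + 1) % n] for i in range(n)))


def metropolis_sweep(
    spins: List[int],
    energy: EnergyModel,
    temperature: float,
    rng: random.Random,
) -> int:
    """Run len(spins) single-site Metropolis trials in the canonical ensemble.

    Modifies `spins` in place and returns the number of accepted moves.
    The full energy is recomputed for every trial to keep the sketch short;
    this costs O(N) per move and is not how a scalable code would do it.
    """
    accepted = 0
    current = energy(spins)
    for _ in range(len(spins)):
        i = rng.randrange(len(spins))
        spins[i] = -spins[i]                      # propose a spin flip
        delta = energy(spins) - current
        # Metropolis criterion with k_B = 1 (units are an assumption).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current += delta                      # accept the move
            accepted += 1
        else:
            spins[i] = -spins[i]                  # reject: undo the flip
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    spins = [rng.choice([-1, 1]) for _ in range(32)]
    for _ in range(100):
        metropolis_sweep(spins, ising_chain_energy, temperature=1.0, rng=rng)
    print(ising_chain_energy(spins))
```

Because the sampler touches the model only through `energy(spins)`, swapping in a different model means passing a different callable; the sweep itself stays unchanged.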
