Linear-Scaling Potential-Free Data-Driven Molecular Dynamics for Arbitrary-Sized Water Clusters (H2O)n
Abstract
Conventional molecular dynamics (MD) simulation approaches, such as ab initio MD (AIMD) and empirical force field MD (EFFMD), face significant trade-offs between physical accuracy and computational efficiency. This work presents a linear-scaling potential-free data-driven molecular dynamics (PDMD) framework for predicting system energy and atomic forces of arbitrary-sized water clusters (H2O)n. Specifically, PDMD employs a Gaussian-based atomic geometry descriptor to generate high-dimensional, equivariant features, then leverages ChemGNN, a graph neural network model that adaptively learns the atomic chemical environments without requiring a priori knowledge. Through an iterative self-consistent training approach, the converged PDMD achieves a mean absolute error of 1.39 meV/atom for energy, outperforming other state-of-the-art models such as DeepMD, MACE, NequIP, and SevenNet by at least 2.6x in accuracy with the same dataset. As a result, the linear-scaling PDMD can reproduce the AIMD properties of water clusters at orders-of-magnitude lower computational cost, as illustrated by simulations of systems consisting of thousands or more molecules. These results demonstrate that the proposed PDMD offers multiphase predictive power and enables ultra-fast, general-purpose MD simulations while retaining AIMD-level accuracy. This accuracy is achieved by efficiently capturing many-body potentials that are critical in numerous polyatomic systems but are often missing in EFFMD. Moreover, we have constructed an ab initio dataset with over 300,000 (H2O)n structures, standardized in a unified PyTorch Geometric framework, to support scalable evaluation of artificial intelligence methods for molecular dynamics.
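To make the idea of a Gaussian-based atomic geometry descriptor concrete, the following is a minimal sketch of a generic Gaussian radial expansion of interatomic distances. It is not the PDMD descriptor itself, whose exact form the abstract does not specify. The function name `gaussian_radial_features`, the cosine cutoff, the Gaussian centers and width, and the water geometry are all illustrative assumptions. This simplified variant yields features that are invariant under rotation and translation, whereas the abstract describes equivariant features.

```python
import math

def gaussian_radial_features(positions, center_index, centers, width, cutoff):
    """Hypothetical helper: Gaussian-smeared neighbor distances for one atom.

    Returns one feature per Gaussian center, summed over all neighbors
    that lie inside the cutoff radius.
    """
    features = [0.0] * len(centers)
    xi = positions[center_index]
    for j, xj in enumerate(positions):
        if j == center_index:
            continue
        r = math.dist(xi, xj)
        if r >= cutoff:
            continue
        # Smooth cosine cutoff (an assumption): contributions go to zero at the cutoff.
        fc = 0.5 * (math.cos(math.pi * r / cutoff) + 1.0)
        for k, mu in enumerate(centers):
            features[k] += math.exp(-((r - mu) ** 2) / (2.0 * width ** 2)) * fc
    return features

# One water molecule with approximate coordinates in angstrom (illustrative values).
water = [
    (0.000, 0.000, 0.000),   # O
    (0.957, 0.000, 0.000),   # H
    (-0.240, 0.927, 0.000),  # H
]
centers = [0.5 + 0.5 * k for k in range(8)]  # Gaussian centers from 0.5 to 4.0 angstrom
oxygen_features = gaussian_radial_features(water, 0, centers, width=0.3, cutoff=4.0)
```

Because the features depend only on interatomic distances, rigidly moving or rotating the cluster leaves them unchanged. A feature vector of this kind, computed per atom, is the sort of input a graph neural network could consume as node attributes.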