Refinement and Performance Benchmark for Range-Separated Water Force Field

Abstract

In our previous work, we developed a CCSD(T)-level range-separated water force field that combines the strengths of physics-driven and machine learning models. However, because CCSD(T)/CBS calculations are expensive, the QM dataset is limited in size and lacks force labels, both of which cause training instability. Bulk properties show large variations that cannot be resolved simply by reducing the fitting error on the small-cluster QM dataset. Such instability in bulk-phase simulation is a universal problem in the training of machine learning potentials (MLPs), and it is particularly severe at the CCSD(T) level of theory. In this work, using our range-separated water model as an example, we aim to overcome these limitations by developing a new training workflow. It is composed of several techniques: (1) an active learning protocol that ensures more thorough sampling across different temperatures and densities; (2) an intermediate force label technique employing a machine learning density functional; and (3) an ensemble knowledge distillation (EKD) method. These techniques significantly stabilize the resulting water model, which consistently achieves sub-chemical accuracy in both cluster energies and experimental properties. Benchmarks are carried out for various properties, including densities, radial distribution functions (RDFs), dielectric constants, diffusivity, and infrared spectra, all showing state-of-the-art (SOTA) performance and demonstrating the effectiveness of the training protocol.
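The abstract names ensemble knowledge distillation but does not describe its mechanics. As a rough, generic illustration of the idea only (not the authors' implementation), the toy sketch below assumes a 1D regression problem in which polynomial fits stand in for the MLP: several "teacher" models are trained on bootstrap resamples of a small labeled set, their predictions are averaged on a larger unlabeled set, and a single "student" is fit to those averaged labels. The function names `fit_teacher` and `distill`, the polynomial model, and all parameters are hypothetical choices made for illustration.

```python
import numpy as np

def fit_teacher(x, y, degree, rng):
    # Bootstrap resample so each teacher sees a different view of the scarce data.
    idx = rng.integers(0, len(x), size=len(x))
    return np.polyfit(x[idx], y[idx], degree)

def distill(x_labeled, y_labeled, x_unlabeled, n_teachers=8, degree=3, seed=0):
    """Toy ensemble distillation: average teacher predictions, then fit one student."""
    rng = np.random.default_rng(seed)
    teachers = [fit_teacher(x_labeled, y_labeled, degree, rng)
                for _ in range(n_teachers)]
    # Rows are teachers, columns are unlabeled points.
    preds = np.stack([np.polyval(t, x_unlabeled) for t in teachers])
    soft_labels = preds.mean(axis=0)   # ensemble-averaged targets for the student
    spread = preds.std(axis=0)         # teacher disagreement at each point
    student = np.polyfit(x_unlabeled, soft_labels, degree)
    return student, soft_labels, spread

# Hypothetical data: a few noisy labeled points and many unlabeled ones.
x = np.linspace(-1, 1, 20)
y = x**3 - 0.5 * x + 0.05 * np.random.default_rng(1).normal(size=20)
xu = np.linspace(-1, 1, 200)
student, soft, spread = distill(x, y, xu)
```

In this toy setting the averaged targets are less sensitive to any single resample than an individual teacher is, and the per-point spread gives a crude indicator of where the ensemble disagrees.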
