Learning Implicit Bias in Generative Spaces for Accelerating Protein Dynamics Emulation

Yuan Qi

Abstract

Generative emulators of protein dynamics produce plausible trajectories at a fraction of the cost of molecular dynamics, but they inherit their training distribution and, under long-horizon extrapolation, tend to revisit known states rather than reach rare ones. Inspired by classical enhanced sampling, we introduce an implicit, history-dependent bias in the generative space of a pretrained emulator. Specifically, a history-aware score estimator augments the frozen emulator with a distance-weighted bias that steers reverse-time sampling away from previously generated structures and is regularized by an environment-support term. To preserve structural validity at long horizons, a score-based refinement step re-projects drifted samples onto the data manifold using the frozen emulator. Our experiments show that the method (i) raises diversity by 35% on DynamicPDB-80 and (ii) on 12 zero-shot Fast-Folding proteins, reaches the unbiased emulator's coverage up to 15× faster with the learned bias alone and, when paired with refinement, up to 37× faster while covering 3× as many low-energy states. Code will be released soon.
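To make the idea of a history-dependent bias on a score function concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a standard-normal stand-in for the frozen emulator's score, a fixed Gaussian kernel for the distance weighting (the paper describes a learned bias), and plain Langevin-style updates in place of the actual reverse-time sampler. The environment-support term and the refinement step are omitted. All function names and parameters (`base_score`, `history_bias`, `strength`, `bandwidth`) are hypothetical.

```python
import numpy as np


def base_score(x):
    # Assumption: stand-in for the frozen emulator's score.
    # This is the gradient of log N(0, I), not a protein model.
    return -x


def history_bias(x, history, bandwidth=1.0):
    # Distance-weighted repulsion from previously generated samples.
    # Negative gradient of V(x) = sum_i exp(-||x - h_i||^2 / (2 * bandwidth^2)),
    # so nearby history points push x away more strongly than distant ones.
    if len(history) == 0:
        return np.zeros_like(x)
    diffs = x - np.asarray(history)                  # shape (n, d)
    sq_dist = np.sum(diffs ** 2, axis=1)             # shape (n,)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return (weights[:, None] * diffs).sum(axis=0) / bandwidth ** 2


def biased_score(x, history, strength=1.0, bandwidth=1.0):
    # Frozen score plus the history-dependent bias term.
    return base_score(x) + strength * history_bias(x, history, bandwidth)


def sample(history, steps=200, step_size=0.01, strength=1.0, dim=2, seed=0):
    # Langevin-style updates driven by the biased score (toy sampler).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    for _ in range(steps):
        noise = rng.standard_normal(dim)
        drift = step_size * biased_score(x, history, strength)
        x = x + drift + np.sqrt(2.0 * step_size) * noise
    return x


# Usage: grow a history so that each new sample is repelled from earlier ones.
history = []
for i in range(5):
    history.append(sample(history, seed=i))
```

With an empty history the bias vanishes and the sampler reduces to the unbiased stand-in, which mirrors the abstract's framing of the bias as an add-on to a frozen emulator.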
