Data-Driven Weak-form Discovery of Stochastic Systems

Abstract

We present an algorithm for learning the governing equations of a stochastic dynamical system from trajectory data. It recovers interpretable symbolic expressions for both the drift b(x) and the diffusion a(x) in a single pass, yielding a model that can be queried directly for relaxation timescales, metastable escape rates, and stationary distributions. Rather than estimating the dynamics one time step at a time, the algorithm averages each candidate term across the whole trajectory before regressing; a drift-informed correction further removes the finite-sampling bias in the diffusion estimate, cutting it from 4.6% to 0.6% for state-dependent noise. We also show that the trajectory averaging must use a spatial rather than a temporal weighting: temporal weighting, as in existing weak-form methods, is biased for stochastic data with an error that grows with dataset size. On three benchmark systems -- the Ornstein--Uhlenbeck process, a double-well Langevin system, and a multiplicative-noise system -- the algorithm recovers all coefficients to within 5%, stationary densities to within 0.01 in total variation, and escape rates that match the true dynamics.
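The spatially weighted averaging described above can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation. It assumes a scalar SDE dX = b(X) dt + sigma(X) dW with the convention a(x) = sigma(x)^2, Gaussian bumps in x as test functions, a small polynomial library, and plain least squares in place of any sparse symbolic regression. The function names `simulate_ou` and `spatial_weak_fit` and all parameter values are hypothetical. Subtracting the fitted drift before squaring the increments is a guess at what the "drift-informed correction" refers to.

```python
import numpy as np

def simulate_ou(theta=1.0, sigma=0.5, dt=1e-2, n_steps=200_000, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (illustrative test data)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    noise = rng.standard_normal(n_steps) * sigma * np.sqrt(dt)
    for i in range(n_steps):
        x[i + 1] = x[i] - theta * x[i] * dt + noise[i]
    return x

def spatial_weak_fit(x, dt, centers, width, degree=1):
    """Fit polynomial coefficients of b(x) and a(x) from trajectory-wide averages.

    Each test function phi_k depends on the state x (a spatial weight), and is
    evaluated at the start of each increment, so the noise contribution to
    sum_t phi_k(X_t) * dX_t averages to zero.
    """
    x0, dx = x[:-1], np.diff(x)
    # Test functions: Gaussian bumps in state space, shape (T, K). Assumed choice.
    phi = np.exp(-0.5 * ((x0[:, None] - centers[None, :]) / width) ** 2)
    # Candidate library 1, x, ..., x^degree, shape (T, J). Assumed choice.
    lib = np.vander(x0, degree + 1, increasing=True)
    # Averaged library terms: G[k, j] = sum_t phi_k(X_t) * theta_j(X_t) * dt.
    G = (phi.T @ lib) * dt
    # Drift: match sum_t phi_k(X_t) * dX_t against G @ b_coef.
    b_coef, *_ = np.linalg.lstsq(G, phi.T @ dx, rcond=None)
    # Diffusion: remove the fitted drift from each increment before squaring.
    resid = dx - (lib @ b_coef) * dt
    a_coef, *_ = np.linalg.lstsq(G, phi.T @ resid**2, rcond=None)
    return b_coef, a_coef

x = simulate_ou()
b_coef, a_coef = spatial_weak_fit(
    x, dt=1e-2, centers=np.linspace(-0.8, 0.8, 9), width=0.3
)
print("drift coefficients for (1, x):", b_coef)
print("diffusion coefficients for (1, x):", a_coef)
```

For this simulated Ornstein--Uhlenbeck path the fitted drift should be close to -x and the fitted diffusion close to the constant sigma^2 = 0.25, up to sampling error. Replacing the plain least-squares step with a sparsity-promoting regression would be the natural next step toward symbolic recovery, but that choice is not specified in the abstract.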
