Integrating prior knowledge in equation discovery: Interpretable symmetry-informed neural networks and symbolic regression via characteristic curves

Abstract

Data-driven equation discovery aims to reconstruct governing equations directly from empirical observations. A fundamental challenge in this domain is the ill-posed nature of the inverse problem: multiple distinct mathematical models may yield similar errors, which complicates model selection and means that a unique representation of the underlying mechanisms is not guaranteed. This issue can be addressed by incorporating inductive biases to constrain the search space and discard undesirable models. The characteristic curve (CC)-based framework offers a modular approach ideally suited to this aim. This approach is based on the specification of structural families that possess provable identifiability properties. Crucially, this framework enables practitioners to embed domain expertise directly into the learning process and facilitates the integration of diverse post-processing tools. In this work, we build upon the recent neural network (NN) implementation of this formalism (NN-CC), which benefits from the universal approximation capabilities of NNs. Specifically, we extend NN-CC by introducing two inductive biases: (i) symmetry constraints and (ii) post-processing with symbolic regression. Using a chaotic Duffing oscillator and a discontinuous stick-slip model under varying Gaussian noise levels, we show how these extensions systematically improve the discovery process. We also analyze the integration of sparse and symbolic regression (using SINDy and PySR) into the CC-based formalism. These extensions (SINDy-CC and SR-CC) consistently show improvements as prior information is incorporated. By enabling the integration of prior or hypothesized knowledge into the learning and post-processing stages, the CC-based formalism emerges as a promising candidate to address identifiability issues in purely data-driven methods, advancing the goal of interpretable and reliable system identification.

