Circula-based multivariate distributions on the flat torus, with applications in structural biology

Abstract

Modeling dependencies between random variables independently of their marginals is fundamental in applications ranging from finance to (structural) biology. In this work, we address this problem using circulae to model data living on the d-dimensional flat torus T^d, and make two contributions. First, using a low-rank covariance structure to define circulae based on a latent variable model, we design the first closed-form, normalized distribution with covariance structure on the flat torus T^d. Second, building on this framework, we propose the first models for joint distributions of torsion angles (backbone and side chains) for neighboring amino acids in proteins. In practice, we fit mixtures on flat tori from T^2 to T^14 and show that they are state of the art in terms of likelihood and sparsity. We anticipate that these models will prove fundamental for moving from discrete structural studies, as in AlphaFold2, to thermodynamics and kinetics, which are the ultimate goals of theoretical biophysics.


