Measurement-limited learning of conformational heterogeneity in cryo-electron microscopy
Abstract
Cryogenic electron microscopy images sample individual biomolecules from their conformational landscapes, offering a route to infer the distributions underlying molecular mechanisms. However, because images are indirect measurements, they limit which features of an underlying landscape are statistically identifiable. In ensemble reweighting, this problem appears as a choice of resolution: conformational space is discretized into representative structures whose population weights are inferred from images. Adding structures increases nominal resolution, but nearby conformations may generate overlapping image distributions and indistinguishable weights. Here, we develop an information-theoretic framework that selects representative conformations by maximizing mutual information between ensemble weights and images under a probabilistic forward model. Analytically, we show in a one-dimensional Gaussian model that measurement noise sets the optimal spacing. Applied to molecular conformations sampled from simulation, the framework constructs near-optimal ensembles that span heterogeneity while avoiding redundancy. Thus, the measurement process induces a maximally learnable coarse graining of conformation space.
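The claim that nearby conformations produce overlapping image distributions can be illustrated with a toy one-dimensional calculation. The sketch below is not the paper's objective, which is the mutual information between ensemble weights and images. It computes a simpler stand-in: the mutual information between a uniformly chosen conformation label and a single noisy scalar observation of it. The function name `label_image_information`, its parameters, and the equal-weight assumption are all hypothetical choices made for illustration.

```python
import numpy as np


def label_image_information(centers, sigma, n_grid=20001):
    """Mutual information (in nats) between a conformation label and a 1D image.

    Toy assumption: the label k is drawn uniformly from `centers`, and the
    "image" is the scalar y = centers[k] + Gaussian noise of std `sigma`.
    """
    centers = np.asarray(centers, dtype=float)
    # Integration grid wide enough that the Gaussian tails are negligible.
    y = np.linspace(centers.min() - 8.0 * sigma, centers.max() + 8.0 * sigma, n_grid)
    dy = y[1] - y[0]

    # Likelihood p(y | k) for every center, shape (K, n_grid).
    lik = np.exp(-0.5 * ((y[None, :] - centers[:, None]) / sigma) ** 2)
    lik /= sigma * np.sqrt(2.0 * np.pi)

    # Marginal p(y) under equal weights, and its entropy by a Riemann sum.
    marginal = lik.mean(axis=0)
    safe = np.where(marginal > 0, marginal, 1.0)  # treat 0 * log(0) as 0
    h_y = -np.sum(marginal * np.log(safe)) * dy

    # H(Y | K) is the entropy of the Gaussian noise itself.
    h_y_given_k = 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)
    return h_y - h_y_given_k


# Two conformations at increasing spacing, measured in units of the noise level.
sigma = 1.0
for spacing in (0.5, 1.0, 2.0, 4.0, 8.0):
    info = label_image_information([0.0, spacing], sigma)
    print(f"spacing = {spacing:4.1f} sigma, information = {info:.4f} nats")
```

In this toy setting the information depends only on the ratio of spacing to noise: it vanishes when the two conformations coincide and approaches log 2 (one full bit) once they are separated by many noise widths. That is the sense in which, under these assumptions, the noise level sets a natural scale for how finely conformations can usefully be spaced.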