Biophysical inference of epistasis and the effects of mutations on protein stability and function

Abstract

Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. While high-throughput methods have produced large numbers of sequence-function pairs, functional assays do not distinguish whether mutations directly affect function or destabilize the protein. Here, we introduce a statistical method to infer the underlying biophysics from a high-throughput binding assay by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high-quality data of protein G domain B1 binding to IgG-Fc. We infer an energy landscape with distinct folding and binding energies for each substitution, providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its non-linear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
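
To make concrete how additive energies can still produce epistasis in the binding fraction, here is a minimal sketch of a three-state (unfolded, folded-unbound, folded-bound) Boltzmann model. The abstract does not give the model's functional form, so the partition function below, the choice of the unfolded state as the zero of energy, the additive-scale epistasis measure, and every name and numerical value (`bound_fraction`, `RT`, the wild-type and mutational energies) are illustrative assumptions rather than the authors' implementation or fitted parameters.

```python
import math

# Assumed thermal energy in kcal/mol near 298 K (approximate, illustrative).
RT = 0.593


def bound_fraction(dg_fold, dg_bind, rt=RT):
    """Fraction of protein in the folded-and-bound state.

    Assumed three-state model, unfolded state taken as the energy reference:
      unfolded        -> weight 1
      folded, unbound -> weight exp(-dg_fold / rt)
      folded, bound   -> weight exp(-(dg_fold + dg_bind) / rt)
    """
    w_unfolded = 1.0
    w_folded = math.exp(-dg_fold / rt)
    w_bound = math.exp(-(dg_fold + dg_bind) / rt)
    return w_bound / (w_unfolded + w_folded + w_bound)


# Hypothetical wild-type energies (kcal/mol): stable fold, favorable binding.
WT_FOLD, WT_BIND = -3.0, -2.0

# Hypothetical substitutions: each destabilizes folding, neither touches binding.
DDG_A = (2.0, 0.0)  # (change in folding energy, change in binding energy)
DDG_B = (2.0, 0.0)


def variant_fraction(*mutations):
    """Binding fraction of a variant whose energies are strictly additive."""
    dg_fold = WT_FOLD + sum(m[0] for m in mutations)
    dg_bind = WT_BIND + sum(m[1] for m in mutations)
    return bound_fraction(dg_fold, dg_bind)


p_wt = variant_fraction()
p_a = variant_fraction(DDG_A)
p_b = variant_fraction(DDG_B)
p_ab = variant_fraction(DDG_A, DDG_B)

# Epistasis measured on the additive scale of binding fraction
# (one of several possible conventions, chosen here for illustration).
epistasis = p_ab - p_a - p_b + p_wt

print(f"wild type: {p_wt:.3f}, single mutants: {p_a:.3f}, {p_b:.3f}, double: {p_ab:.3f}")
print(f"epistasis in binding fraction: {epistasis:.3f}")
```

With these assumed values, each single mutant binds almost as well as the wild type because the fold retains a stability margin, while the double mutant exhausts that margin and loses noticeably more binding than the two single-mutant effects would predict. The energies add exactly; the epistasis comes entirely from the non-linear map from energy to binding fraction.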
