Category theory for genetics II: genotype, phenotype and haplotype
Abstract
The overarching goal of this paper is to solve the word problem for a class of idempotent commutative monoids whose elements model population haplotypes. More specifically, we design an algebraic framework in which population stratification relationships can be unraveled and linkage disequilibrium can be inferred from algebraic equations between haplotypes, expressed in idempotent commutative monoids. We show how these relations can be used to clarify haplotype-phenotype associations by taking into account intermediate phenotypes and genetic mechanisms such as segregation and homologous recombination. The present work paves the way for the implementation of combinatorial GWAS in the study of complex traits, and for a framework in which one can infer interactions between genetic variants along with the corresponding regulatory circuitry. Throughout the paper, we formalize concepts such as genotypes, haplotypes and haplogroups, and model segregation and homologous recombination in the language of pedigrads (introduced in previous work). The benefit of using pedigrads is that they provide a computational framework in which one can reason about haplotype dynamics over several generations.