Flexible aggregation of compositional predictors with shared effects for microbiome association analysis
Abstract
Advances in microbiome profiling have provided unprecedented insight into the molecular dynamics of microbial communities and have sparked a surge of interest in the microbiome's role in human health. Identifying microbial features linked to clinical outcomes nevertheless remains challenging because microbiome data are high-dimensional, sparse, and compositional. In addition, many microbial taxa, although classified as distinct, may share functional roles, which complicates traditional variable selection. To overcome these obstacles, we introduce Bayesian Regression with Agglomerated Compositional Effects (BRACE), a novel approach that performs data-adaptive clustering and variable selection through a spike-and-cluster prior. The prior combines three components: Bernoulli activity indicators, an Ewens exchangeable partition prior on the finite active set, and a projection-based constrained Gaussian prior on the cluster effects. The methodological innovation lies in how the Ewens partition prior is combined with the projection-based constrained Gaussian on the cluster atoms to enforce the sum-to-zero constraint. BRACE groups microbial taxa with similar effects on the outcome, yielding more interpretable models while enabling effective dimension reduction. Through comprehensive simulations and a real-world application examining the influence of oral microbiome composition on insulin resistance, we demonstrate BRACE's superior performance over existing methods, particularly in identifying key features with shared effects on outcomes.
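To make the three prior components concrete, the following is a minimal sketch of one way to draw regression coefficients from a prior of this general shape. It is not the authors' implementation: the abstract does not specify the hyperparameters or the exact form of the projection, so the function name `sample_spike_and_cluster`, the parameters `prob_active`, `alpha`, and `tau`, and the choice of an orthogonal projection of the cluster atoms onto the set where the size-weighted atoms sum to zero are all assumptions made for illustration. The Ewens partition is sampled with the standard Chinese restaurant process.

```python
import numpy as np

def sample_spike_and_cluster(p, prob_active=0.3, alpha=1.0, tau=1.0, rng=None):
    """Draw (active, labels, beta) from an illustrative spike-and-cluster prior.

    All hyperparameters are hypothetical; the abstract does not state them.
    """
    rng = np.random.default_rng(rng)

    # 1. Bernoulli activity indicators: which taxa have a nonzero effect.
    active = rng.random(p) < prob_active
    idx = np.flatnonzero(active)

    # 2. Ewens partition of the active set via the Chinese restaurant process:
    #    join cluster k with weight counts[k], open a new one with weight alpha.
    labels = np.full(p, -1)  # -1 marks inactive taxa
    counts = []
    for j in idx:
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[j] = k

    # 3. Gaussian cluster atoms, projected so the coefficients sum to zero.
    #    Assumed form: orthogonal projection onto {theta : n . theta = 0},
    #    where n holds the cluster sizes.
    beta = np.zeros(p)
    if counts:
        n = np.array(counts, dtype=float)
        z = rng.normal(0.0, tau, size=len(n))
        theta = z - n * (n @ z) / (n @ n)
        beta[idx] = theta[labels[idx]]
    return active, labels, beta

active, labels, beta = sample_spike_and_cluster(p=50, rng=0)
```

Under this assumed projection, taxa in the same cluster share one coefficient, inactive taxa have a coefficient of exactly zero, and the coefficients sum to zero up to floating-point error. A draw with a single cluster necessarily has all coefficients equal to zero, since one shared effect cannot satisfy the constraint otherwise.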