Principal Components Decomposition of Fraction of Variance Explained in High Dimensional Linear Models with Strong Correlation

Abstract

The fraction of variance explained (FVE) in a linear model quantifies the extent to which predictors account for outcome variability. In high-dimensional settings, where traditional FVE estimators do not apply, modern FVE estimators such as GWASH or the linear mixed-effects model estimated by restricted maximum likelihood (LMM-REML) struggle with strong correlation among predictors, as is often found, for example, in brain imaging data. We propose a decomposition framework that partitions the FVE into two components: a low-dimensional component that captures the strong correlation and is estimable by low-dimensional methods, and a high-dimensional component with the remaining weak correlation that is estimable by high-dimensional methods. Simulations show that separating out the dominant principal components (PCs) and estimating the high-dimensional FVE with GWASH or LMM-REML reduces bias relative to applying GWASH or LMM-REML directly. The method performs consistently as both the number of predictors and the number of samples increase. We illustrate the method in an analysis of the Adolescent Brain Cognitive Development (ABCD) brain imaging dataset, capturing nuanced heritability signals in the FVE of cognitive measures predicted by high-resolution brain imaging data.
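The partition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`split_by_top_pcs`, `adjusted_r2`), the simulated one-factor data, the choice `k = 1`, and the use of classical adjusted R^2 for the low-dimensional component are all assumptions made for illustration. The high-dimensional component is only prepared here: the residual predictors would be passed to a high-dimensional estimator such as GWASH or LMM-REML, neither of which is implemented below.

```python
import numpy as np

def split_by_top_pcs(X, k):
    """Split column-centered X into top-k PC scores and the remaining predictors.

    Hypothetical helper: returns (scores, X_resid), where scores is n x k and
    X_resid is X with the top-k principal components projected out.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]            # coordinates on the dominant PCs
    X_resid = Xc - scores @ Vt[:k]       # weakly correlated remainder
    return scores, X_resid

def adjusted_r2(Z, y):
    """Classical adjusted R^2 of y regressed on Z with an intercept (needs n > k + 1)."""
    n, k = Z.shape
    Z1 = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    rss = np.sum((y - Z1 @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))

# Simulated data (assumption): one strong shared factor induces strong correlation.
rng = np.random.default_rng(0)
n, p, k = 500, 200, 1
factor = rng.normal(size=(n, 1))
X = 3.0 * factor @ np.ones((1, p)) + rng.normal(size=(n, p))
beta = rng.normal(scale=0.05, size=p)
y = X @ beta + rng.normal(size=n)

scores, X_resid = split_by_top_pcs(X, k)

# Low-dimensional component: k predictors, so a classical estimator applies.
fve_low = adjusted_r2(scores, y)

# High-dimensional component: X_resid (n x p) would be handed to a
# high-dimensional FVE estimator (e.g. GWASH or LMM-REML); not implemented here.
print(f"low-dimensional FVE estimate: {fve_low:.3f}")
```

Because the PC scores are orthogonal to the residual predictors by construction, the two parts of the design do not share variation, which is what lets the two components be estimated by separate methods in this sketch.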
