Bootstrap inference for linear regression between variables that are never jointly observed: application in in vivo experiments

Abstract

In modern experimental science, there is a common problem of estimating the coefficients of a linear regression in a context where the variables of interest cannot be observed simultaneously. When there is a categorical variable that is observed on all statistical units, we consider two estimators of linear regression that take this additional information into account: an estimator based on moments and an estimator based on optimal transport theory. These estimators are shown to be consistent and asymptotically Gaussian under weak hypotheses. The asymptotic variance has no explicit expression, except in some special cases, for which reason a stratified bootstrap approach is developed to construct confidence intervals for the estimated parameters, whose consistency is also shown. A simulation study evaluating and comparing the finite sample performance of these estimators demonstrates the advantages of the bootstrap approach in several realistic scenarios. An application to in vivo experiments, conducted in the context of studying radio-induced adverse effects in mice, revealed important relationships between the biomarkers of interest that could not be identified with the considered naive approach.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…