Confounder-robust causal discovery and inference in Perturb-seq using proxy and instrumental variables

Abstract

Emerging single-cell technologies that combine CRISPR-based genetic perturbations with single-cell RNA sequencing, such as Perturb-seq, offer unprecedented opportunities to uncover cause-and-effect relationships among genes. Nonetheless, Perturb-seq experiments are subject to unobserved factors that, if not properly handled, can severely bias the inferred causal relationships between genes. These latent factors may arise not only from intrinsic molecular features of the regulatory elements, but also from unmeasured genes omitted due to cost-constrained experimental designs. Although methods for analyzing large-scale Perturb-seq data are rapidly maturing, approaches that explicitly account for such unobserved confounders when inferring causal gene networks are still lacking. Here, we propose a novel approach to accurately reconstruct causal gene networks from Perturb-seq data even when important confounders are missing. Our framework leverages proxy and instrumental variable strategies to exploit the rich information embedded in the perturbations, enabling unbiased estimation of the underlying directed acyclic graph (DAG) of gene expression. Applications to both comprehensive synthetic data and real CRISPR interference experiments in K562 cells demonstrate that our method outperforms baseline approaches that lack principled adjustments for unmeasured confounding, yielding more accurate and biologically relevant recovery of the true causal DAGs.
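To make the instrumental-variable idea concrete, the following is a minimal two-gene sketch and not the method proposed in the paper. It assumes a linear model in which a binary perturbation indicator `z` knocks down gene `x`, an unobserved factor `u` affects both `x` and a downstream gene `y`, and the true effect of `x` on `y` is `beta`. All names, coefficients, and the simulation design are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n=20000, beta=0.5, seed=0):
    """Simulate one perturbed gene x, one downstream gene y, and a hidden confounder u."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)  # perturbation indicator (the instrument)
    u = rng.normal(size=n)                        # unobserved confounder, never used in estimation
    x = -1.0 * z + u + rng.normal(size=n)         # perturbation lowers expression of x
    y = beta * x + u + rng.normal(size=n)         # y depends on x and on the confounder
    return z, x, y

def ols_slope(x, y):
    """Naive regression slope of y on x; biased when u drives both genes."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

def iv_slope(z, x, y):
    """Ratio (Wald-type) IV estimate: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

if __name__ == "__main__":
    z, x, y = simulate()
    print("naive OLS estimate:", ols_slope(x, y))
    print("IV estimate:       ", iv_slope(z, x, y))
```

Under these assumptions the naive slope absorbs the confounder's contribution, while the ratio estimate stays close to `beta` because the perturbation is assigned independently of `u` and affects `y` only through `x`. Recovering a full DAG across many genes, and the use of proxy variables, goes beyond this sketch.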
