Evaluation of tools for differential gene expression analysis by RNA-seq on a 48 biological replicate experiment

Abstract

An RNA-seq experiment with 48 biological replicates in each of 2 conditions was performed to determine the number of biological replicates (nr) required, and to identify the most effective statistical analysis tools for identifying differential gene expression (DGE). When nr = 3, seven of the nine tools evaluated give true positive rates (TPR) of only 20 to 40 percent. For high fold-change genes (|log2(FC)| > 2) the TPR is > 85 percent. Two tools performed poorly, over- or under-predicting the number of differentially expressed (DE) genes. Increasing replication gives a large increase in TPR when considering all DE genes but only a small increase for high fold-change genes. Achieving a TPR > 85 percent across all fold-changes requires nr > 20. For future RNA-seq experiments these results suggest nr > 6, rising to nr > 12 when identifying DGE irrespective of fold-change is important. For 6 < nr < 12, superior TPR makes edgeR the leading tool tested. For nr ≥ 12, minimizing false positives is more important and DESeq outperforms the other tools.
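To make the TPR figures above concrete, the following is a minimal sketch of how a true positive rate could be computed for all DE genes and for the high fold-change subset. It is an illustration only: the gene names, log2 fold-change values, "truth" set, and helper function names are all hypothetical, and the abstract does not say how the reference set of DE genes was defined in the study.

```python
def true_positive_rate(called, truth):
    """Fraction of the reference DE genes that a tool also called DE."""
    truth = set(truth)
    if not truth:
        raise ValueError("reference set of DE genes is empty")
    return len(truth & set(called)) / len(truth)


def high_fold_change(log2_fc, threshold=2.0):
    """Genes whose absolute log2 fold-change exceeds the threshold."""
    return {gene for gene, value in log2_fc.items() if abs(value) > threshold}


# Hypothetical toy data (not taken from the paper).
log2_fc = {"geneA": 3.1, "geneB": -2.5, "geneC": 0.4, "geneD": -0.7, "geneE": 1.2}
truth = {"geneA", "geneB", "geneC", "geneD"}   # assumed reference DE genes
called = {"geneA", "geneB", "geneE"}           # genes a tool reported as DE

# TPR over all reference DE genes: 2 of 4 recovered.
tpr_all = true_positive_rate(called, truth)

# TPR restricted to reference DE genes with |log2(FC)| > 2: 2 of 2 recovered.
truth_high = high_fold_change(log2_fc) & truth
tpr_high = true_positive_rate(called, truth_high)

print(f"TPR (all DE genes): {tpr_all:.2f}")
print(f"TPR (|log2(FC)| > 2): {tpr_high:.2f}")
```

In this toy case the low fold-change genes are the ones missed, which mirrors the pattern the abstract describes: the TPR for high fold-change genes is much higher than the TPR across all DE genes.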
