Sparse-limit approximation for t-statistics

Abstract

In a range of genomic applications, it is of interest to quantify the evidence that the signal at site i is active, given conditionally independent replicate observations summarized by the sample mean and variance (Y, s²) at each site. We study the version of the problem in which the signal distribution is sparse and the error distribution has an unknown site-specific variance, so that the null distribution of the standardized statistic is Student-t rather than Gaussian. The main contribution of this paper is a sparse-mixture approximation to the non-null density of the t-ratio. This formula demonstrates the effect of low degrees of freedom on the Bayes factor, or equivalently on the conditional probability that the site is active. We illustrate some differences on an HIV gene-expression dataset previously analyzed by Efron (2012).
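To make concrete why the Student-t null matters at low degrees of freedom, the following minimal sketch compares the null density of a t-ratio under the Student-t and Gaussian models. It does not implement the paper's sparse-mixture approximation. The function names (`t_ratio`, `student_t_pdf`, `gaussian_pdf`) and the standardization of the mean by s/√n with n − 1 degrees of freedom are assumptions made for illustration, not taken from the paper.

```python
import math


def t_ratio(values):
    """Return (t, df) for one site from its replicate observations.

    Assumption: the t-ratio is mean / sqrt(s2 / n) with df = n - 1,
    where s2 is the unbiased sample variance.
    """
    n = len(values)
    mean = sum(values) / n
    s2 = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(s2 / n), n - 1


def student_t_pdf(t, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def gaussian_pdf(t):
    """Density of the standard Gaussian distribution."""
    return math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    # Hypothetical replicate values for a single site.
    t, df = t_ratio([1.0, 2.0, 3.0])
    ratio = student_t_pdf(t, df) / gaussian_pdf(t)
    print(f"t = {t:.3f}, df = {df}, null density ratio t/Gaussian = {ratio:.1f}")
```

With few replicates, a large t-ratio is far less surprising under the Student-t null than under a Gaussian null, so evidence computed against a Gaussian null would overstate the case that the site is active.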
