Sequential Sampling for Optimal Bayesian Classification of Sequencing Count Data
Abstract
High-throughput technologies have become the practice of choice for comparative studies in biomedical applications. The limited number of sample points, due to sequencing cost or restricted access to organisms of interest, necessitates the development of efficient sample collection strategies to maximize the power of downstream statistical analyses. We propose a method for sequentially choosing training samples under the Optimal Bayesian Classification framework. Specifically designed for RNA sequencing count data, the proposed method takes advantage of an efficient Gibbs sampling procedure with closed-form updates. Our results show enhanced classification accuracy compared to random sampling.