Efficient and error-tolerant schemes for non-adaptive complex group testing and its application in complex disease genetics
Abstract
The goal of combinatorial group testing is to efficiently identify up to d defective items in a large population of n items, where d n. Defective items satisfy certain properties while the remaining items in the population do not. To efficiently identify defective items, a subset of items is pooled and then tested. In this work, we consider complex group testing (CmplxGT) in which a set of defective items consists of subsets of positive items (called positive complexes). CmplxGT is classified into two categories: classical CmplxGT (CCmplxGT) and generalized CmplxGT (GCmplxGT). In CCmplxGT, the outcome of a test on a subset of items is positive if the subset contains at least one positive complex, and negative otherwise. In GCmplxGT, the outcome of a test on a subset of items is positive if the subset has a certain number of items of some positive complex, and negative otherwise. For CCmplxGT, we present a scheme that efficiently identifies all positive complexes in time t × poly(d, n) in the presence of erroneous outcomes, where t is a predefined parameter. As d n, this is significantly better than the currently best time of poly(t) × O(n n). Moreover, in specific cases, the number of tests in our proposed scheme is smaller than previous work. For GCmplxGT, we present a scheme that efficiently identifies all positive complexes. These schemes are directly applicable in various areas such as complex disease genetics, molecular biology, and learning a hidden graph.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.