Are Thousands of Samples Really Needed to Generate Robust Gene-List for Prediction of Cancer Outcome?

Abstract

The prediction of cancer prognosis and metastatic potential immediately after the initial diagnosis is a major challenge in current clinical research. The value of a gene signature that enables such prediction is clear: it would spare many patients the agony and toxic side effects of adjuvant chemotherapy that is prescribed to them automatically and sometimes carelessly. Motivated by this issue, Ein-Dor (2006) and Zuk (2007) presented a Bayesian model that leads to the following conclusion: thousands of samples are needed to generate a robust gene list for predicting outcome. This conclusion rests on certain statistical assumptions. The current work casts doubt on it by showing that (1) these assumptions are inconsistent with additional assumptions such as sparsity and Gaussianity, and (2) the empirical Bayes methodology suggested for testing the relevant assumptions fails to detect severe violations of the model assumptions, so the required sample size may be overestimated.
