False Discovery Rate.

Author: John D. Storey

DOI:

Keywords: Statistics, Mathematics, Type I and type II errors, Statistical significance, False positive paradox, Null hypothesis, Frequentist inference, Statistical hypothesis testing, False discovery rate, Multiple comparisons problem

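Abstract: In hypothesis testing, statistical significance is typically based on calculations involving p-values and Type I error rates. A p-value calculated from a single statistical hypothesis test can be used to determine whether there is statistically significant evidence against the null hypothesis. The upper threshold applied to the p-value in making this determination (often 5% in the scientific literature) determines the Type I error rate; i.e., the probability of making a Type I error when the null hypothesis is true. Multiple hypothesis testing is concerned with testing several statistical hypotheses simultaneously. Defining statistical significance is a more complex problem in this setting. A longstanding definition of statistical significance for multiple hypothesis tests involves the probability of making one or more Type I errors among the family of hypothesis tests, called the family-wise error rate. However, there exist other well established formulations of statistical significance for multiple hypothesis tests. The Bayesian framework for classification naturally allows one to calculate the probability that each null hypothesis is true given the observed data (Efron et al. 2001, Storey 2003), and several frequentist definitions of multiple hypothesis testing significance are also well established (Shaffer 1995). Soric (1989) proposed a framework for quantifying the statistical significance of multiple hypothesis tests based on the proportion of Type I errors among all hypothesis tests called statistically significant. He called statistically significant hypothesis tests discoveries and proposed that one be concerned about the rate of false discoveries when testing multiple hypotheses. This false discovery rate is robust to the false positive paradox and is particularly useful in exploratory analyses, where one is more concerned with having mostly true findings among a set of statistically significant discoveries rather than guarding against one or more false positives. Benjamini & Hochberg (1995) provided the first implementation of false discovery rates with known operating characteristics. The idea of quantifying the rate of false discoveries is directly related to several pre-existing ideas, such as Bayesian misclassification rates and the positive predictive value (Storey 2003).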
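
As a rough illustration of the kind of procedure the abstract attributes to Benjamini & Hochberg (1995), the sketch below applies the commonly described step-up rule: sort the m p-values, find the largest rank k whose p-value is at most (k / m) * alpha, and call the k smallest p-values significant. The function name `benjamini_hochberg`, the default `alpha`, and the example p-values are illustrative assumptions and do not come from the article.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is called significant.

    A minimal sketch of the step-up rule, assuming independent (or positively
    dependent) tests; names and defaults here are hypothetical.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank k with p_(k) <= (k / m) * alpha; 0 if none qualifies.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank

    # Call the largest_k smallest p-values significant (the "discoveries").
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Made-up p-values for ten hypothetical tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
discoveries = benjamini_hochberg(example, alpha=0.05)
print(sum(discoveries), "of", len(example), "tests called significant")
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be called significant when a larger p-value meets a later threshold; for instance, `benjamini_hochberg([0.04, 0.045, 0.049, 0.0499])` flags all four tests even though only the last one meets its threshold.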

References (17)
Newton E. Morton. Sequential tests for the detection of linkage. American Journal of Human Genetics, vol. 7, pp. 277-318 (1955).
Daniel Yekutieli, Yoav Benjamini. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, vol. 29, pp. 1165-1188 (2001). DOI: 10.1214/AOS/1013699998
J. D. Storey, R. Tibshirani. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America, vol. 100, pp. 9440-9445 (2003). DOI: 10.1073/PNAS.1530509100
B. Devlin, Kathryn Roeder. Genomic control for association studies. Biometrics, vol. 55, pp. 997-1004 (1999). DOI: 10.1111/J.0006-341X.1999.00997.X
J. T. Leek, J. D. Storey. A general framework for multiple testing dependence. Proceedings of the National Academy of Sciences of the United States of America, vol. 105, pp. 18718-18723 (2008). DOI: 10.1073/PNAS.0808709105
Bradley Efron, Robert Tibshirani, John D. Storey, Virginia Tusher. Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, vol. 96, pp. 1151-1160 (2001). DOI: 10.1198/016214501753382129
John D. Storey. The optimal discovery procedure: a new approach to simultaneous significance testing. Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 69, pp. 347-368 (2007). DOI: 10.1111/J.1467-9868.2007.005592.X
John D. Storey. A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 64, pp. 479-498 (2002). DOI: 10.1111/1467-9868.00346
Branko Sorić. Statistical “Discoveries” and Effect-Size Estimation. Journal of the American Statistical Association, vol. 84, pp. 608-610 (1989). DOI: 10.1080/01621459.1989.10478811