Author: John D. Storey
DOI:
Keywords: Statistics, Mathematics, Type I and type II errors, Statistical significance, False positive paradox, Null hypothesis, Frequentist inference, Statistical hypothesis testing, False discovery rate, Multiple comparisons problem
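Abstract: In hypothesis testing, statistical significance is typically based on calculations involving p-values and Type I error rates. A p-value calculated from a single statistical hypothesis test can be used to determine whether there is statistically significant evidence against the null hypothesis. The upper threshold applied to the p-value in making this determination (often 5% in the scientific literature) determines the Type I error rate; i.e., the probability of making a Type I error when the null hypothesis is true. Multiple hypothesis testing is concerned with testing several statistical hypotheses simultaneously. Defining statistical significance is a more complex problem in this setting. A longstanding definition of statistical significance for multiple hypothesis tests involves the probability of making one or more Type I errors among the family of hypothesis tests, called the family-wise error rate. However, there exist other well established formulations of statistical significance for multiple hypothesis tests. The Bayesian framework for classification naturally allows one to calculate the probability that each null hypothesis is true given the observed data (Efron et al. 2001, Storey 2003), and several frequentist definitions of multiple hypothesis testing significance are also well established (Shaffer 1995). Soric (1989) proposed a framework for quantifying the statistical significance of multiple hypothesis tests based on the proportion of Type I errors among all hypothesis tests called statistically significant. He called statistically significant hypothesis tests discoveries and proposed that one be concerned about the rate of false discoveries when testing multiple hypotheses. This false discovery rate is robust to the false positive paradox and is particularly useful in exploratory analyses, where one is more concerned with having mostly true findings among a set of statistically significant discoveries rather than guarding against one or more false positives. Benjamini & Hochberg (1995) provided the first implementation of false discovery rates with known operating characteristics. The idea of quantifying the rate of false discoveries is directly related to several pre-existing ideas, such as Bayesian misclassification rates and the positive predictive value (Storey 2003).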
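The contrast the abstract draws between the family-wise error rate and the false discovery rate can be made concrete with a minimal Python sketch. It is an illustration only and is not taken from the abstracted work: the function names `family_wise_error_rate` and `benjamini_hochberg` are hypothetical, the family-wise formula assumes independent tests in which every null hypothesis is true, and the second function follows the step-up procedure as it is commonly described for Benjamini & Hochberg (1995).

```python
def family_wise_error_rate(alpha, m):
    """Probability of at least one Type I error among m tests.

    Assumption: the m tests are independent, every null hypothesis is
    true, and each test uses the same per-test threshold alpha.
    """
    return 1.0 - (1.0 - alpha) ** m


def benjamini_hochberg(p_values, q=0.05):
    """Flag discoveries with the Benjamini-Hochberg step-up rule.

    Returns a list of booleans aligned with p_values; True marks a test
    that is called a discovery at target false discovery rate q.
    """
    m = len(p_values)
    # Indices of the p-values ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is a discovery, even if
    # its own p-value missed its own rank-specific threshold.
    discoveries = [False] * m
    for idx in order[:cutoff_rank]:
        discoveries[idx] = True
    return discoveries


if __name__ == "__main__":
    # With 20 independent true nulls at alpha = 0.05, the chance of at
    # least one false positive is far above 5%.
    print(family_wise_error_rate(0.05, 20))

    # Made-up p-values, for illustration only.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, q=0.05))
```

The first function shows why a per-test 5% threshold does not carry over to a family of tests; the second accepts some false positives in exchange for bounding their expected proportion among the discoveries, which matches the exploratory setting the abstract describes.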