Author: Halbert White
Keywords: Null hypothesis, Model selection, Artificial intelligence, Statistical theory, Set (abstract data type), Benchmark (computing), Mathematics, Machine learning, Inference, Statistical hypothesis testing, Econometrics, Mistake
Abstract: Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. When such data reuse occurs, there is always the possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. This problem is practically unavoidable in the analysis of time-series data, as typically only a single history measuring a given phenomenon of interest is available for analysis. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided, but in fact it is endemic. The main problem has been a lack of sufficiently simple practical methods capable of assessing the potential dangers of data snooping in a given situation. Our purpose here is to provide such methods by specifying a straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This permits data snooping to be undertaken with some degree of confidence that one will not mistake results that could have been generated by chance for genuinely good results.
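
Illustrative sketch (not part of the abstract): the abstract does not spell out the testing procedure, so the code below is only one possible reading of it, under stated assumptions. It assumes the search produces, for each of k candidate models, a time series of performance differentials relative to the benchmark (positive means the candidate did better), and it tests the null hypothesis that even the best candidate has no superiority by comparing the largest scaled mean differential with its bootstrap distribution. The use of a stationary block bootstrap, the restart probability `q`, and the function names `stationary_bootstrap_indices` and `reality_check_pvalue` are assumptions introduced for illustration.

```python
import numpy as np


def stationary_bootstrap_indices(n, q, rng):
    """Draw n time indices in blocks of random (geometric) length.

    With probability q a new block starts at a random position;
    otherwise the previous index is advanced by one, wrapping at n.
    """
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    restart = rng.random(n) < q
    for t in range(1, n):
        idx[t] = rng.integers(n) if restart[t] else (idx[t - 1] + 1) % n
    return idx


def reality_check_pvalue(f, n_boot=1000, q=0.1, seed=0):
    """Bootstrap p-value for H0: no candidate beats the benchmark.

    f is an (n, k) array; f[t, j] is the performance of candidate j
    minus the performance of the benchmark in period t.
    """
    f = np.asarray(f, dtype=float)
    if f.ndim == 1:
        f = f[:, None]  # a single candidate model
    n, _ = f.shape
    rng = np.random.default_rng(seed)

    f_bar = f.mean(axis=0)
    # Observed statistic: the best candidate's scaled mean differential.
    v_obs = np.sqrt(n) * f_bar.max()

    v_boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = stationary_bootstrap_indices(n, q, rng)
        # Recentre at the sample mean so the bootstrap mimics the null.
        v_boot[b] = np.sqrt(n) * (f[idx].mean(axis=0) - f_bar).max()

    # Share of bootstrap maxima that exceed the observed maximum.
    return float(np.mean(v_boot > v_obs))
```

Taking the maximum over all k candidates inside each bootstrap draw is what accounts for the search itself: a small p-value then indicates that the best model's advantage is unlikely to be a by-product of having tried many specifications on the same data.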