Authors: Stéphanie Manel, H. Ceri Williams, S.J. Ormerod
DOI: 10.1046/J.1365-2664.2001.00647.X
Keywords: Mathematics, Test data, Mode (statistics), Environmental data, Cohen's kappa, Kappa, Sampling (statistics), Statistic, Logistic regression, Ecology
Abstract: 1. Models for predicting the distribution of organisms from environmental data are widespread in ecology and conservation biology. Their performance is invariably evaluated from the percentage success at predicting occurrence at test locations. 2. Using logistic regression with real data from 34 families of aquatic invertebrates in 180 Himalayan streams, we illustrate how this widespread measure of predictive accuracy is affected systematically by the prevalence (i.e. the frequency of occurrence) of the target organism. Many evaluations of presence-absence models by ecologists are inherently misleading. 3. With the same invertebrate models, we examined alternative performance measures used in remote sensing and medical diagnostics. We particularly explored receiver-operating characteristic (ROC) plots, from which were derived (i) the area under each curve (AUC), considered an effective indicator of model performance independent of the threshold probability at which the presence of the target organism is accepted, and (ii) optimized probability thresholds that maximize the percentage of true absences and presences that are correctly identified. We also evaluated Cohen's kappa, a measure of the proportion of all possible cases of presence or absence that are predicted correctly after accounting for chance effects. 4. AUC measures from ROC plots were independent of prevalence, but highly significantly correlated with the much more easily computed kappa. Moreover, when applied in predictive mode to test data, models with thresholds optimized by ROC erroneously overestimated true occurrence among scarcer organisms, often those of greatest conservation interest. We advocate caution in using ROC methods to optimize thresholds required for real prediction. 5. Our strongest recommendation is that ecologists reduce their reliance on prediction success as a performance measure in presence-absence modelling. Cohen's kappa provides a simple, effective, standardized and appropriate statistic for evaluating or comparing presence-absence models, even those based on different statistical algorithms. None of the performance measures we examined tests the statistical significance of predictive accuracy, and we identify this as a priority area for research and development.
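Illustration (not from the paper): the contrast the abstract draws between percentage prediction success and Cohen's kappa can be sketched from a 2x2 confusion matrix. The function names and the site counts below are hypothetical and chosen only to show the effect of prevalence; this is a minimal sketch of the standard kappa formula, not the authors' own code or data.

```python
import math


def prediction_success(tp, fp, fn, tn):
    """Proportion of test sites whose presence or absence is predicted correctly."""
    n = tp + fp + fn + tn
    return (tp + tn) / n


def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between prediction and observation, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, computed from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Hypothetical scarce organism: present at 5 of 100 sites, and the model
# predicts "absent" everywhere. tp=0, fp=0, fn=5, tn=95.
scarce = dict(tp=0, fp=0, fn=5, tn=95)

# Hypothetical common organism: present at 50 of 100 sites, with a model
# that classifies 80 sites correctly. tp=40, fp=10, fn=10, tn=40.
common = dict(tp=40, fp=10, fn=10, tn=40)

for name, cm in (("scarce", scarce), ("common", common)):
    print(name,
          "success=%.2f" % prediction_success(**cm),
          "kappa=%.2f" % cohens_kappa(**cm))
```

In this made-up case the uninformative model for the scarce organism scores a higher prediction success than the useful model for the common one, while kappa ranks them the other way round, which is the kind of prevalence effect the abstract warns about.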