blockCV: an R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models

Authors: Roozbeh Valavi, Jane Elith, José J. Lahoz-Monfort, Gurutzeta Guillera-Arroita

DOI: 10.1101/357798

Keywords: Covariate; Cross-validation; Data mining; Model selection; Species distribution; Spatial analysis; Fold (geology); Environmental niche modelling; Computer science

Abstract: When applied to structured data, conventional random cross-validation techniques can lead to underestimation of prediction error, and may result in inappropriate model selection. We present the R package blockCV, a new toolbox for cross-validation in species distribution modelling. The package can generate spatially or environmentally separated folds. It includes tools to measure spatial autocorrelation ranges in candidate covariates, providing the user with insights into the spatial structure in these data. It also offers interactive graphical capabilities for creating spatial blocks and exploring data folds. Package blockCV enables modellers to more easily implement a range of evaluation approaches. It will help the modelling community learn more about the impacts of evaluation approaches on our understanding of the predictive performance of species distribution models.
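The idea of spatially separated folds can be illustrated with a minimal sketch. This is not blockCV's implementation or API (the package itself is written in R); it is a plain-Python illustration under simple assumptions: points have planar coordinates, blocks are square grid cells, and whole blocks are dealt to folds at random. The function name `spatial_block_folds` and its parameters are hypothetical.

```python
import random


def spatial_block_folds(points, block_size, k, seed=0):
    """Assign each (x, y) point to one of k folds using square spatial blocks.

    Hypothetical helper for illustration only; not part of blockCV.
    """
    # Map every point to the grid cell (block) that contains it.
    block_of = [(int(x // block_size), int(y // block_size)) for x, y in points]

    # Shuffle the occupied blocks reproducibly, then deal them to folds
    # round-robin so that every point in a block lands in the same fold.
    blocks = sorted(set(block_of))
    random.Random(seed).shuffle(blocks)
    fold_of_block = {block: i % k for i, block in enumerate(blocks)}

    return [fold_of_block[block] for block in block_of]


# Six points forming three spatial clusters, one per 5 x 5 block.
points = [(0.5, 0.2), (0.9, 0.8), (5.1, 5.3), (5.7, 5.9), (10.2, 0.4), (10.8, 0.1)]
folds = spatial_block_folds(points, block_size=5.0, k=3)
```

Because neighbouring points share a block, they are never split between training and testing sets, which is what distinguishes this scheme from random k-fold assignment. The block size is the key design choice; the abstract's mention of measuring spatial autocorrelation ranges suggests one way to inform it.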
