Authors: Andy Stock, Ajit Subramaniam
Keywords:
Abstract: Monitoring phytoplankton community composition from space is an important challenge in ocean remote sensing. Researchers have proposed several algorithms for this purpose. However, the in-situ data used to train and validate such algorithms at the global scale are often clustered along ship cruise tracks and in some well-studied locations, whereas many large marine regions have no in-situ data at all. Furthermore, oceanographic variables are typically spatially auto-correlated. In this situation, the common practice of validating algorithms with randomly chosen held-out observations can underestimate errors. Based on a global database of in-situ HPLC data, we applied supervised learning methods to test empirical algorithms for predicting the relative concentrations of eight diagnostic pigments that serve as biomarkers for different phytoplankton types. For each pigment, we trained three types of empirical satellite algorithms distinguished by their input data: abundance-based (using only chlorophyll-a as input), spectral (using remote sensing reflectance), and ecological (combining reflectance and environmental variables). The algorithms were implemented as statistical models (smoothing splines, polynomials, random forests, and boosted regression trees). To address the clustering of the data and spatial auto-correlation, we tested the algorithms by means of spatial block cross-validation. This provided a less confident picture of the potential for mapping diagnostic pigments, and hence the associated phytoplankton types, using existing satellite data than suggested by some previous research and by a 5-fold cross-validation conducted for comparison. Of the eight diagnostic pigments, only two (fucoxanthin and zeaxanthin) could be predicted with considerably lower errors than a constant null model; the others could not. Thus, global-scale algorithms based on existing, multi-spectral satellite data and commonly available environmental variables can estimate some diagnostic pigment concentrations and hence distinguish some broad phytoplankton classes, but are likely inaccurate for some classes and in some regions. Overall, the ecological algorithms had the lowest prediction errors. Finally, our results suggest that more discussion of the best approaches for training and validating satellite algorithms is needed if the in-situ data are unevenly distributed in the study region and spatially clustered.
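The difference between random held-out validation and spatial block cross-validation can be illustrated with a minimal sketch. This is not the authors' implementation. The 10-degree block size and the function names `block_id`, `spatial_block_folds`, and `null_model_rmse` are assumptions chosen for illustration, and the coordinates are synthetic. The idea is that all observations falling in the same latitude/longitude block are held out together, so a model is never tested on points that sit next to its training points. The constant null model simply predicts the training mean.

```python
import math
import random
from collections import defaultdict


def block_id(lat, lon, block_deg=10.0):
    """Return the (row, col) index of the block_deg x block_deg grid cell
    containing the coordinate. The block size is an illustrative assumption."""
    return (math.floor((lat + 90.0) / block_deg),
            math.floor((lon + 180.0) / block_deg))


def spatial_block_folds(coords, k=5, block_deg=10.0, seed=0):
    """Yield (train_idx, test_idx) pairs in which whole spatial blocks are
    held out together. If there are fewer occupied blocks than k, some
    folds will have an empty test set."""
    by_block = defaultdict(list)
    for i, (lat, lon) in enumerate(coords):
        by_block[block_id(lat, lon, block_deg)].append(i)
    blocks = sorted(by_block)
    random.Random(seed).shuffle(blocks)  # random assignment of blocks to folds
    for fold in range(k):
        test_blocks = set(blocks[fold::k])
        test_idx = sorted(i for b in test_blocks for i in by_block[b])
        train_idx = sorted(i for b in blocks if b not in test_blocks
                           for i in by_block[b])
        yield train_idx, test_idx


def null_model_rmse(y, train_idx, test_idx):
    """RMSE of a constant null model that always predicts the training mean."""
    mean = sum(y[i] for i in train_idx) / len(train_idx)
    return math.sqrt(sum((y[i] - mean) ** 2 for i in test_idx) / len(test_idx))


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic, clustered sample locations (illustration only, not HPLC data).
    centers = [(-40, -150), (-10, -30), (20, 60), (35, -70), (50, 170), (0, 100)]
    coords = [(lat + rng.uniform(-3, 3), lon + rng.uniform(-3, 3))
              for lat, lon in centers for _ in range(20)]
    # Synthetic response that varies smoothly with latitude, plus noise.
    y = [0.01 * lat + rng.gauss(0, 0.05) for lat, _ in coords]
    for train_idx, test_idx in spatial_block_folds(coords, k=5):
        if test_idx:
            print(len(train_idx), len(test_idx),
                  round(null_model_rmse(y, train_idx, test_idx), 3))
```

Any candidate model would be scored on the same folds and compared with the null model's error. A regular grid is only one way to define blocks; other block shapes and sizes are possible, and the choice affects the error estimate.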