Optimised use of data fusion and memory-based learning with an Austrian soil library for predictions with infrared data
Citation
In: European Journal of Soil Science, Volume 74, Issue 4 (2023-06-24); eISSN: 1365-2389
Infrared spectroscopy in the visible to near-infrared (vis–NIR) and mid-infrared (MIR) regions is a well-established approach for the prediction of soil properties. Different data fusion and training approaches exist, but the optimal procedures are not yet defined and may depend on the heterogeneity present in the data set and on the scale considered. The objectives were to test the usefulness of partial least squares regression (PLSR) for predicting soil organic carbon (SOC), total carbon (Ct), total nitrogen (Nt) and pH from vis–NIR and MIR spectra in an independent validation, either after standard calibration (use of a general PLSR model) or using memory-based learning (MBL) with and without spiking, for a national spectral database. The data fusion approaches were simple concatenation of spectra, outer product analysis (OPA) and model averaging. In total, 481 soils from an Austrian forest soil archive were measured in the vis–NIR and MIR regions, and regressions were calculated. Fivefold calibration-validation approaches were carried out with a region-related split of spectra to implement independent validations, with n ranging from 47 to 99 soils in the different folds. MIR predictions were generally superior to vis–NIR predictions. For all properties, optimal predictions were obtained with data fusion, with OPA and spectra concatenation outperforming model averaging. The greatest robustness of performance was found for OPA and MBL with spiking, with R² ≥ 0.77 (Nt), 0.85 (SOC), 0.86 (pH) and 0.88 (Ct) in the validations of all folds. Overall, the results indicate that the combination of OPA for vis–NIR and MIR spectra with MBL and spiking has a high potential to accurately estimate properties when using large-scale soil spectral libraries as reference data. However, the reduced cost-effectiveness of using two spectrometers needs to be weighed against the potential increase in accuracy compared to a single MIR spectroscopy approach.
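The three data fusion approaches named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, array shapes and random placeholder spectra are assumptions chosen for illustration, and model averaging is shown as an unweighted mean, which may differ from the weighting used in the study.

```python
import numpy as np


def concatenate_spectra(vis_nir, mir):
    """Simple fusion: place both spectra side by side for each sample."""
    return np.hstack([vis_nir, mir])


def outer_product_fusion(vis_nir, mir):
    """OPA-style fusion: per-sample outer product, unfolded into one row."""
    # (n, p) and (n, q) -> (n, p, q) -> (n, p * q)
    outer = np.einsum("ip,iq->ipq", vis_nir, mir)
    return outer.reshape(vis_nir.shape[0], -1)


def average_predictions(pred_vis_nir, pred_mir):
    """Model averaging (assumed unweighted): mean of two single-sensor predictions."""
    return (np.asarray(pred_vis_nir) + np.asarray(pred_mir)) / 2.0


# Hypothetical placeholder data: 5 samples, 4 vis-NIR and 6 MIR variables.
rng = np.random.default_rng(0)
vis_nir = rng.random((5, 4))
mir = rng.random((5, 6))

fused_concat = concatenate_spectra(vis_nir, mir)  # one row of p + q values per sample
fused_opa = outer_product_fusion(vis_nir, mir)    # one row of p * q values per sample
```

The fused matrices would then serve as predictors for a regression such as PLSR. Because the outer product yields p × q variables per sample, the spectra would presumably need to be reduced in resolution before this step in practice.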
@article{doi:10.17170/kobra-202308058580,
  author = {Ludwig, Bernard and Greenberg, Isabel and Vohland, Michael and Michel, Kerstin},
  title = {Optimised use of data fusion and memory-based learning with an Austrian soil library for predictions with infrared data},
  keywords = {004 and 500 and Datenfusion and Validierung and Infrarotspektroskopie and Maschinelles Lernen and Stickstoff and Wasserstoffionenkonzentration and Kohlenstoff and TOC and Physikochemische Bodeneigenschaft},
  copyright = {http://creativecommons.org/licenses/by-nc-nd/4.0/},
  language = {en},
  journal = {European Journal of Soil Science},
  year = {2023-06-24}
}