TY - JOUR
T1 - Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon
T2 - Assessing potential of airborne and spaceborne optical soil sensing
AU - Wang, Sheng
AU - Guan, Kaiyu
AU - Zhang, Chenhui
AU - Lee, Do Kyoung
AU - Margenot, Andrew J.
AU - Ge, Yufeng
AU - Peng, Jian
AU - Zhou, Wang
AU - Zhou, Qu
AU - Huang, Yizhi
N1 - Publisher Copyright: © 2022 Elsevier Inc.
PY - 2022/3/15
Y1 - 2022/3/15
AB - Soil organic carbon (SOC) is a key variable to determine soil functioning, ecosystem services, and global carbon cycles. Spectroscopy, particularly optical hyperspectral reflectance coupled with machine learning, can provide rapid, efficient, and cost-effective quantification of SOC. However, how to exploit soil hyperspectral reflectance to predict SOC concentration, and the potential performance of airborne and satellite data for predicting surface SOC at large scales remain relatively underknown. This study utilized a continental-scale soil laboratory spectral library (37,540 full-pedon 350–2500 nm reflectance spectra with SOC concentration of 0–780 g·kg⁻¹ across the US) to thoroughly evaluate seven machine learning algorithms including Partial-Least Squares Regression (PLSR), Random Forest (RF), K-Nearest Neighbors (KNN), Ridge, Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) along with four preprocessed spectra, i.e. original, vector normalization, continuum removal, and first-order derivative, to quantify SOC concentration. Furthermore, by using the coupled soil-vegetation-atmosphere radiative transfer model, we simulated twelve airborne and spaceborne hyper/multi-spectral remote sensing data from surface bare soil laboratory spectra to evaluate their potential for estimating SOC concentration of surface bare soils. Results show that LSTM achieved best predictive performance of quantifying SOC concentration for the whole data sets (R² = 0.96, RMSE = 30.81 g·kg⁻¹), mineral soils (SOC ≤ 120 g·kg⁻¹, R² = 0.71, RMSE = 10.60 g·kg⁻¹), and organic soils (SOC > 120 g·kg⁻¹, R² = 0.78, RMSE = 62.31 g·kg⁻¹). Spectral data preprocessing, particularly the first-order derivative, improved the performance of PLSR, RF, Ridge, KNN, and ANN, but not LSTM or CNN. We found that the SOC models of mineral and organic soils should be distinguished given their distinct spectral signatures. Finally, we identified that the shortwave infrared is vital for airborne and spaceborne hyperspectral sensors to monitor surface SOC. This study highlights the high accuracy of LSTM with hyperspectral/multispectral data to mitigate a certain level of noise (soil moisture < 0.4 m³·m⁻³, green leaf area < 0.3 m²·m⁻², plant residue < 0.4 m²·m⁻²) for quantifying surface SOC concentration. Forthcoming satellite hyperspectral missions like Surface Biology and Geology (SBG) have a high potential for future global soil carbon monitoring, while high-resolution satellite multispectral fusion data can be an alternative.
KW - Hyperspectral reflectance
KW - Long short-term memory
KW - Machine learning
KW - Radiative transfer modeling
KW - SBG
KW - Soil organic carbon
KW - Spectroscopy
UR - http://www.scopus.com/inward/record.url?scp=85123635557&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123635557&partnerID=8YFLogxK
U2 - 10.1016/j.rse.2022.112914
DO - 10.1016/j.rse.2022.112914
M3 - Article
AN - SCOPUS:85123635557
SN - 0034-4257
VL - 271
JO - Remote Sensing of Environment
JF - Remote Sensing of Environment
M1 - 112914
ER -