Machine Learning for Modeling Soil Organic Carbon as Affected by Land Cover Change in the Nebraska Sandhills, USA

Lidong Li, Wanwan Liang, Tala Awada, Jeremy Hiller, Michael Kaiser

Research output: Contribution to journal › Article › peer-review


Land cover change can affect soil organic carbon (SOC) concentrations in both top- and subsoils. Here, we propose to implement emerging machine learning (ML) techniques to predict SOC concentrations in the soil profile (0–300 cm). Specifically, we assessed the use of the newly developed light gradient boosting machine (LGBM) and compared its accuracy in predicting SOC to seven commonly used algorithms: gradient boosting machine (GBM), extreme gradient boosting machine, multilayer perceptron, random forest, support vector machine, LASSO, and multiple linear regression. Soil samples were collected under three vegetation covers (native open C4 grass community; eastern redcedar, Juniperus virginiana; and ponderosa pine, Pinus ponderosa) at six soil depths (0–10, 10–30, 30–100, 100–170, 170–240, and 240–300 cm). We determined SOC concentration, soil bulk density, plant-available nutrients, pH, Na, electrical conductivity, and cation exchange capacity, and used these soil properties as predictors to model SOC. Compared to the native grasslands, ponderosa pine and eastern redcedar stands exhibited increases in measured SOC in topsoil but declines in subsoil. The ML models accurately simulated these variations. The best-performing models were the gradient boosted tree-based models, i.e., GBM and LGBM (R² = 0.920 and 0.918, respectively). The GBM- and LGBM-predicted SOC, averaged across all vegetation covers and soil depths, were 2.362 and 2.343 g kg⁻¹, respectively, compared to the measured mean SOC of 2.360 g kg⁻¹. Our study confirmed the negative effects of land cover change on SOC concentrations in subsoil. We also validated the accuracy of a newly developed machine learning algorithm, LGBM, in SOC modeling, as well as its responsiveness to land cover change and soil depth. Our results contribute to a better understanding of SOC dynamics and to the development of integrative environmental management plans.
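
The abstract reports model accuracy as R² and as the difference between predicted and measured mean SOC. The following is a minimal sketch of how such metrics can be computed, assuming R² denotes the coefficient of determination (the abstract does not state the exact definition used). The function names `r_squared` and `mean_bias` and the sample values are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    m = mean(measured)
    # Residual sum of squares: squared error of each prediction
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    # Total sum of squares: spread of the measurements around their mean
    ss_tot = sum((y - m) ** 2 for y in measured)
    if ss_tot == 0:
        raise ValueError("measured values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def mean_bias(measured, predicted):
    """Predicted mean minus measured mean, in the units of the inputs."""
    return mean(predicted) - mean(measured)


# Hypothetical SOC concentrations (g kg^-1) for six depths; NOT data from the study.
measured = [6.0, 3.0, 2.0, 1.0, 1.0, 1.0]
predicted = [5.5, 3.5, 2.0, 1.0, 1.5, 0.5]

print(f"R^2 = {r_squared(measured, predicted):.3f}")
print(f"mean bias = {mean_bias(measured, predicted):.3f} g kg^-1")
```

A near-zero mean bias, as with the reported 2.362 versus 2.360 g kg⁻¹ for GBM, indicates that errors cancel on average; R² additionally reflects how well the variation across vegetation covers and depths is reproduced.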

Original language: English (US)
Pages (from-to): 535-547
Number of pages: 13
Journal: Environmental Modeling and Assessment
Issue number: 3
State: Published - Jun 2024

Keywords


  • Eastern redcedar
  • Light gradient boosting machine
  • Multilayer perceptron
  • Ponderosa pine
  • Random forest
  • Semiarid native grasslands

ASJC Scopus subject areas

  • General Environmental Science

