TY - JOUR
T1 - Using random forests to explore the feasibility of groundwater knowledge transfer between the contiguous US and Denmark
AU - Ma, Yueling
AU - Koch, Julian
AU - Maxwell, Reed M.
N1 - Publisher Copyright: © 2024 The Author(s). Published by IOP Publishing Ltd.
PY - 2024/12
Y1 - 2024/12
N2 - Groundwater is our largest freshwater reservoir, playing an important role in the global hydrologic cycle. Lack of reliable groundwater data restricts the development of global groundwater monitoring systems linking observations with modeling at spatial scales relevant for local decision making. Despite the growing interest in machine learning (ML) for groundwater resource modeling, taking ML models to the global scale remains an outstanding challenge due to sparse groundwater data. The contiguous US (CONUS) has extensive groundwater information covering a wide range of hydrogeologic settings. We hypothesize that an ML model trained on the CONUS is transferable to other regions, and thus can be used to produce a global water table depth (WTD) map within the bounds of transferability. To test this hypothesis, we conduct a study on transferring groundwater knowledge between the CONUS and Denmark, using several random forest models trained against ∼30 m resolution long-term mean WTD data. The joint model trained on data from the CONUS and Denmark outperforms the individual models trained separately, implying similarities within global groundwater systems. The largest improvement occurs in Denmark, where the testing Nash-Sutcliffe efficiency rises from 0.68 to 0.95. SHapley Additive exPlanations (SHAP) values are utilized to express the importance of input variables. While annual mean precipitation plays a key role in the joint model and the model for the CONUS, it is the second least important input variable in the model for Denmark, where local processes dominate. Moreover, Köppen-Geiger climate classification shows a significant impact on the model testing performance and the importance ranking of input variables, suggesting that it might be a missing input variable in the applied random forest models. This study provides unique insights into future ML model developments towards global groundwater monitoring and improves our confidence in producing a hyper-resolution global WTD map for sustainable freshwater management.
KW - contiguous US
KW - Denmark
KW - global groundwater modeling
KW - groundwater
KW - machine learning
KW - transfer learning
KW - water table depth
UR - http://www.scopus.com/inward/record.url?scp=85212613248&partnerID=8YFLogxK
U2 - 10.1088/2515-7620/ad9b08
DO - 10.1088/2515-7620/ad9b08
M3 - Article
AN - SCOPUS:85212613248
SN - 2515-7620
VL - 6
JO - Environmental Research Communications
JF - Environmental Research Communications
IS - 12
M1 - 121005
ER -