
Expert Systems With Applications 170 (2021) 114498


Interpretable vs. noninterpretable machine learning models for data-driven hydro-climatological process modeling

Debaditya Chakraborty a,*, Hakan Başağaoğlu b, James Winterle b

a Department of Construction Science, University of Texas at San Antonio, 501 W César E Chávez Blvd., San Antonio, TX 78207, USA
b Edwards Aquifer Authority, 900 E. Quincy St., San Antonio, TX 78215, USA

Keywords: Deep learning; Boosting; Transfer learning; Hydroclimate; Reference crop evapotranspiration; Model explainability

ABSTRACT

Due to their enhanced predictive capabilities, noninterpretable machine learning (ML) models (e.g. deep learning) have recently gained a growing interest in analyzing and modeling earth & planetary science data. However, noninterpretable ML models are often treated as "black boxes" by end-users, which could limit their applicability in critical decision-making processes. In this paper, we compared the predictive capabilities of three interpretable ML models with three noninterpretable ML models to answer the overarching question: Is it essential to use noninterpretable ML models for enhanced model predictions from hydro-climatological datasets? The ML model development and comparative analysis were performed using measured climate data and synthetic reference crop evapotranspiration (ETo) data, with varying levels of missing values, from five weather stations across the karstic Edwards aquifer region in semi-arid south-central Texas. Our analysis revealed that interpretable tree-based ensemble models produce comparable results to noninterpretable deep learning models on structured hydro-climatological datasets. We showed that the tree-based ensemble model is also capable of imputing varying levels of missing climate data at the weather stations, employing the newly developed sequential transfer-learning technique. We applied an explainable machine learning (eXML) framework to quantify the global order of importance of hydro-climatic (predictor) variables on ETo, while highlighting the local dependencies and interactions amongst the predictors and ETo. The eXML framework also revealed the inflection points of the climate variables at which the transition from low to high daily ETo rates occurs. The ancillary explainability of ML models is expected to increase users' confidence and support any future decision-making process in water resource management.

1. Introduction

Modern machine learning (ML) algorithms can generate data-driven models and create their own representations of the physical world to produce accurate predictions. They have exhibited significant progress in modeling and predicting nonlinear hydrological variables such as evapotranspiration, evaporation, rainfall, streamflow, runoff, and groundwater level, and capturing the hidden patterns in noisy earth & planetary datasets (Behzad, Asghari, Eazi, & Palhang, 2009; Guo, Zhou, Qin, Zou, & Li, 2011; Taormina, Chau, & Sethi, 2012; Sánchez-Monedero, Salcedo-Sanz, Gutiérrez, Casanova-Mateo, & Hervás-Martínez, 2014; Nourani, Baghanam, Adamowski, & Kisi, 2014; Raghavendra & Deka, 2014; Goyal, Bharti, Quilty, Adamowski, & Pandey, 2014; Feng, Cui, Gong, Zhang, & Zhao, 2017; Cramer, Kampouridis, Freitas, & Alexandridis, 2017; Saggi & Jain, 2019; Chia, Huang, Koo, & Fung, 2020; Granata, Gargano, & de Marinis, 2020). However, the structural complexity of the ML models that bestows such high predictive accuracy is the reason why they are difficult to interpret (Adadi & Berrada, 2018). Formally, interpretability of ML models is determined by whether the model has process transparency and discernible internal logic to allow the users to study and understand how inputs are mathematically mapped to outputs (Doran, Schulz, & Besold, 2017; Guidotti et al., 2018). For example, a linear regression (LR) model can be interpreted using the covariate weights to evaluate the relative importance of the

The code (and data) in this article has been certified as Reproducible by the CodeOcean: https://codeocean.com. More information on the Reproducibility Badge
Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
* Corresponding author.
E-mail addresses: [email protected] (D. Chakraborty), [email protected] (H. Başağaoğlu), [email protected]
(J. Winterle).

https://doi.org/10.1016/j.eswa.2020.114498
Received 28 May 2020; Received in revised form 11 December 2020; Accepted 16 December 2020
Available online 24 December 2020
0957-4174/© 2020 Elsevier Ltd. All rights reserved.

features on the model predictions. In contrast, deep learning (DL) neural networks (Bengio et al., 2009) automatically learn the input features that subsequently undergo several layers of nonlinear transformations, making them noninterpretable to the end-users. From a theoretical point of view, interpretable models such as extreme gradient boosting (XGBoost) (Chen & Guestrin, 2016) and random forest (RF) (Breiman, 2001) provide certain advantages over the "black-box" type DL, SVM (Boser, Guyon, & Vapnik, 1992), or long short-term memory (LSTM) (Hochreiter & Schmidhuber, 1997) models. These advantages include enhanced model interpretability, in addition to reduced complexity and computational time, which could increase their broad adoption in diverse applications.

Recent research suggests that DL approaches, such as recurrent neural networks (RNN) & LSTM, could potentially overcome the memory or lag effects limitations in regression-based tasks associated with other ML methods such as RF & kernel methods (Reichstein et al., 2019). However, if the end goal is the long-term prediction of hydrological processes under climate change scenarios, then including a lag effect in the model may cause more harm than good due to error accumulation effects, as shown in this paper. Therefore, we evaluate in this paper whether interpretable ML models such as XGBoost (an ensemble of simple regression trees) can perform as well as noninterpretable DL models on structured tabular hydroclimatic datasets.

In order to evaluate whether it is essential to apply noninterpretable ML models to produce good predictions, we developed three interpretable ML models, including linear regression (LR), RF, & XGBoost, and three noninterpretable ML models, including support vector machine (SVM), LSTM, & DL models, to predict the reference crop evapotranspiration (ETo) from local climate data, with varying levels of missing values, acquired from five weather stations across the karstic Edwards aquifer region in semi-arid south-central Texas. Subsequently, we analyzed and compared their predictive performances to evaluate whether the noninterpretable ML models provide any advantage over the interpretable ML models or vice versa. We also investigated whether the memory or lag effects in LSTM models could lead to enhanced prediction accuracy when compared to predictions from other ML models without memory or lag effects. Another novel contribution of this paper is the training of custom grid-search cross-validated XGBoost models by employing a sequential transfer-learning technique to predict and impute ~2-month- and ~11-month-long continuous spectra of missing data at 15-min intervals at two weather stations, using data from the neighboring weather stations.

The rest of the paper is organized as follows: Section 2 provides a domain background on ETo and an existing method that we use to generate synthetic ETo data from historical local climate data; Section 3 outlines the eXplainable machine learning (eXML) framework that we adopted for this research; Section 4 describes the data sources across the Edwards aquifer region and the data processing steps; Section 5 outlines and reports the development and implementation of the sequential transfer-learning models to impute a continuous spectrum of missing data; Section 6 reports the settings and the predictive performance of interpretable and noninterpretable ML methods; Section 7 describes the implementation of the SHAP and LIME eXML framework to understand the interpretable models, in addition to evaluating the explainability and trustworthiness of the eXML method via domain knowledge and statistical inference; and Section 8 presents our main conclusions.

2. Domain background

Reference Crop Evapotranspiration (ETo):

Evapotranspiration (ET) is a key hydrological variable that accounts for 60–70% of the global precipitation and affects the water, energy, and carbon cycles (Mishra et al., 2017; Lemordant, Gentine, Swann, Cook, & Scheff, 2018; McMahon, Peel, Lowe, & Srikanthan, 2013). Reference crop evapotranspiration (ETo) represents the atmospheric evaporative demand, which accounts for local climate-associated uncertainties in ET, and has been used for the quantification of future drought risk under climate change (Rind, Goldberg, Hansen, Rosenzweig, & Ruedy, 1990; Dewes, Rangwala, Barsugli, Hobbins, & Kumar, 2017). ETo can be used to calculate ET when combined with soil moisture levels, or to estimate crop evapotranspiration under normal growing conditions when combined with a crop coefficient that varies as a function of crop characteristics and soil moisture condition (McMahon et al., 2013). Thus, it is critical to understand the variations and drivers of ETo & ET, especially under the impacts of climate change (Kingston, Todd, Taylor, Thompson, & Arnell, 2009; Xu, Yu, Yang, Ji, & Zhang, 2018; Sun et al., 2020), which is expected to intensify the hydrological cycle, and thus, alter ET (Jung et al., 2010). Consequently, it is imperative to be able to produce interpretable data-driven ETo models that better explain the watershed-scale relationship between the ETo and climatic features while maintaining high long-term prediction accuracy. Such long-term predictions of ETo, using predictive ML models, could serve as a reliable indicator of the intensity of the hydrological cycle & enhance our understanding of hydro-climatic changes (Xu et al., 2018). Unfortunately, due to spatially and temporally limited data, the long-term regional trends in ETo & ET, and consequently the future freshwater availability (Kingston et al., 2009), remain uncertain under the impacts of climate change-induced temperature increases (Rigden & Salvucci, 2017).

FAO Penman–Monteith Method for Synthetic ETo Data Generation:

To facilitate the comparative analysis of the predictive accuracy of interpretable and noninterpretable ML models, we generated time series of ETo data using the FAO Penman–Monteith method (PMM) (Allen, Pereira, Raes, & Smith, 1998), which is commonly used to calculate watershed-scale ETo from a surface with abundant water, covered by a hypothetical grass reference crop fully shading the ground. The PMM-computed ETo represents the atmospheric evaporative demand. Scheff and Frierson (2015) reported 10–30% deviations in PMM-based ETo estimates when monthly climate data were used instead of 3-hourly climate data. Therefore, hourly climate data were used in this study to minimize deviations in ETo estimates that may arise from coarser temporal resolutions of the input data. In the FAO PMM, hourly ETo [mm/hr] is computed by

$$ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\frac{37}{T_a + 273}\,u_2\,(e^o - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)}, \tag{1}$$

where Δ is the slope of the saturation vapor pressure curve [kPa °C-1], Rn is the net radiation [MJ/(m2 hr)], G is the soil heat flux density [MJ/(m2 hr)], γ is the psychrometric constant [kPa °C-1], Ta is the air temperature [°C], eo is the saturation vapor pressure [kPa], ea is the actual vapor pressure [kPa], and u2 is the wind speed at 2 m above the ground surface [m/s]. γ = 0.665 × 10-3 P, in which P is the atmospheric pressure [kPa]. Rns = (1 − α)Rs, in which α is the albedo that determines the fraction of the measured solar radiation, Rs [MJ/(m2 hr)], reflected by the surface. eo = 0.6108 exp(Ta*), ea = eo (RH)/100, and Δ = 2503.058 exp(Ta*)/(Ta + 237.3)2, in which RH is the relative humidity [–] and Ta* = 17.27 Ta/(Ta + 237.3). Rn = Rns − Rnl, in which Rns is the measured net incoming shortwave radiation and Rnl is the outgoing longwave radiation [MJ/(m2 hr)] computed as

$$R_{nl} = \sigma\left[\frac{T_{a,max}^4 + T_{a,min}^4}{2}\right]\left(0.34 - 0.14\sqrt{e_a}\right)\left(1.35\,\frac{R_s}{R_{so}} - 0.35\right), \tag{2}$$

where σ is the Stefan–Boltzmann constant (2.043 × 10-10 MJ/(K4 m2 hr)), Ta,max and Ta,min are the maximum and minimum absolute air temperatures during the 24-h period [K], and Rso is the clear-sky radiation [MJ/(m2 hr)]. Linearized Beer's radiation law leads to Rso = (0.75 + 2 × 10-5 z) Ra, in which z is the elevation of the weather station above sea level [m] and Ra is the extraterrestrial radiation [MJ/(m2 hr)].


(Rs/Rso) is the relative shortwave radiation, representing the cloud cover. 0.33 < Rs/Rso < 1.0, in which the lower bound represents dense cloud cover and the upper bound represents the clear sky on a particular day. Ra is given by

$$R_a = \frac{720\,G_{sc}\,d_r}{\pi}\left[(\omega_2 - \omega_1)\sin(\varphi)\sin(\delta) + \cos(\varphi)\cos(\delta)\,(\sin(\omega_2) - \sin(\omega_1))\right], \tag{3}$$

where Gsc is the solar constant [0.0820 MJ/(m2 min)], dr is the inverse relative distance earth–sun [–], δ is the solar declination [rad], φ is the latitude of the weather station [rad], and ω1 and ω2 are the solar time angles at the beginning and end of the period [rad]. Here, dr = 1 + 0.033 cos(2πJ/365) and δ = 0.409 sin(2πJ/365 − 1.39), in which J is the day count of the year. The solar time angle at the midpoint of the hourly period, ω [rad], is given by

$$\omega = \frac{\pi}{12}\left[\left(t + 0.06667\,(L_z - L_m) + S_c\right) - 12\right], \tag{4}$$

in which t is the standard clock time at the midpoint of the hourly period [hr], Lz = 90 for central Texas, Lm is the longitude of the weather station [degrees], and Sc is the seasonal correction for solar time [hr], given by Sc = 0.1645 sin(2b) − 0.1255 cos(b) − 0.025 sin(b), in which b = 2π(J − 81)/364. ω1 = ω − (πt1/24) and ω2 = ω + (πt1/24), where t1 is the length of the calculation period [hr]. In hourly ETo calculations, Ra = 0 when the sun is below the horizon, at ω < −ωs or ω > ωs. When the sun is above the horizon (Ra > 0), G = 0.1Rn corresponds to smaller heat outfluxes, resulting in soil warming during day times. In contrast, when the sun is below the horizon (Ra = 0), G = 0.5Rn corresponds to larger heat outfluxes, resulting in soil cooling at night. Moreover, u2 ⩾ 0.5 m/s to account for the effects of boundary layer instability and buoyancy of air in allowing the exchange of vapour at the surface when air is calm. Because daily ETo were used in the subsequent ML analyses, σ(T4a,max + T4a,min)/2 was implemented instead of σT4a in [K] in Eq. (2) to simplify the calculations, which resulted in negligible root mean square errors of < 1 × 10-3 mm/day.

Hourly-averaged Ta, RH, P, u2, ea, and eo, and hourly-aggregated Rs are used to calculate the radiative and aerodynamic terms in Eq. (1). Hourly ETo calculations are tedious and operate on long time series of local climate data. In addition, when working with input data at low temporal resolution, any sensor failure at the weather station could lead to a large number of missing data, which could hamper the accuracy of hourly ETo, and hence, ET and drought predictions. Therefore, ML models are being increasingly applied to extract patterns and insights from the ever-increasing stream of spatial and temporal data for ETo predictions (Gocic, Petković, Shamshirband, & Kamsin, 2016; Antonopoulos & Antonopoulos, 2017; Mehdizadeh, Behmanesh, & Khalili, 2017; Wu, Zhou, Ma, Fan, & Zhang, 2019). The daily ETo prediction is a regression-type problem aiming at forecasting continuous-response values from multivariate input sequences and feedback, and hence requires supervised ML methods.
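The core of the synthetic-data generation, Eq. (1), can be scripted directly. The sketch below is a minimal Python illustration of the hourly PMM calculation for given net radiation and soil heat flux; the function and variable names are our assumptions for illustration and are not the authors' released implementation (which is available on Code Ocean). Rn and G would be obtained from Eqs. (2)–(4) and the measured Rs.

```python
import numpy as np

def fao_pmm_hourly_eto(Ta, RH, P, u2, Rn, G):
    """Hourly reference crop evapotranspiration [mm/hr] following Eq. (1).

    Ta : air temperature [deg C]       RH : relative humidity [%]
    P  : atmospheric pressure [kPa]    u2 : wind speed at 2 m [m/s]
    Rn : net radiation [MJ/(m^2 hr)]   G  : soil heat flux density [MJ/(m^2 hr)]
    """
    u2 = np.maximum(u2, 0.5)                                 # lower bound on wind speed (see text)
    Ta_star = 17.27 * Ta / (Ta + 237.3)
    e_sat = 0.6108 * np.exp(Ta_star)                         # saturation vapor pressure e^o [kPa]
    e_act = e_sat * RH / 100.0                               # actual vapor pressure e_a [kPa]
    delta = 2503.058 * np.exp(Ta_star) / (Ta + 237.3) ** 2   # slope of the saturation curve [kPa/degC]
    gamma = 0.665e-3 * P                                     # psychrometric constant [kPa/degC]
    num = 0.408 * delta * (Rn - G) + gamma * (37.0 / (Ta + 273.0)) * u2 * (e_sat - e_act)
    return num / (delta + gamma * (1.0 + 0.34 * u2))

# Example for a single clear mid-afternoon hour (illustrative input values only)
print(fao_pmm_hourly_eto(Ta=30.0, RH=40.0, P=98.0, u2=2.0, Rn=2.0, G=0.2))
```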
3. The eXplainable Machine Learning (eXML) framework

The eXplainable ML (eXML) framework used to predict watershed-scale ETo in this paper is displayed in Fig. 1. The first step is data processing, in which hourly climate data (predictor variables) from local weather stations and the hourly ETo computed by Eq. (1) (target variable) were combined and formatted to create the database, as discussed in Section 4. Subsequently, data quality checks were performed to identify and eliminate potential outliers and find the missing values in each data type. The newly developed sequential transfer-learning technique was used next to impute long stretches of missing values, which is described in Section 5. The complete data were split into training and testing data sets, which were employed to train and test the predictive performance of 3 interpretable and 3 noninterpretable ML models, discussed in Section 6, in estimating ETo from local climate variables.

Fig. 1. Graphical description of the eXplainable ML (eXML) framework.

The global and local explainability of the interpretable ML model with the highest predictive accuracy was analyzed using the SHAP and LIME eXML frameworks discussed in Section 7. SHAP revealed the global order of importance of the predictor variables in estimating the target (ETo) and highlighted the local dependencies and interactions amongst the predictor and target variables. We further quantified the inflection points of the predictor variables that trigger the transition between high and low ETo values, and assessed the fidelity of these results based on inferential statistical techniques.

4. Data processing and evaluation

We acquired the local climate data from five weather stations, operated by the Edwards Aquifer Authority (EAA), to compute the ETo across the Edwards aquifer region in semi-arid south-central Texas. The weather stations included the Cibolo Nature Center (CNC), Bandera County River Authority and Groundwater District Office (BCRAGDO),


Fig. 2. The location of the EAA’s weather stations at which historical climate data at the 15-min intervals are acquired. Seco Sinkhole Dam (SSD) and San Geronimo
Dam (SGD) stations with the most missing climate data are located on or near the aquifer recharge zone. Groundwater flow is from west to east. The Comal springs
and the San Marcos springs in the east are the spring outlets.

Four Sisters Ranch (FSR), Seco Sinkhole Dam (SSD), and San Geronimo Dam (SGD) (Fig. 2). At these stations, the air temperature (Ta), relative humidity (RH), atmospheric pressure (P), wind speed (u2), and incoming shortwave radiation (Rs) were recorded at 15-min intervals since August 2015. The exact start day of data recording varied among the stations from 8/19/2015 to 8/31/2015. We used the local climate data until 4/1/2020 for ETo computations.
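Since Eq. (1) operates on hourly inputs (Section 2), the 15-min records are first reduced to hourly-averaged Ta, RH, P, and u2 and hourly-aggregated Rs. A minimal pandas sketch of that step is given below; the file name, column names, and DataFrame layout are illustrative assumptions rather than the EAA station file formats.

```python
import pandas as pd

# 15-min records indexed by timestamp (hypothetical file name and column headers)
df = pd.read_csv("station_15min.csv", parse_dates=["timestamp"], index_col="timestamp")

hourly = pd.DataFrame({
    "Ta": df["Ta"].resample("1H").mean(),            # hourly-averaged air temperature
    "RH": df["RH"].resample("1H").mean(),            # hourly-averaged relative humidity
    "P":  df["P"].resample("1H").mean(),             # hourly-averaged pressure
    "u2": df["u2"].resample("1H").mean(),            # hourly-averaged wind speed
    "Rs": df["Rs"].resample("1H").sum(min_count=4),  # hourly-aggregated radiation; NaN if any 15-min value is missing
})
```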
Among the weather stations, FSR has the least missing records (0.003%); therefore, the climate data at FSR (Fig. 3) can be used to quantify the order of importance of climatic features on ETo. Similar to FSR, BCRAGDO & CNC have missing records of less than 0.1% (Table 1), and hence, the data from these stations can be used to impute missing data at neighboring stations. The variations in the daily and monthly ETo at FSR, computed using Eq. (1), were found within the ranges of 0.1–7.8 mm and 37–200 mm, respectively (Fig. 3), which are the representative ranges in south-central Texas. SGD, however, has ~3.8% of missing records, including 6,079 consecutive records at the 15-min intervals from 6/2/2016 to 8/4/2016. SSD is the most challenging station, where ~20% of the climate records at the 15-min intervals are absent and ~10% of Rs at the 15-min intervals from the available records are missing. Specifically, 32,247 consecutive records from 4/30/2019 to 4/1/2020 (~11 months) were unavailable. In addition, 5,485, 3,341, and 3,125 consecutive Rs data from 1/26/2016 to 3/23/2016, from 6/2/2016 to 7/7/2016, and from 4/29/2017 through 6/1/2017, respectively, were missing at SSD.

From a practical decision-making standpoint, ETo with enhanced accuracy at the SGD & SSD sites is critical, as they are located at (or near) the recharge zone (Fig. 2). ETo along the recharge zone would affect the availability of water for aquifer recharge, which could have direct consequences on the sustainability of the aquifer for consumptive water uses and the protection of habitats for groundwater-bound endangered and threatened species at the major spring outlets (Devitt, Wright, Cannatella, & Hillis, 2019). When long stretches of missing climate data were imputed using standard linear interpolation schemes, missing Rs in late spring and summer months resulted in suspicious spikes in computed daily ETo with 80–100% deviations at SGD & SSD (Fig. 4).

Fig. 3. Hourly local climate data at FSR, and daily and monthly ETo computed from Eq. (1).


Table 1
Climate data availability at the chosen weather stations.

Weather Station | Missing Records (%) | Missing Ta (%)* | Missing RH (%)* | Missing u2 (%)* | Missing P (%)* | Missing Rs (%)*
CNC             | 0.019               | 0.0             | 0.0             | 0.0             | 0.0            | 0.0
BCRAGDO         | 0.011               | 0.0             | 0.0             | 0.0             | 0.0            | 0.0
FSR             | 0.003               | 0.0             | 0.0             | 0.0             | 0.0            | 0.0
SGD             | 3.764               | 0.0             | 0.0             | 0.0             | 0.12           | 0.0
SSD             | 20.131              | 0.0             | 0.0             | 0.0             | 0.0            | 9.99

* Missing fraction of the particular data type in the available climate records.

Fig. 4. Daily ETo at the SGD & SSD sites when long stretches of missing climate data were imputed by linear interpolation. Time intervals with long missing data are highlighted.

Such large deviations could result in inaccurate soil moisture, irrigation water demand, and aquifer recharge estimates, and hence, ill-informed water management plans. Moreover, ETo at SSD is not computable beyond 4/30/2019 due to missing climate records, underscoring the need for an advanced data imputation technique at these logistically important weather stations for more accurate ETo estimates.

5. Missing data imputation using sequential transfer-learning

From a ML implementation standpoint, missing data negatively impacts the prediction accuracy and may lead to erroneous results & conclusions. In a recent study, Saggi and Jain (2019) used an off-the-shelf MissForest iterative algorithm with a random forest estimator to impute randomly missing values. But the pivotal challenge in our study arises from having ~2 months of continuous missing records at SGD, and ~11 months of continuous missing records, in addition to ~10% missing Rs data, at SSD. Although a multivariate imputation by chained equations (MICE) (Buuren & Groothuis-Oudshoorn, 2010) approach hybridized with ML estimators showed potential to handle missing (water quality) data (Ratolojanahary et al., 2019), our analysis revealed that a similar off-the-shelf iterative imputation algorithm (Pedregosa et al., 2011) hybridized with a random forest estimator was ill-suited for imputing the missing climate data at SGD & SSD, as shown in Fig. 5. In order to overcome this challenge, we trained custom grid-search cross-validated XGBoost models using the sequential transfer learning technique to predict the continuous spectrum of missing data at 15-min intervals at

Fig. 5. (a) Daily Ta , RH, Rs at SGD & SSD after imputing the missing climate data using an off-the-shelf iterative algorithm with a RF estimator, which shows that off-
the-shelf iterative algorithms are ill-suited for imputing long stretches of missing climate data. Note that besides these long stretches, there were other occurrences of
randomly missing datapoints (few hours to few days) for the variables in the datasets. (b) Daily Ta , RH, Rs at SGD & SSD after imputing the missing climate data using
a sequential transfer learning technique with custom grid-search cross-validated XGBoost models.


SGD & SSD, using data from the neighboring weather stations (CNC, BCRAGDO & FSR).
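For reference, the off-the-shelf iterative imputation baseline mentioned above can be assembled from scikit-learn (Pedregosa et al., 2011) in a few lines. The sketch below is our illustration of that baseline, not the exact configuration evaluated in the study; the synthetic array simply makes the snippet runnable, and in practice the columns would hold the target and neighboring stations' climate records with NaNs marking missing values. As discussed in the text, this scheme struggled with the multi-month gaps, motivating the sequential transfer-learning approach.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # stand-in for stacked climate columns
X[rng.random(X.shape) < 0.1] = np.nan         # sprinkle random gaps for the demo

imputer = IterativeImputer(estimator=RandomForestRegressor(n_estimators=100, n_jobs=-1),
                           max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)          # round-robin regression on the observed entries
```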
We trained an XGBoost model to learn the dynamic relationship between the non-missing Ta data at the SGD, BCRAGDO, and CNC stations. Subsequently, we predicted the missing Ta data at SGD using the respective trained XGBoost models. Similar steps were implemented to impute the missing P values in the SGD dataset. Next, we trained a model to learn the dynamic relationship between the non-missing RH, Ta, and P at SGD, and then, using the trained model, we predicted the missing RH from the corresponding Ta & P data that were predicted in the previous steps. Then, we modeled the non-missing Rs with respect to the Ta, P, & RH at SGD, and subsequently predicted the missing Rs using the predicted Ta, RH, and P. We modeled the non-missing u2 with respect to the Ta, P, RH, & Rs at SGD, and then predicted the missing u2 using the predicted Ta, P, RH, and Rs. For imputing the missing data in the SSD dataset, we first trained an XGBoost model to learn the dynamic relationship between the non-missing Ta data at the SSD, BCRAGDO, and FSR stations. Then the missing data at SSD were predicted using the same sequential transfer learning technique as that for the SGD station. We also added the monthly, daily, and hourly variations of the climatic features in the XGBoost models by adding the month of year, day of month, and hour of day as individual features.
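A condensed sketch of the sequential scheme described above is given below for the SGD station. The file names, helper function, hyperparameters, and DataFrame layout are illustrative assumptions rather than the authors' released implementation; the key point is the chaining of one XGBoost regressor per variable, each reusing the values imputed in the previous steps plus the calendar features.

```python
import pandas as pd
from xgboost import XGBRegressor

def fit_predict(train_X, train_y, pred_X):
    """Fit one XGBoost regressor on the non-missing rows and predict the missing ones."""
    model = XGBRegressor(n_estimators=500, learning_rate=0.05, n_jobs=-1)
    model.fit(train_X, train_y)
    return model.predict(pred_X)

# 15-min station records on a common timestamp index (hypothetical file names)
sgd = pd.read_csv("SGD_15min.csv", parse_dates=["timestamp"], index_col="timestamp")
cnc = pd.read_csv("CNC_15min.csv", parse_dates=["timestamp"], index_col="timestamp")
bcragdo = pd.read_csv("BCRAGDO_15min.csv", parse_dates=["timestamp"], index_col="timestamp")

neighbors = pd.concat([cnc.add_prefix("CNC_"), bcragdo.add_prefix("BCRAGDO_")], axis=1)
calendar = pd.DataFrame({"month": sgd.index.month, "day": sgd.index.day, "hour": sgd.index.hour},
                        index=sgd.index)

# Steps 1-2: Ta and P at SGD predicted from the neighboring stations' records
for var in ["Ta", "P"]:
    X = neighbors.join(calendar)
    miss = sgd[var].isna()
    sgd.loc[miss, var] = fit_predict(X[~miss], sgd.loc[~miss, var], X[miss])

# Steps 3-5: RH from (Ta, P); Rs from (Ta, P, RH); u2 from (Ta, P, RH, Rs)
for var, feats in [("RH", ["Ta", "P"]), ("Rs", ["Ta", "P", "RH"]), ("u2", ["Ta", "P", "RH", "Rs"])]:
    X = sgd[feats].join(calendar)
    miss = sgd[var].isna()
    sgd.loc[miss, var] = fit_predict(X[~miss], sgd.loc[~miss, var], X[miss])
```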
Using the novel sequential transfer learning technique described above, the ML model was able to 'learn' the relationships amongst the local climate variables at the neighboring stations with more complete data, and then 'transfer' this knowledge to the targeted SGD & SSD stations to impute long stretches of a continuous spectrum of missing data. The predictive accuracy of the trained XGBoost models reported in Table 2 shows that R2 > 0.82 for each imputed climatic feature. In addition, the predictive accuracy (R2) at both weather stations is comparable, underscoring the robustness and generalizability of our novel transfer learning technique. This sequential prediction of the missing Ta, P, RH, Rs, and u2 data took about 8.8 & 16.1 hr for the SGD & SSD model training, respectively. For each of the five climate variables at the respective sites, we evaluated 108 unique hyperparameter candidates over 3-fold cross validations, generating 324 + 1 fits for each, which equates to 1,625 trials for each site on an Intel Core i9-9980XE CPU and 64 GB RAM.

Table 2
Predictive accuracy of the sequential transfer learning-based XGBoost models used to impute the missing data at SGD & SSD, reported on a testing dataset that was unseen during the model training.

         | Ta   | P    | RH   | Rs   | u2
R2 – SGD | 0.96 | 0.98 | 0.99 | 0.87 | 0.83
R2 – SSD | 0.99 | 1.00 | 0.96 | 0.88 | 0.84
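The grid search mentioned above can be reproduced in spirit with scikit-learn's GridSearchCV wrapped around an XGBoost regressor. The grid below is a hypothetical 108-candidate example chosen only to match the reported count, not the exact grid evaluated in the study, and the synthetic regression data stand in for the non-missing records at a target station.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

# Stand-in training data; in practice these are the non-missing records at the target station
X_train, y_train = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)

param_grid = {                                  # 3 x 3 x 3 x 4 = 108 candidates (illustrative values)
    "n_estimators": [200, 500, 1000],
    "max_depth": [4, 6, 8],
    "learning_rate": [0.01, 0.05, 0.1],
    "subsample": [0.6, 0.7, 0.8, 1.0],
}
search = GridSearchCV(XGBRegressor(objective="reg:squarederror", n_jobs=-1),
                      param_grid, cv=3, scoring="r2", refit=True)   # 108 x 3 CV fits + 1 refit
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```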
After imputing the missing data at SGD & SSD with the novel sequential transfer learning-based XGBoost model, ETo at these sites computed from Eq. (1) (Fig. 6) were free of suspicious spikes, unlike in Fig. 4, in which the imputation was carried out by linear interpolation. Moreover, the overall trend of the monthly ETo computed using the imputed data at SGD & SSD conforms well to the monthly ETo at CNC, BCRAGDO & FSR in Fig. 7. These results suggest that our proposed sequential transfer learning framework can be applied at different sites to overcome missing-data-related challenges as long as there is adequate data from nearby station(s).

6. Performance analysis of predictive ML models

In this section, we comparatively analyze the interpretable and noninterpretable ML approaches, including Linear Regression (LR), Random Forest (RF), & eXtreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Long Short Term Memory (LSTM), & Deep Learning Neural Networks (DL). The model details are shown in Fig. 8.

The performance of the ML models (i.e. the predictive accuracy and the capability to extrapolate on unseen (testing) data) is reported and compared in terms of the root mean square error (RMSE) and correlation coefficient (R2),

$$RMSE = \sqrt{\frac{\sum_{k=1}^{n}(T_k - P_k)^2}{n}}; \qquad R^2 = 1 - \frac{\sum_{k=1}^{n}(T_k - P_k)^2}{\sum_{k=1}^{n}(T_k - \mu)^2}, \tag{5}$$

where Tk & Pk are the kth target & predicted ETo, respectively, n is the total number of data points in the testing dataset, 1 ⩽ k ⩽ n, and μ is the mean of the targets in the testing dataset. Lower values of RMSE and higher values of R2 are desirable.

The predictive performance of the interpretable ML models (LR, RF, & XGBoost) and noninterpretable ML models (SVM, LSTM, & DL) are compared in Table 3, based on the RMSE and R2 measures on the test data. Table 3 revealed that the predictive accuracy of the noninterpretable DL model is the best, while the performance of the interpretable XGBoost is comparable to the DL. RF, the second best interpretable model, resulted in comparable but slightly lower predictive accuracy than DL and XGBoost. The third best interpretable predictor, LR, displayed comparable predictive accuracy to the second best noninterpretable predictor, SVM. Moreover, the predictive accuracy of a simpler linear model, LR, surpassed the predictive accuracy of the more sophisticated LSTM model with 1–14 day lags at all weather stations. The comparatively poor performance of the LSTM can be attributed to the accumulation of errors over longer periods due to its autoregressive nature. This is a major drawback from a long-term hydrological decision-making perspective because the error accumulation could make model predictions untrustworthy in real-life applications. Since the random initial conditions for LSTM and DL models can result in different results each time a given configuration is trained, we ran the experimental scenarios 10 times for each model. The variance in the RMSE and R2 with the LSTM and DL models is reported in Table 3. It should be noted that the entire testing data at the SSD site was imputed using the technique described in Section 5. The corresponding models were trained on the actual sensor data and tested on the imputed data. For the SGD site, the models were trained on both the actual sensor & the imputed data, and subsequently tested on the actual sensor data.

From a computational efficiency standpoint, the training times for the LSTM ETo model for CNC, BCRAGDO, FSR, SGD, and SSD were recorded as 10, 9, 7.3, 7.13, and 7 min, respectively. The training times for the DL were 0.9, 0.9, 0.81, 0.8, and 0.8 min at CNC, BCRAGDO, FSR, SGD, and SSD, respectively. RF ETo models took 0.01 min to train for all sites. SVM, LR, and XGBoost took a fraction of a second to train on the daily ETo data for all sites. It is worth noting that XGBoost is built with OpenMP support, which is a shared-memory multiprocessing application programming interface. In other words, OpenMP allows XGBoost to efficiently use all the CPU/GPU cores in parallel while training, making it extremely fast. In addition, XGBoost presorts the independent variables at the beginning of the training process, which further reduces the training complexity and computational time (Chen & He, 2015). As discussed in Section 5, missing data imputation at the SGD & SSD sites was the most computationally demanding process (~8–16 h) in this entire research project.

Although LSTM models have demonstrated reasonable performance in financial market predictions (Fischer & Krauss, 2018), our results suggest that autoregressive-type models for ETo predictions may not be the most suitable, especially if the objective is to employ the resulting model predictions for high-stake hydrological decisions geared towards long-term sustainability. Moreover, an in-depth inspection revealed that the LR model made unrealistic negative daily ETo predictions for the winter months in the semi-arid region, thus failing to represent the hydroclimatic processes accurately. In agreement with our findings from the ML-based hydroclimatic analyses in Table 3, comparable predictive accuracy of XGBoost to DL was reported by Fernández-Delgado et al.


Fig. 6. Daily ETo at SGD & SSD after long stretches of missing climate data were imputed using the sequential transfer-learning model. The imputed portions of the data are highlighted.

(2019) based on ML predictions on 33 'large-difficult' data sets (Table 10 in their paper) and by Fernández-Delgado et al. (2019) for solar radiation predictions.

Fig. 7. Comparison of monthly ETo computed using the imputed climate data at SSD & SGD against monthly ETo computed using nearly-complete climate data at CNC, BCRAGDO, & FSR.

In other domains, an XGBoost model reportedly outperformed RF in predicting 30-day mortality in critically ill influenza patients (Hu et al., 2020), and DL exhibited high prediction accuracy, outperforming the SVM model in stroke prediction from a large-scale population-based electronic medical claims database (Hung, Chen, Lai, Lin, & Lee, 2017). Moreover, XGBoost and RF reportedly outperformed DL and SVM in forecasting the peak demand days for cardiovascular disease-related hospital admissions based on historical hospital admissions data, air quality data, and meteorological data (Qiu et al., 2020). Similarly, XGBoost achieved higher accuracy than RF and DL for predicting the concentration of fine particulate matter with diameter less than 2.5 μm in the air, which is associated with lung cancer, cardiovascular, respiratory, and metabolic diseases (Zamani Joharestani, Cao, Ni, Bashir, & Talebiesfandarani, 2019). Vanhaeren et al. (2020) also reported that XGBoost achieved better performance than DL in the systematic prediction of long-range chromatin interactions in the role of three-dimensional genome organization as a critical regulator of gene expression. In genomic prediction of complex traits, gradient boosting decision trees were reported to be a more robust method with better predictive performance than RF and DL (Abdollahi-Arpanahi, Gianola, & Peñagaricano, 2020), and hence, the authors noted that DL is not necessarily a panacea for genome-enabled prediction of complex traits.

After all, it is a myth that explainable ML models are not as accurate as more complex DL models. A significant trade-off between predictive accuracy and model explainability is believed to be refraining researchers and practitioners from attempting to develop explainable models (Rudin, 2019). To eliminate such notions in hydroclimatic applications, we demonstrate in this paper that explainable ML models can be just as accurate as, if not more accurate than, complex DL models, especially in applications involving structured datasets. Explainable ML models provide the domain experts or end-users with information about what the ML models can learn and how they make predictions, which could enhance their broader adoption in hydroclimatic applications. In the next section, we show that the explainable XGBoost model along with the SHAP and LIME eXML framework provides simple explanations that are human-understandable in addition to accurately representing the real-world hydroclimatic processes.

Fig. 8. Details of the ML models.


Table 3
Comparison of Predictive Accuracy of the ML Models.

Models  | Metrics | CNC           | BCRAGDO       | FSR           | SGD           | SSD
LR      | RMSE*   | 0.26          | 0.21          | 0.28          | 0.20          | 0.26
        | R2      | 0.98          | 0.99          | 0.98          | 0.99          | 0.98
LSTM    | RMSE*   | 1.23 (0.1)†   | 1.05 (0.07)†  | 1.11 (0.07)†  | 0.88 (0.02)†  | 0.87 (0.06)†
        | R2      | 0.59 (0.07)†  | 0.62 (0.05)†  | 0.66 (0.04)†  | 0.72 (0.01)†  | 0.73 (0.04)†
DL      | RMSE*   | 0.15 (0.006)† | 0.13 (0.012)† | 0.14 (0.006)† | 0.13 (0.01)†  | 0.19 (0.006)†
        | R2      | 0.99 (0)†     | 0.99 (0.001)† | 0.99 (0.001)† | 0.99 (0.001)† | 0.99 (0.001)†
RF      | RMSE*   | 0.22          | 0.18          | 0.23          | 0.17          | 0.20
        | R2      | 0.99          | 0.99          | 0.99          | 0.99          | 0.99
XGBoost | RMSE*   | 0.19          | 0.16          | 0.19          | 0.16          | 0.21
        | R2      | 0.99          | 0.99          | 0.99          | 0.99          | 0.99
SVM     | RMSE*   | 0.27          | 0.24          | 0.27          | 0.23          | 0.24
        | R2      | 0.98          | 0.98          | 0.98          | 0.98          | 0.98

† The numbers in parentheses indicate the variance when the experiments were repeated 10 times.
* Root Mean Squared Error (mm/day).

7. Model explainability and evaluation of explainability

Explainability here refers to fidelity (i.e. the prediction produced by an explanation should agree with the real-world hydro-climatological processes as much as possible) and interpretability (i.e. the explanation should be simple enough to understand). Such explanations can be used to generate testable hypotheses (Lipton, 2018) or to confirm that the model resonates with real-world physical systems. Recently, Moscato, Picariello, and Sperlí (2021) compared different eXML frameworks to explain the predictions from classification models and reported that SHAP (Shapley, 1953; Lundberg et al., 2020) obtains statistically more reliable outcomes, and LIME (Ribeiro, Singh, & Guestrin, 2016) achieves good coverage values since it models the predictions as a weighted sum, making it easy to understand how the predictions are generated.

In this paper, we employed an integrated SHAP and LIME eXML framework to understand the XGBoost models' predictions at the BCRAGDO site (with nearly complete data initially) and the SSD site (with the most missing records prior to data imputation). The SHAP "global interpretability" analysis shown in Fig. 9 unveils not only the relative order of importance of climatic variables (Rs > Ta > RH > u2 > P), but also reports the models' well-informed conscious predictions, while capturing the underlying hydroclimatic processes accurately. For example, the ML model pushes the ETo predictions higher (i.e., represented by higher Shapley values on the x-axis) when Rs, Ta, & u2 are high, represented by red dots, and RH & P are low, represented by blue dots. Such accurate representation of the underlying physical processes by the SHAP analysis indicates that tree-based ensemble models should not be treated as "black-boxes" by domain experts, as such models can extract and generalize meaningful physical interactions between the features (predictors) and the target (ETo in this study).

While the SHAP "global interpretability" analysis gives a general overview of the influence of each feature on the model predictions, the SHAP "local interpretability" analysis in Fig. 9 elaborates on how the model predictions vary with the feature values and interactions. Note that each dot is a data point, and the vertical spread at each feature value is due to the interaction and dependency among the features in the model.

Fig. 9. Global and local interpretation plots of the XGBoost ETo model at the BCRAGDO and SSD sites.


For example, Fig. 9 reveals that Rs interacts with Ta, and high Rs and high Ta values drive up the SHAP values, corresponding to higher ETo values, as expected in real-world systems.
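Summaries of the kind shown in Fig. 9 can be generated with the shap library's tree explainer. The sketch below is a minimal illustration; the synthetic predictor frame and target are purely stand-ins to make the snippet runnable, and in practice the fitted daily ETo XGBoost model and the station's predictor DataFrame would be used instead.

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBRegressor

# Stand-in daily predictors and target (illustrative only; not the EAA data)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((500, 5)) * [30, 40, 100, 10, 105],
                 columns=["Rs", "Ta", "RH", "u2", "P"])
y = 0.2 * X["Rs"] + 0.1 * X["Ta"] - 0.05 * X["RH"] + rng.normal(0, 0.1, 500)
model = XGBRegressor(n_estimators=200).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X)                                    # global importance / beeswarm (Fig. 9-style)
shap.dependence_plot("Rs", shap_values, X, interaction_index="Ta")   # local dependence and interaction with Ta
```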
To further investigate the critical climatic inflection points that drive the ETo rates, we performed a detailed "local interpretability" analysis with LIME, shown in Fig. 10. These plots explain the predictions of 32 instances from the XGBoost ETo models for the BCRAGDO and SSD sites to enhance their explainability. They reveal that the pertinent daily Ta, Rs, and RH inflection points are 20.62 °C, 17.16 kW/m2, and 72.17 at BCRAGDO, and 20.42 °C, 17.8 kW/m2, and 73.82 at SSD, respectively. The u2 and P can be ignored since they have a very small impact on ETo, according to the "global interpretability" analysis in Fig. 9. Based on these interpretations, we can hypothesize:

• H1: if Rs ⩽ 17.16, Ta ⩽ 20.62, and RH > 72.17 at BCRAGDO, the rate of ETo will be low;
• H2: if Rs ⩽ 17.8, Ta ⩽ 20.42, and RH > 73.82 at SSD, the rate of ETo will be low,

in which low ETo rates are classified when the ETo values are less than the historical median ETo values at the respective sites. Using the recorded climatic and ETo data described in Section 4, we computed that the conditional probabilities, P(Low ETo | H1) and P(Low ETo | H2), are ~99.3% and 99.8%, respectively. The conditional probabilities are given by: (i) P(Low ETo | H1) = 100 × P(Low ETo and H1)/P(H1), and (ii) P(Low ETo | H2) = 100 × P(Low ETo and H2)/P(H2). Such high probabilities (> 99%) indicate high fidelity of the explanations, i.e. the hypotheses, derived from the eXML framework. Moreover, from a practical hydroclimatic standpoint, it is very reasonable to assert that lower Ta and Rs but higher RH would result in lower ETo, and vice versa, which is consistent with the SHAP and LIME explanations outlined in this section. This highlights that the combined SHAP and LIME global and local explanations provide accurate representations of the real-world hydroclimatic processes and enhance the explainability of the nonlinear ML models by providing "rule-based" explanations that are simple enough for human understanding.
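The fidelity check described above reduces to a few boolean masks over the daily records. A minimal pandas sketch for H1 at BCRAGDO is given below; the file name and column layout are assumptions for illustration.

```python
import pandas as pd

# Daily records at BCRAGDO with columns ["Rs", "Ta", "RH", "ETo"] (hypothetical file name and layout)
daily = pd.read_csv("BCRAGDO_daily.csv")

low_eto = daily["ETo"] < daily["ETo"].median()       # "low" rate = below the historical median
h1 = (daily["Rs"] <= 17.16) & (daily["Ta"] <= 20.62) & (daily["RH"] > 72.17)

# 100 * P(Low ETo and H1) / P(H1), exactly as in item (i) above
p_low_given_h1 = 100.0 * (low_eto & h1).mean() / h1.mean()
print(f"P(Low ETo | H1) ~ {p_low_given_h1:.1f}%")
```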

Fig. 10. Local interpretation plots of the XGBoost ETo model at the BCRAGDO and SSD sites. ETo,m is the median historical value.
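Local explanations of the kind plotted in Fig. 10 can be produced with the lime package. The sketch below is a minimal illustration for a single daily instance, reusing the stand-in model and predictor frame from the SHAP sketch above; it is not the configuration used to generate Fig. 10.

```python
from lime.lime_tabular import LimeTabularExplainer

# model and X as in the SHAP sketch above (a fitted XGBRegressor and a DataFrame of daily predictors)
explainer = LimeTabularExplainer(X.values, feature_names=list(X.columns),
                                 mode="regression", discretize_continuous=True)
exp = explainer.explain_instance(X.values[0], model.predict, num_features=5)
print(exp.as_list())   # discretized feature rules (e.g. "Ta <= 20.62") with their local weights
```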


8. Conclusion

The overarching goal of this paper was to evaluate whether or not more interpretable machine learning (ML) models – e.g. extreme gradient boosting (XGBoost) – can perform as well as deep learning (DL) neural networks on structured tabular hydroclimatic datasets. We acquired the local climate data at 15-min intervals over the past five years from five weather stations across the Edwards aquifer region (EAR), and developed a new XGBoost-based sequential transfer-learning technique to learn the relationships among climatic variables from nearby weather stations with more complete data, and then transfer the new knowledge to impute long arrays of missing data (e.g., ~2 months & ~11 months of 15-min interval data at two of the five weather stations). The models demonstrated good performance with R2 > 0.82 for the sequential knowledge transfer on the testing data at both weather stations with missing data.

Using the imputed climate data, the top-performing interpretable model (XGBoost) exhibited comparable performance to the top-performing noninterpretable model (DL) in predicting daily ETo from the structured tabular dataset. Even the second best interpretable model (RF) showed comparable predictive accuracy to DL. The analysis further revealed that the noninterpretable LSTM model was outperformed by the relatively simple & interpretable LR model, which displayed comparable predictive accuracy to SVM.

We applied a SHAP- and LIME-based explainable machine learning (eXML) framework to enhance the explainability of the XGBoost models. The global interpretability component of the SHAP eXML framework revealed that the relative importance of the climate variables on ETo predictions across the study area is in the order of Rs > Ta > RH > u2 > P. The framework accurately captured the underlying physics that drives the daily ETo, such that the model predicted higher ETo when Rs, Ta, & u2 were high and RH & P were low, and vice versa. The local interpretability component of the SHAP eXML framework demonstrated how the model predictions vary with the feature values and interactions using SHAP dependence and interaction plots coupled with the ML models. The local interpretability of the LIME eXML framework quantified the critical inflection points in each predictor that led to the transition from low ETo to high ETo rates, where low and high ETo rates were defined in reference to the median historical ETo. The hydroclimatic feasibility of the results and the trustworthiness of these eXML frameworks were corroborated through statistical inferences. Our analysis confirmed that the explainability of the XGBoost models was enhanced when the models were coupled with the SHAP & LIME eXML framework.

In conclusion, the XGBoost models integrated with the eXML framework displayed outstanding performance in (i) imputing long stretches of continuous missing data, (ii) identifying and quantifying the order of importance of predictors affecting the target (ETo) while holding physical interpretability of the input–output dynamics, (iii) determining the inflection points in the climatic predictors at which the transition from low ETo to high ETo rates occurs, and (iv) predicting ETo at different weather stations. More importantly, XGBoost performed as well as DL in predicting the watershed-scale ETo, suggesting that interpretable ML models can produce model predictions as good as noninterpretable ML models on structured tabular hydroclimatic datasets.

Data and code availability

To aid in the reproducibility of this research, the authors have shared the raw data and codes on Code Ocean (https://codeocean.com/capsule/4195021/tree).

CRediT authorship contribution statement

Debaditya Chakraborty: Conceptualization, Data Curation, Formal Analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Hakan Başağaoğlu: Conceptualization, Data Curation, Formal Analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. James Winterle: Data Curation, Supervision, Validation, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to thank Newfel Mazari and Marcus Gary of EAA for their help with the acquisition of climate data from EAA weather stations, and Ned Throshanov and Sarah Eason of EAA for their help with the preparation of the location map.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.eswa.2020.114498.

References

Abdollahi-Arpanahi, R., Gianola, D., & Peñagaricano, F. (2020). Deep learning versus parametric and ensemble methods for genomic prediction of complex phenotypes. Genetics Selection Evolution, 52, 12.
Adadi, A., & Berrada, M. (2018). Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access, 6, 52138–52160.
Allen, R. G., Pereira, L. S., Raes, D., Smith, M., et al. (1998). Crop evapotranspiration - guidelines for computing crop water requirements - FAO irrigation and drainage paper 56. FAO, Rome, 300, D05109.
Antonopoulos, V. Z., & Antonopoulos, A. V. (2017). Daily reference evapotranspiration estimates by artificial neural networks technique and empirical equations using limited input climate variables. Computers and Electronics in Agriculture, 132, 86–96.
Behzad, M., Asghari, K., Eazi, M., & Palhang, M. (2009). Generalization performance of support vector machines and neural networks in runoff modeling. Expert Systems with Applications, 36, 7624–7629.
Bengio, Y., et al. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2, 1–127.
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the fifth annual workshop on computational learning theory (pp. 144–152).
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Buuren, S. V., & Groothuis-Oudshoorn, K. (2010). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45, 1–68.
Chen, T., & He, T. (2015). Higgs boson discovery with boosted trees. In NIPS 2014 workshop on high-energy physics and machine learning (pp. 69–80).
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794).
Chia, M. Y., Huang, Y. F., Koo, C. H., & Fung, K. F. (2020). Recent advances in evapotranspiration estimation using artificial intelligence approaches with a focus on hybridization techniques - A review. Agronomy, 10, 101.
Cramer, S., Kampouridis, M., Freitas, A. A., & Alexandridis, A. K. (2017). An extensive evaluation of seven machine learning methods for rainfall prediction in weather derivatives. Expert Systems with Applications, 85, 169–181.
Devitt, T., Wright, A., Cannatella, D., & Hillis, D. (2019). Species delimitation in endangered groundwater salamanders: Implications for aquifer management and biodiversity conservation. PNAS, 116, 2624–2633.
Dewes, C. F., Rangwala, I., Barsugli, J. J., Hobbins, M. T., & Kumar, S. (2017). Drought risk assessment under climate change is sensitive to methodological choices for the estimation of evaporative demand. PLoS One, 3, Article e0174045.
Doran, D., Schulz, S., & Besold, T. R. (2017). What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint arXiv:1710.00794.
Feng, Y., Cui, N., Gong, D., Zhang, Q., & Zhao, L. (2017). Evaluation of random forests and generalized regression neural networks for daily reference evapotranspiration modelling. Agricultural Water Management, 193, 163–173.
Fernández-Delgado, M., Sirsat, M., Cernadas, E., Alawadi, S., Barro, S., & Febrero-Bande, M. (2019). An extensive experimental survey of regression methods. Neural Networks, 111, 11–34.
Fernández-Delgado, M., Sirsat, M., Cernadas, E., Alawadi, S., Barro, S., & Febrero-Bande, M. (2019). Regression tree ensembles for wind energy and solar radiation prediction. Neural Networks, 111, 11–34.
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270, 654–669.
Gocic, M., Petković, D., Shamshirband, S., & Kamsin, A. (2016). Comparative analysis of reference evapotranspiration equations modelling by extreme learning machine. Computers and Electronics in Agriculture, 127, 56–63.
Goyal, M. K., Bharti, B., Quilty, J., Adamowski, J., & Pandey, A. (2014). Modeling of daily pan evaporation in sub tropical climates using ANN, LS-SVR, Fuzzy Logic, and ANFIS. Expert Systems with Applications, 41, 5267–5276.
Granata, F., Gargano, R., & de Marinis, G. (2020). Artificial intelligence based approaches to evaluate actual evapotranspiration in wetlands. Science of the Total Environment, 703, Article 135653.
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Pedreschi, D., & Giannotti, F. (2018). A survey of methods for explaining black box models. ACM Computing Surveys, 51, Article 93.
Guo, J., Zhou, J., Qin, H., Zou, Q., & Li, Q. (2011). Monthly streamflow forecasting based on improved support vector machine model. Expert Systems with Applications, 38, 13073–13081.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9, 1735–1780.
Hu, C.-A., Chen, C.-M., Fang, Y.-C., Liang, S.-J., Wang, H.-C., Fang, W.-F., Sheu, C.-C., Perng, W.-C., Yang, K.-Y., Kao, K.-C., Wu, C.-L., Tsai, C.-S., Lin, M.-Y., & Chao, W.-C. (2020). Using a machine learning approach to predict mortality in critically ill influenza patients: a cross-sectional retrospective multicentre study in Taiwan. BMJ Open, 10, Article e033898.
Hung, C., Chen, W., Lai, P., Lin, C., & Lee, C. (2017). Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database. In 2017 39th annual international conference of the IEEE engineering in medicine and biology society (EMBC), Seogwipo (pp. 3110–3113).
Jung, M., Reichstein, M., Ciais, P., Seneviratne, S. I., Sheffield, J., Goulden, M. L., Bonan, G., Cescatti, A., Chen, J., De Jeu, R., et al. (2010). Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951–954.
Kingston, D., Todd, M., Taylor, R., Thompson, J., & Arnell, N. (2009). Uncertainty in the estimation of potential evapotranspiration under climate change. Geophysical Research Letters, 6, L20403.
Lemordant, L., Gentine, P., Swann, A. S., Cook, B. I., & Scheff, J. (2018). Critical impact of vegetation physiology on the continental hydrologic cycle in response to increasing CO2. PNAS, 115, 4093–4098.
Lipton, Z. C. (2018). The mythos of model interpretability. Queue, 16, 31–57.
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S.-I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2, 2522–5839.
McMahon, T., Peel, M., Lowe, L., Srikanthan, R., & McVicar, T. R. (2013). Estimating actual, potential, reference crop and pan evaporation using standard meteorological data: a pragmatic synthesis. Hydrology and Earth System Sciences, 17, 1331–1363.
Mehdizadeh, S., Behmanesh, J., & Khalili, K. (2017). Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Computers and Electronics in Agriculture, 139, 103–114.
Mishra, V., Kumar, R., Shah, H. L., Samaniego, L., Eisner, S., & Yang, T. (2017). Multimodel assessment of sensitivity and uncertainty of evapotranspiration and a proxy for available water resources under climate change. Climate Change, 141, 451–465.
Moscato, V., Picariello, A., & Sperlí, G. (2021). A benchmark of machine learning approaches for credit score prediction. Expert Systems with Applications, 165, Article 113986.
Nourani, V., Baghanam, A. H., Adamowski, J., & Kisi, O. (2014). Applications of hybrid wavelet–artificial intelligence models in hydrology: A review. Journal of Hydrology, 514, 358–377.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Qiu, H., Luo, L., Su, Z., Zhou, L., Wang, L., & Chen, Y. (2020). Machine learning approaches to predict peak demand days of cardiovascular admissions considering environmental exposure. BMC Medical Informatics and Decision Making, 20, 83.
Raghavendra, S. N., & Deka, P. C. (2014). Support vector machine applications in the field of hydrology: a review. Applied Soft Computing, 19, 372–386.
Ratolojanahary, R., Ngouna, R. H., Medjaher, K., Junca-Bourié, J., Dauriac, F., & Sebilo, M. (2019). Model selection to improve multiple imputation for handling high rate missingness in a water quality dataset. Expert Systems with Applications, 131, 299–307.
Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., et al. (2019). Deep learning and process understanding for data-driven earth system science. Nature, 566, 195–204.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). Why should I trust you? Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135–1144).
Rigden, A. J., & Salvucci, G. D. (2017). Stomatal response to humidity and CO2 implicated in recent decline in US evaporation. Global Change Biology, 23, 1140–1151.
Rind, D., Goldberg, R., Hansen, J., Rosenzweig, C., & Ruedy, R. (1990). Potential evapotranspiration and the likelihood of future drought. Journal of Geophysical Research, 95, 9983–10004.
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1, 206–215.
Saggi, M. K., & Jain, S. (2019). Reference evapotranspiration estimation and modeling of the Punjab northern India using deep learning. Computers and Electronics in Agriculture, 156, 387–398.
Sánchez-Monedero, J., Salcedo-Sanz, S., Gutiérrez, P. A., Casanova-Mateo, C., & Hervás-Martínez, C. (2014). Simultaneous modelling of rainfall occurrence and amount using a hierarchical nominal–ordinal support vector classifier. Engineering Applications of Artificial Intelligence, 34, 199–207.
Scheff, J., & Frierson, D. (2015). Terrestrial aridity and its response to greenhouse warming across CMIP5 climate models. Journal of Climate, 28, 5583–5600.
Shapley, L. (1953). A value for n-person games. Contributions to the Theory of Games (pp. 307–317).
Sun, S., Song, Z., Chen, X., Wang, T., Zhang, Y., Zhang, D., Zhang, H., Hao, Q., & Chen, B. (2020). Multimodel-based analyses of evapotranspiration and its controls in China over the last three decades. Ecohydrology, 13, Article e2195.
Taormina, R., Chau, K.-W., & Sethi, R. (2012). Artificial neural network simulation of hourly groundwater levels in a coastal aquifer system of the Venice lagoon. Engineering Applications of Artificial Intelligence, 25, 1670–1676.
Vanhaeren, T., Divina, F., García-Torres, M., Gómez-Vela, F., Vanhoof, W., & Martínez-García, P. M. (2020). A comparative study of supervised machine learning algorithms for the prediction of long-range chromatin interactions. Genes, 11, 985.
Wu, L., Zhou, H., Ma, X., Fan, J., & Zhang, F. (2019). Daily reference evapotranspiration prediction based on hybridized extreme learning machine model with bio-inspired optimization algorithms: Application in contrasting climates of China. Journal of Hydrology, 577, Article 123960.
Xu, S., Yu, Z., Yang, C., Ji, X., & Zhang, K. (2018). Trends in evapotranspiration and their responses to climate change and vegetation greening over the upper reaches of the Yellow River Basin. Agricultural and Forest Meteorology, 263, 118–129.
Zamani Joharestani, M., Cao, C., Ni, X., Bashir, B., & Talebiesfandarani, S. (2019). PM2.5 prediction based on random forest, xgboost, and deep learning using multisource remote sensing data. Atmosphere, 10, 373.
