Environmental Modelling & Software: F.T. Andrews, B.F.W. Croke, A.J. Jakeman

Environmental Modelling & Software 26 (2011) 1171–1185
An open software environment for hydrological model assessment and development
F.T. Andrews a,c,*, B.F.W. Croke a,b,c, A.J. Jakeman a,b
a Integrated Catchment Assessment and Management (iCAM) Centre, Fenner School of Environment and Society, The Australian National University, Building 48a, Linnaeus Way, ACT 0200, Australia
b National Centre for Groundwater Research and Training, School of the Environment, Flinders University, GPO Box 2100, Adelaide SA 5001, Australia
c Department of Mathematics, The Australian National University, Building 27, ACT 0200, Australia
Article info

Article history: Received 3 September 2010; received in revised form 16 March 2011; accepted 12 April 2011; available online 20 May 2011.

Keywords: Model evaluation; Hydrological models; Modelling frameworks; Unit hydrograph; Event separation; R

Abstract

The hydromad (Hydrological Model Assessment and Development) package provides a set of functions which work together to construct, manipulate, analyse and compare hydrological models. The class of hydrological models considered are dynamic, spatially-aggregated conceptual or statistical models. The package functions are designed to fit seamlessly into the R system, and build on its powerful data manipulation and analysis capabilities. The framework used in the package encourages a separation of model components based on Unit Hydrograph theory; many published models are consistent with this and implementations of several are included. For comparative assessment, model performance can be analysed over time and with respect to covariates to reveal systematic biases. Support has been built in for event-based analysis of data and assessment of model performance. Fit statistics can be defined by choices of (1) temporal scale and aggregation function; (2) weighting and transformation; and (3) reference model. One can define new Soil Moisture Accounting models, routing models, calibration methods, objective functions, and evaluation statistics, while retaining as much of the default framework as is useful. And as the package code is available under a free software licence, one always has the freedom to adapt it as required. Use of the software is demonstrated in a case study of the Queanbeyan River catchment in South-East Australia.

Crown Copyright © 2011 Published by Elsevier Ltd. All rights reserved.
* Corresponding author. Integrated Catchment Assessment and Management (iCAM) Centre, Fenner School of Environment and Society, The Australian National University, Building 48a, Linnaeus Way, ACT 0200, Australia. E-mail address: [email protected] (F.T. Andrews).
1364-8152/$ – see front matter. doi:10.1016/j.envsoft.2011.04.006

Software availability

Name of software: hydromad
Developers: Felix Andrews
Software required: R
Licence: GNU General Public Licence version 2
Availability: http://hydromad.catchment.org/ and http://github.com/floybix/hydromad

1. Introduction and motivation

Catchment hydrology is the study of the water cycle at the scale of a drainage basin, focussing for its practical importance on flow at the catchment outlet. Central to this is the use of models to represent processes at the catchment scale and evaluate the implications of hypotheses of different model structures.

In different contexts, the level of detail may vary from simple statistical or conceptual models to complex spatially distributed or physics-based models (Wheater et al., 1993). In practice all catchment hydrology models need to be calibrated to measured data; model parameters do not have a precise physical analogue when applied at large scales (Wagener et al., 2009). The simplest models of catchment hydrology dynamics are spatially lumped, and conceptual or empirical in their approach. Such models can be used to address questions in terms of aggregate effects, without considering the detail of the processes involved. Often a single dominant mode or process will be identified (Young, 2003). We will focus on this class of models.

Models, although they are usually given names, should not be set in stone. They encode sets of assumptions, which may be more or less valid at different times, places, and scales; and, importantly, for different purposes. Accordingly, models should be tested and evaluated in the unique context of each application (Fenicia et al., 2008).

Given the need for parsimonious models to address a range of management and research problems, many have advocated flexible,
iterative model development processes (Fenicia et al., 2008; Wagener et al., 2001). The top-down modelling (Sivapalan et al., 2003; Littlewood et al., 2003) and Data-Based Mechanistic modelling (Young and Ratto, 2008) agendas are particularly prominent, and in fact these are quite similar to each other (Young, 2003). Such processes involve intensive data analysis to drive model development, with detailed comparison and evaluation of model performance. Effective software support for such tasks is crucial. There is a consequent need for flexible software environments for hydrological modelling, tightly linked with data analysis and model analysis methods.

A major challenge for modelling frameworks is to be flexible enough to support creative problem solving in hydrology; as Savenije (2009) argues, there is an art to hydrology that is often not recognised. However, flexibility ultimately needs to be constrained by rigour. Many authors argue for standardised tests and comparisons of models (e.g. Jakeman et al., 2006; Beven, 2008; Dawson et al., 2007, 2010).

A number of software frameworks for hydrological modelling have been developed and are in active use, such as OMS (Leavesley et al., 2006), PREVAH (Viviroli et al., 2009), FUSE (Clark et al., 2008) and RRMT (Wagener et al., 2001). Most such frameworks are designed for complex problem types, and necessarily restrict the analysis options during model development. An exception is RRMT, the Rainfall-Runoff Modelling Toolbox, which is used within the MATLAB environment, and is therefore able to leverage the powerful data manipulation functions it provides. Indeed, the software described in this paper has been influenced by the design of RRMT. However, the MATLAB and RRMT products are proprietary, which obscures potentially important methodological details, and withholds from users the freedom to adapt the code and share their innovations.

This paper will introduce a software package for top-down modelling of catchment hydrology, hydromad. It is based loosely on the unit hydrograph theory of rainfall-runoff modelling, as described in Section 2. hydromad is an open source software package for the R system, which is introduced in Section 3. As such it can be used cohesively with workflows based on this increasingly popular software. Section 4 covers the scope of the hydromad package and the functions it provides. Two areas of focus for the package, and this paper, are discrete event separation and the design of fit statistics, discussed in Sections 5 and 6 respectively. Section 7 demonstrates how event-based data analysis can be useful in a modelling context. In Section 8 we demonstrate simple conceptual modelling; integral to this is a detailed assessment of model performance, with a view to further model development by discovering systematic biases, in Section 9.

The example we will look at is the Queanbeyan River at Tinderry streamflow gauge, near Canberra in South-Eastern Australia. It has a catchment area of 490 square kilometres and is part of the Upper Murrumbidgee catchment. It has seen much reduced river flow levels in recent years (2000–2009). This catchment is unusual in that it displays marked non-stationary response characteristics, and extended drying periods with intermittent baseflow, making it difficult to model (Kim et al., 2007). Non-stationarity can be a major problem when applying simple conceptual models; for the purposes of this paper it allows us to demonstrate methods for assessment of variable model performance.

Daily streamflow volume records are available from 1966-08-04, and the data used here extend to 2005-12-31. Corresponding estimates of areal rainfall were derived by spatial interpolation from several rain gauges operated by the Australian Bureau of Meteorology and EcoWise. Daily maximum temperature records from Canberra Airport were also used.

2. The hydrological model framework

The class of hydrological models considered here are dynamic, spatially-aggregated conceptual or statistical models. They estimate streamflow at a catchment outlet, given inputs of areal rainfall and potential evaporation (or, more commonly, temperature data as an indicator of this), and possibly other inputs. These inputs and outputs are time series, typically at a daily time step, and extending for many months or years.

As spatially lumped models, they do not explicitly represent spatial variation over the catchment area. In particular, the standard formulations do not attempt to model effects of changes in land cover. These models are usually calibrated to a period of observed streamflow, and the parameters defining the modelled relationship between rainfall, evaporation and flow are assumed to be stationary in this period.

The model framework used in the hydromad package is very general, but encourages a separation of model components based on Unit Hydrograph theory. This implies a two-component structure of a soil moisture accounting (SMA) module and a routing or unit hydrograph module (Fig. 1). The SMA module converts rainfall and temperature into effective rainfall: the amount of rainfall which eventually reaches the catchment outlet as streamflow (i.e. that which is not lost as evapotranspiration etc.). The routing module converts effective rainfall into streamflow, which usually amounts to convolving it with a constant recession curve, the unit hydrograph. This structure is consistent with RRMT (Wagener et al., 2001).

[Figure 1: flow diagram. Rainfall, temperature / PET and other inputs enter the Soil Moisture Accounting (SMA) model, which outputs effective rainfall; the (unit hydrograph) routing model converts effective rainfall into streamflow.]
Fig. 1. The hydromad model framework, based on unit hydrograph theory.

In fact, it is not strictly required to decompose a model this way: a full model could be defined for the SMA component, with the routing component omitted. It is worth noting, also, that the two components are not necessarily simple models but may be composite models, and the whole model may also be arranged into a composite structure.

A notable feature of the two-component model structure is that it permits identification of the routing model in a way that is somewhat de-coupled from identification of the soil moisture accounting model. There are a number of different strategies that can be used to calibrate a full hydrological model. The typical approach is a joint optimisation of all parameters. Alternatively the unit hydrograph could be estimated directly from streamflow data (once only, after which it can be fixed), using inverse filtering (Andrews et al., 2010), or average event unit hydrograph estimation (Croke, 2006). Or a simple data-based method could be used in the SMA component to estimate effective rainfall, and this used for a preliminary calibration of the routing model.

A large number of published models are consistent with the unit hydrograph framework. Several of these have already been implemented in the hydromad package and are listed in Section 4.

3. R: the synergies of community

R is a language and environment for statistical computing and graphics (R Development Core Team, 2010; Ihaka and Gentleman, 1996). It is based on the high-level S language (Becker et al., 1988), designed for working with data and models.

As it aims for maximum power and flexibility, the primary mode of use is typing commands interactively, or writing scripts, rather than pointing and clicking. The appeal of the system lies in the ability to fluidly transform data, by passing it through chains of functions. Analysis steps can quickly be automated and extended by defining high-level functions; and by using features like argument
default values and catch-all arguments, such functions can be made easy to use while retaining maximum flexibility.

As free (open source) software, R has become a melting pot for computational statistics, as evidenced by the number of contributed packages available, which has grown exponentially since records began in 2001 (Fox, 2008). This opens up powerful synergies within and between research communities.¹ As Bates (2008) puts it, “I do not think of R as a statistical package or a product. To me, R is a community.”

¹ The use of open source code also promotes academic integrity through “reproducible research” (Gentleman and Temple Lang, 2007).

The R community has built a reputation for excellence in statistical graphics (Sarkar, 2008; Wickham, 2009), and has developed world-class implementations of, for example, generalised additive models (Wood, 2004), time-varying linear models (Petris et al., 2009), forecasting methods (Hyndman, 2010), optimisation algorithms (Ardia and Mullen, 2010), and data mining algorithms (Williams, 2009). The R system thus provides a rich software ecosystem in which a hydrological modelling framework can grow and evolve.

4. The hydromad package

4.1. Core functions

The hydromad package provides a set of functions which work together to construct, manipulate, analyse and compare hydrological models. It is intended for:

• defining spatially lumped hydrological models and fitting them to data;
• simulating outputs of these models, including any state variables;
• evaluating and comparing these models: summarising performance by different measures and over time, using graphical displays and statistics;
• straightforward integration with other types of data analysis and model analysis in R, including larger composite modelling studies in which rainfall-runoff is just one part.

Development of hydromad was motivated by the need to understand and improve the results from several simple, conceptual models. As such, the design was driven by the practical need to work with those models in a research context. The R environment provides the flexibility necessary for effective research. The package itself was designed to be fairly general, such that one can define new Soil Moisture Accounting models, new routing models, new calibration methods, new objective functions, and new evaluation statistics, while retaining as much of the default framework as is useful. And as the package code is available under an open source licence, one always has the freedom to adapt it as required.

Particularly strong support has been built in for event-based analysis of data and model performance. This involves isolating relatively discrete events from time series, and analysing statistical properties of the events, rather than the traditional approach of using every time step at which data were recorded.

The package functions are designed to fit seamlessly into the R system. Consequently, many of the functions in hydromad are implementations of standard generic functions. A constructor function creates a hydromad object, and this can be passed on to other functions for calibration, analysis, reporting, etc. The object encapsulates a model (composed of an SMA model and/or a routing model), with specified parameter values or parameter ranges, along with the data and model outputs.

Table 1
Core features and functions provided by the hydromad package. See text for more.

hydromad()   specifies a model, with fixed and/or free parameters
update()     modifies the structure, parameters or data of an existing model
fitBy*()     calibrates a model to data (several methods)
predict()    simulates streamflow (etc.) from a fitted model
simulate()   generates a set of models by sampling over parameter ranges
summary()    calculates fit statistics and other information
runlist()    constructs a named list of models for comparative analysis

The core features of the package are listed in Table 1 according to the names of the corresponding functions. Of course, each of these functions has several arguments. The details are given in help pages, which can be accessed within R, or online at http://hydromad.catchment.org/. A tutorial document is also available.

Other standard R methods are provided for convenience to work with hydromad objects: for example one can extract parameter values with coef(), and model results with fitted(), observed(), or residuals(). Several plotting functions are also available. Most methods will work either on a single model object or on a runlist (a named list of models).

4.2. The models

The package currently includes implementations of several published SMA models:

• the IHACRES Catchment Wetness Index (CWI) model (Jakeman and Hornberger, 1993) including the generalisation of Ye et al. (1997), modified according to Croke et al. (2005). A temperature-dependent drying rate is used to estimate a wetness index, which defines the runoff ratio. This model includes a scale factor which is estimated by mass balance with observed streamflow (or based on the gain of the transfer function used for routing, if applicable).
• the IHACRES Catchment Moisture Deficit (CMD) model (Croke and Jakeman, 2004). Accounts for evapotranspiration and changes in catchment storage. The version used here includes a power law form, and is described in Section 8.1.
• the Sacramento Soil Moisture Accounting model (Burnash, 1995) developed by the US National Weather Service. With 13 parameters it is more complex than the other models listed here. Many published studies have used this model, often with good results. This implementation uses code from the University of Arizona.
• the GR4J model (Perrin et al., 2003), modèle du Génie Rural à 4 paramètres Journalier. In fact this is split up into a 1-parameter SMA model and a 3-parameter routing model. The non-linear routing component is based on a groundwater reservoir.
• the AWBM, Australian Water Balance Model (Boughton, 2004). In its simplest form this consists of a 1-parameter SMA model, although the full form has 6 parameters. It is traditionally used with a two store (3 parameter) routing component.
• the single-bucket models of Bai et al. (2009), including interception, saturation excess runoff and subsurface flow.
• a degree-day factor snowmelt model of Kokkonen et al. (2006). A fraction of rainfall becomes snow, based on temperature thresholds, and a snow reservoir is estimated. Discharge from the snow model is, currently, fed into the CMD model listed above.

Models are implemented as R functions which accept time series input data and parameter values, and return corresponding simulated outputs. It is worth noting that models can be external to R if they are called from within a simple wrapper function.
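The convention of a model as a function from input series and parameter values to simulated outputs can be sketched outside R. The example below is written in Python and is not hydromad code. It assumes the simplest components named in this paper: a constant runoff ratio for the SMA step and a single exponential store for routing. The function names `sma_scalar`, `route_exponential` and `simulate_flow`, the parameter names `ratio` and `tau`, and the rainfall values are all hypothetical.

```python
import math


def sma_scalar(rain, ratio):
    """SMA step: effective rainfall is a constant fraction of rainfall."""
    return [ratio * p for p in rain]


def route_exponential(effective, tau):
    """Routing step: pass effective rainfall through one exponential store.

    This is equivalent to convolving the input with a unit hydrograph
    whose ordinates decay geometrically and sum to 1, so no volume is
    created or lost by routing.
    """
    alpha = math.exp(-1.0 / tau)  # fraction of flow carried to the next step
    flow, previous = [], 0.0
    for u in effective:
        previous = alpha * previous + (1.0 - alpha) * u
        flow.append(previous)
    return flow


def simulate_flow(rain, ratio=0.3, tau=2.0):
    """Full model: time series and parameters in, simulated flow out."""
    return route_exponential(sma_scalar(rain, ratio), tau)


# Hypothetical daily rainfall (mm); not the Queanbeyan data.
rain = [0.0, 10.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
flow = simulate_flow(rain, ratio=0.3, tau=2.0)
```

Because the whole model is one function of data and parameters, any calibration routine only needs to call it repeatedly with different parameter values and score the result.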
A set of simple benchmark models is also available; these are useful for some kinds of calibration methods, and for null models in a comparative analysis. Note that the dbm and runoffratio models make use of observed streamflow and so cannot be used for general simulation. The benchmark models are:

• scalar: a constant runoff ratio (i.e. effective rainfall is a constant fraction of rainfall). The fraction is estimated for mass balance with streamflow, or based on the gain of the transfer function used for routing, if applicable.
• intensity: runoff ratio estimated by raising rainfall to a power, up to a threshold rainfall rate with maximum runoff. With a power of 0 this reduces to the scalar model.
• runoffratio: a runoff ratio, estimated by a moving average through the data, is used to scale rainfall.
• dbm: observed streamflow raised to a power defines an index of antecedent wetness. This index, possibly lagged, is used to scale the rainfall. As a typical structure used in the early stages of Data-Based Mechanistic modelling (Young, 2003), it is termed dbm in hydromad.

Routing models currently include:

• armax: ARMAX-type (auto-regressive, moving average, with exogenous inputs), also known as linear transfer functions (Jakeman et al., 1990; Ljung, 1999);
• expuh: exponential component configurations (up to 3 in parallel and/or series). The time constants of each are specified, as well as a choice of configuration. A loss term can be included to represent simple groundwater exchange, similar to the form of Herron and Croke (2009).
• powuh: a power law form of the unit hydrograph, parameterised according to Croke (2006).
• varuh: a variable partitioning extension of a 2-store model: this is an example of a routing model which is not a constant unit hydrograph, but rather depends on the level of rainfall.

Note that armax models up to third order can be converted into expuh form within hydromad. For these models, efficient estimation methods are available: notably the Simple Refined Instrumental Variable (SRIV) algorithm (Young, 2008), and an inverse filtering algorithm which estimates the parameters directly from streamflow data. In the case of expuh, if the solution does not make sense physically (having negative or imaginary recession rates) then these are re-fitted with constraints.

4.3. Optimisation algorithms

Several optimisation functions are available in the hydromad package for calibrating models to observed data. The different algorithms may each be preferred for different types of problems, and most have settings to tune their performance. There is generally a trade-off between rapid convergence to a moderately good result, versus a time-consuming search for the best possible solution. The choice in this regard will depend on the task at hand.

The performance of optimisation algorithms, defined further on, can be visualised with an optimisation trace plot, showing the improvement in the solution found with increasing effort. Effort in this sense means the number of times the model simulation function is run, which is generally proportional to the total running time. One example is given in Fig. 2, which shows results from calibrating the IHACRES CMD model with a three store unit hydrograph, involving 3 free parameters in the SMA component and a further 5 free parameters in the routing component. It was calibrated to a three-year period of streamflow in the Queanbeyan River starting in 1983. The objective function used was the R² using untransformed data (Nash and Sutcliffe, 1970). Note that these results are only for illustration of the general point, and that the algorithms were run only once with the somewhat arbitrary default settings. While the differences shown may not be significant in hydrological terms, they are significant within the specific problem of optimising an objective function. In this case the PORT algorithm found a good result relatively quickly. It is often the case that if a fit is needed quickly, such as in a large simulation study or an interactive, exploratory analysis, then algorithms such as PORT or Nelder-Mead are most appropriate.

The current set of general optimisation functions is:

• fitBySampling: allows for Random, Latin Hypercube or regular gridded sampling. The model with best objective function from the sample is selected. Note that these sampling methods can be used for more general analysis with the simulate() function; an example is given in Section 8.
• fitByOptim: uses R's built-in optim and nlminb functions to optimise an objective function. The initial parameter values are chosen from a preliminary sampling run, or alternatively, a number of samples can be used as different starting points (multi-start mode). Some of the available methods are:
  – “PORT”, using functions from the Bell Labs PORT library²;
  – “Nelder-Mead”, a simplex method;
  – “BFGS”, a quasi-Newton method;
  – “SANN”, Simulated Annealing, designed to find a reasonable global solution even in ill-conditioned, high-dimensional solution spaces.
• fitBySCE: the Shuffled Complex Evolution algorithm developed at the University of Arizona (Duan et al., 1992);
• fitByDE: Differential Evolution (Price et al., 2005), provided by the DEoptim package; and
• fitByDream: DiffeRential Evolution Adaptive Metropolis (Vrugt et al., 2009). This is a Markov Chain Monte Carlo method, giving probabilistic results. However, it can also be used simply as an optimisation algorithm.

² http://www.bell-labs.com/project/PORT/.

In addition, some specialised calibration functions are available for specific models. It is straightforward, too, to make use of any other general optimisation function to calibrate hydromad model objects.

The important issues of event-based analysis and design of objective functions are discussed in the following two sections.

5. Discrete event separation

We are often interested in hydrological response properties, and modelling these, at the event scale rather than at the level of the raw data. Furthermore, model residuals are typically highly auto-correlated, which is problematic when attempting to assess model performance. An attractive approach is to separate the streamflow record into relatively isolated events, and work with attributes of events rather than time steps. Events are most often used in the literature for extreme value analysis (Katz et al., 2002) or for ephemeral flow systems in arid environments (McIntyre and Al-Qurashi, 2009), but can also be applied more generally. For instance Willems (2009) uses event windows to extract local peaks and troughs of a streamflow series. Boyle et al. (2000) separate streamflow series into events, although they do model assessment using the raw time step data rather than event scale properties.

The hydromad package provides functions for identifying events based on thresholds and various timing criteria, and applying
[Figure 2: objective function value (roughly 0.66 to 0.72) plotted against the number of function evaluations (0 to 5000), with one trace for each of PORT, BFGS, NM, SANN, SCE, DE and DREAM.]
Fig. 2. Optimisation traces from 7 algorithms in calibrating an 8-parameter model (described in the text) using an R² objective function. DREAM was run using a likelihood function analogous to the objective function. Note that each algorithm was run using default settings only, and some of the results could probably be improved by adjusting these settings.
aggregation functions to events. Events can be characterised by attributes such as total flow or rainfall, antecedent conditions, temperature or season.

Temporal aggregation is more typically undertaken in terms of regular time steps, such as days or months. This is certainly easier, but is arguably less meaningful in hydrological terms, and is susceptible to edge effects. Event-based aggregation requires consideration of how events are to be defined for the purposes at hand. Also, as events may have widely differing durations, in some cases it may be appropriate to weight events by their duration for fitting or assessment.

Events may be defined either from a rainfall series or from a streamflow series (or both). Rainfall-based events are more suitable for assessing SMA model performance, because even when there is no streamflow response to rainfall, this is an important feature of the model. Streamflow-based events are more suitable for investigating streamflow characteristics such as unit hydrograph estimation or flood frequency analysis.

There are several other considerations too: are only the high periods above a threshold of interest, or should such a high period instead initiate an event window which continues until the next such event; or should the high and low periods around a threshold both be considered? Should events be terminated by flow (or rainfall) falling below a threshold, or falling below a threshold for a certain time, or must they be separated from other events by a minimum time? Should events of too short a duration be skipped, or extended? To help one decide these issues, the hydromad package provides a graphical user interface for interactively testing different event definitions on rainfall and streamflow time series.

Several event definitions are used in some of the predefined statistics listed in Appendix A, and throughout this paper. They are also illustrated in Fig. 3. Note that they partition the whole time series, rather than leaving any unallocated gaps.

• e.rain5: events are defined by rainfall exceeding 5 mm per day, and continuing until the next such event. Each single event continues at least until rainfall remains below 1 mm for 4 days: i.e. events are not separated unless this condition is met.
• e.q90: events are defined by observed flow exceeding the 90th percentile level for at least 2 time steps, and continuing until the next such event. Each single event continues at least until flow falls below the 90th percentile level for 4 time steps, and must be separated from the next event by a further 5 time steps.
• e.q90.all: events are defined by observed flow exceeding the 90th percentile level for at least 2 time steps, but unlike e.q90, they do not continue until the next event; rather, these high flow periods and the low flow periods between them are both considered as events. The high events continue until the flow falls below the 90th percentile level for 4 time steps and must be separated from each other by a further 5 time steps.
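A rainfall-based definition in the spirit of e.rain5 can be sketched as follows. This is a simplified Python illustration and not the hydromad implementation: the function names `event_ids` and `aggregate_by_event` and the argument names `thresh`, `inthresh` and `indur` are hypothetical, the rainfall values are invented, and time steps before the first event are labelled 0 as an assumption of this sketch.

```python
def event_ids(x, thresh=5.0, inthresh=1.0, indur=4):
    """Label each time step with an event number.

    A new event starts when x exceeds `thresh`, but only if the series
    has stayed below `inthresh` for at least `indur` consecutive steps.
    Each event window runs until the next event starts.
    """
    ids = []
    current = 0      # 0 marks steps before the first event
    quiet = indur    # consecutive steps below inthresh so far
    for v in x:
        if v > thresh and quiet >= indur:
            current += 1
        quiet = quiet + 1 if v < inthresh else 0
        ids.append(current)
    return ids


def aggregate_by_event(x, ids, fun=sum):
    """Apply an aggregation function to the values in each event."""
    groups = {}
    for v, k in zip(x, ids):
        if k > 0:
            groups.setdefault(k, []).append(v)
    return {k: fun(vals) for k, vals in groups.items()}


# Hypothetical daily rainfall in mm.
rain = [0.0, 6.0, 8.0, 0.0, 0.0, 7.0, 0.0, 0.0, 0.0, 0.0, 9.0, 0.5, 2.0, 0.0]
ids = event_ids(rain)
# ids -> [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2]
totals = aggregate_by_event(rain, ids)
# totals -> {1: 21.0, 2: 11.5}
```

The 7 mm day does not start a new event because only two dry days precede it, whereas the 9 mm day follows four days below 1 mm and so opens the second event.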
[Figure 3: three stacked time series panels for 1983-01 to 1986-01, labelled e.rain5, e.q90 and e.q90.all, showing streamflow (mm/day) with rainfall on a secondary axis and shaded event windows.]
Fig. 3. Section of the Queanbeyan River dataset, showing areal rainfall (P) and streamflow (Q). The event windows corresponding to the e.rain5, e.q90 and e.q90.all statistics are shown. The alternating shading is for visual clarity in marking the events: both light and dark periods represent events.
[Figure 4: two autocorrelation plots, one for the raw series (lag in days, 0 to 30) and one for the event series (lag in events, up to 15).]
Fig. 4. Demonstration of event aggregation of the streamflow data from Queanbeyan River, period 1983-01-01 to 1986-01-01. Events were defined from the rainfall series using the e.rain5 definition. Note, specifically, that the first-order auto-correlation evident in the raw data is reduced in the event series.
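The kind of comparison summarised in Fig. 4 can be sketched numerically. The Python example below is only an illustration under stated assumptions: the pulse sizes, the recession rate of 0.7 and the fixed ten-step event windows are invented, not taken from the Queanbeyan data, and the helper name `acf1` is hypothetical. It computes the lag-1 autocorrelation of a synthetic flow series and of its event sums.

```python
def acf1(x):
    """Lag-1 sample autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Synthetic flow: one input pulse at the start of each window, routed
# through a single store so that flow recedes smoothly between pulses.
pulses = [10.0, 3.0, 1.0, 8.0, 12.0, 2.0, 6.0, 4.0, 9.0, 1.0]
window = 10        # assumed event length in time steps
recession = 0.7    # assumed fraction of flow carried to the next step

raw, store = [], 0.0
for size in pulses:
    for step in range(window):
        inflow = size if step == 0 else 0.0
        store = recession * store + (1.0 - recession) * inflow
        raw.append(store)

# Aggregate the raw series to one total per event window.
event_sums = [sum(raw[k * window:(k + 1) * window]) for k in range(len(pulses))]

raw_r1 = acf1(raw)           # dominated by the smooth recessions
event_r1 = acf1(event_sums)  # depends only on the order of pulse sizes
```

In this synthetic case the raw series is strongly autocorrelated because consecutive steps lie on the same recession, while the event totals inherit no such structure. Real data need not behave as cleanly.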
Finally, the choice of aggregation function is crucial: it is typically the sum or mean, but could also be a quantile or a set of quantiles. Fig. 4 shows that taking flow sums in the e.rain5 event windows does eliminate the auto-correlation evident in the raw time series.

6. Statistics and objective functions

At least as important as the choice of calibration algorithm is the design of an objective function. This requires careful consideration of the intended purpose of the model, how it will be applied, what derived values will be ultimately used, and the level of temporal precision required.

Streamflow data tends to be highly skewed, and this leads implicitly to a large weighting put on a few large observations, a weighting which is often inappropriate for the purposes of the model and the uncertainty in those particular values.

Croke (2009) describes an approach to incorporating uncertainty of individual data values into the formulation of an objective function. This is useful where detailed information on data errors is available. Otherwise, weighting can be managed by a simple transformation of the data: for instance, if we assume that errors in the data are multiplicative, an inverse-variance weighting would correspond to a log transformation. To reduce skewness further, a Box-Cox transform (Box and Cox, 1964) can be chosen such that the observed data approximates a Normal distribution. This is demonstrated in Fig. 5.

In terms of the typical assessment of goodness of fit of a simulated time series to the observed time series, there are three main facets to the design of a statistic:

1. temporal precision/aggregation: either regular time steps may be specified, or hydrological events may be separated in some way. The choice of aggregation function is crucial: it is typically the sum or mean, but could also be a quantile or a set of quantiles.
2. data transformation: to adjust the implicit weighting put on different flow levels, and the corresponding sensitivity of fit statistics.
3. reference model: a null model to use as a reference point to compare model performance (after Nash and Sutcliffe, 1970). This is typically as simple as a constant at the mean observed level, but more informative choices are possible. The choice of reference model does not change the ranking of models, but does change the scale of the statistic.

The hydromad package allows statistics and objective functions to be specified as arbitrary R functions. A helper function is available to streamline the process of defining a statistic based on the above considerations: buildTsObjective() takes optional arguments for grouping, an aggregation function, a data transformation in the Box-Cox family of transformations, and a reference model.

The statistics defined by buildTsObjective(), and most of the pre-defined statistics, are based on a general form of a fit statistic, termed nseStat(). It is based on the familiar $r^2$, a more general form of the Nash-Sutcliffe Efficiency (Nash and Sutcliffe, 1970), defined as

$$\mathrm{nseStat}(Q, X) = r^2 = 1 - \frac{\sum |Q_* - X_*|^2}{\sum |Q_* - Z_*|^2} \qquad (1)$$

where Q and X are the observed and modelled values respectively, possibly in aggregated form; Z is the result from a reference model, which is the baseline for comparison. It defaults to the mean of observed data after transformation, $E(Q_*)$, corresponding to the typical $r^2$ statistic. Subscript * denotes an arbitrary transformation.

The set of pre-defined statistics is listed in Appendix A. These statistics can be used, combined and adapted, either as objective functions or for model evaluation (discussed later).
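As a worked illustration of Eq. (1), the following Python sketch computes the statistic with an optional transformation and an optional reference series. It is not the hydromad implementation of nseStat(); the name `nse_stat` and its argument names are assumptions made for this example.

```python
import math


def nse_stat(obs, mod, ref=None, trans=None):
    """Generalised Nash-Sutcliffe statistic in the form of Eq. (1).

    obs, mod : observed (Q) and modelled (X) values, possibly aggregated.
    ref      : reference model values (Z); defaults to the mean of the
               transformed observations.
    trans    : optional transformation applied to Q, X and Z.
    """
    if trans is not None:
        obs = [trans(q) for q in obs]
        mod = [trans(x) for x in mod]
        if ref is not None:
            ref = [trans(z) for z in ref]
    if ref is None:
        mean_obs = sum(obs) / len(obs)
        ref = [mean_obs] * len(obs)
    model_err = sum((q - x) ** 2 for q, x in zip(obs, mod))
    ref_err = sum((q - z) ** 2 for q, z in zip(obs, ref))
    return 1.0 - model_err / ref_err


# Made-up values: the model misses the smallest and largest flows.
obs = [1.0, 2.0, 4.0, 8.0]
mod = [2.0, 2.0, 4.0, 4.0]
raw_fit = nse_stat(obs, mod)                   # dominated by the peak error
log_fit = nse_stat(obs, mod, trans=math.log2)  # peak and low flow weigh equally
```

In this example the log transform raises the statistic because the error on the largest value no longer dominates the sum of squares, which is the weighting effect described above.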
[Fig. 5 panels: raw, log and box.cox; data (arbitrary scale) against cumulative probability % (Normal distribution).]
Fig. 5. Demonstration of a Box-Cox transformation and log transform of the streamflow data from Queanbeyan River, period 1970-01-01 to 1980-01-01. A reference line is shown
through the 10th and 90th percentiles on the normal probability scale, showing deviations from normality.
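The transformations compared in Fig. 5 can be written down directly. The sketch below follows the Box-Cox form with an offset given in Appendix A; the function names are hypothetical, the nearest-rank percentile rule is an assumption, and the fitting of the power (chosen in the paper to make the data approximately Normal) is not shown.

```python
import math


def nonzero_decile_offset(q):
    """10 percentile of the non-zero values (nearest-rank rule, an assumption)."""
    nonzero = sorted(v for v in q if v > 0)
    rank = max(1, math.ceil(0.1 * len(nonzero)))
    return nonzero[rank - 1]


def box_cox(y, lam, offset=0.0):
    """Box-Cox transform of y + offset; lam == 0 gives the log transform."""
    if lam == 0:
        return math.log(y + offset)
    return ((y + offset) ** lam - 1.0) / lam


# Made-up flows including zero values, which the offset makes transformable.
flows = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
eps = nonzero_decile_offset(flows)
transformed = [box_cox(q, 0.2, eps) for q in flows]
```

The offset keeps zero flows finite under the log and power transforms; as the power approaches zero the Box-Cox transform approaches the log transform.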
7. Data analysis methods

As the models considered here are data-driven, there is a core role for data analysis in model development and assessment. Exploratory data analysis is inherently open-ended and should adapt to the unique problems at hand. As a set of possible starting points for analysis, these types of methods are often useful:

• interactive inspection of time series data. The importance of “eyeballing” the data in detail is sometimes forgotten.
• cross correlation: reveals strength of the linear relationship between pairs of time series, and can be applied over time to identify non-stationarity of the relationship.
• trend estimation by smoothing, where trend can be considered generally as any systematic relationship with covariates, including (but not limited to) time.
• where spatially distributed data is available, such as records from multiple rain gauges, it is often worth checking for spatial effects.

Rainfall patterns are complex, with dynamics from time scales of hours up to decades, and, importantly, with semi-regular seasonal effects. One can easily generate a seasonal and trend decomposition of the rainfall series using the STL algorithm (Cleveland et al., 1990), shown in Fig. 6. This reveals a semi-regular seasonal pattern and a long-term non-seasonal climatic pattern: the period from 1992 onwards appears to be much less variable across years. Unexplained, short-term variation remains and the wet period of 1991, for example, appears inconsistent with the general seasonal pattern.

Of course, many hydrological problems are concerned with the streamflow response to rainfall which occurs at shorter time scales. A good first step in characterising this response is to examine the auto-correlation and cross-correlation functions. These represent average responses. It is also possible to look at distributions of responses in terms of discrete events.

7.1. Regression analysis

Rainfall-runoff dynamics were investigated in the Queanbeyan River dataset. Events were defined by rainfall exceeding 5 mm per day, and continuing until the next such event: the e.rain5 definition from Section 5. To estimate the effective rainfall, rises in streamflow were extracted, and scaled for mass balance over the whole period of record. This simple approach assumes that the unit hydrograph is constant. A more sophisticated analysis could use a time-varying parameter model (e.g. Norton and Chanat, 2005). Once effective rainfall is estimated, it is straightforward to calculate the runoff ratio corresponding to each event, and relate this to other variables. The obvious drivers are rainfall amount and antecedent wetness. Streamflow itself can be used as an index of wetness, assuming a direct storage-discharge relationship (Young, 2003). Temperature or season may also capture some residual effects related to dryness or rainfall intensity.

Generalised Additive Models (GAMs) are suitable for this type of analysis. Rather than assuming a functional form, it is possible to allow the data to define the relationships: using splines as a basis, the degree of smoothness is chosen by generalised cross-validation. This is implemented in the R package mgcv (Wood, 2004).

Fig. 7 shows the estimated effects of the three variables mentioned above. The model was formulated as

    > gam(log2(runoff) ~ s(log2(anteflow)) + s(log2(peakrain))
          + s(temperature))

where s() is the smoothing operator. This implicitly includes an intercept term; therefore the estimated terms represent effects as deviations from the average runoff ratio (on a log scale). The strongest relationship is with antecedent wetness, and this appears to be a power law (linear on a log-log scale). It is also obvious that rainfall intensity increases the runoff ratio for rainfall rates above about 20 mm/day. An additional effect of temperature can be seen, and is significant given the standard error bounds shown.

This type of analysis can be used to inform conceptual model development by revealing the form of the effects that need to be accounted for. We will revisit these effects when assessing model performance in Section 9: by assessing the effectiveness of a model in accounting for each effect we can see which of the process representations needs to be improved. A more sophisticated analysis might also estimate interactions between these variables.

7.2. Delay times

Dynamic hydrological models are typically calibrated with an implicit emphasis on streamflow peaks, and are therefore
[Fig. 6 panels: data, seasonal, trend and remainder; rainfall (mm/month) against year, 1968 to 2004.]
Fig. 6. Seasonal and Trend decomposition by Loess (STL) applied to the monthly areal rainfall series (with a moving average over 3 months) for Tinderry catchment. The Loess
window for seasonal extraction was set to 10 years.
[Fig. 7 panels: log2(anteflow), log2(peakrain) and temperature.]
Fig. 7. Rainfall-runoff proportion, estimated in each event window, related to three potentially relevant variables (antecedent streamflow as an index of wetness, peak rainfall, and
antecedent temperature) by fitting a GAM. The panels show additive effect sizes in terms of deviations from the average runoff ratio, in log units. Standard errors (shaded) and
partial residuals (points) are also shown.
particularly sensitive to timing mismatches between rainfall and streamflow. Most models can only handle a constant delay time. However, it is reasonable to expect that the delay between rainfall and runoff is variable (e.g. due to rainfall intensity, and the timing of rainfall within each time step). Timing mismatches may also in some cases be due to errors in the data.

In some problem contexts, such as estimation of sediment load, it is important to account for high flows but not their timing. In this case all flow peaks could be shifted in line with corresponding rainfall peaks, to match the typical model assumptions.

Using the event-based approach, we can analyse delay times directly. In each event window, the delay time is estimated from the cross correlation function of rainfall with streamflow rises. Event windows with no streamflow response (or a response of less than 0.1 mm/day) were ignored.

The estimated delay times are shown in Table 2. The table is divided according to the magnitude of peak streamflow: it is clear that small runoff events have a longer and more variable delay time (with a median time of 2 days), while large runoff events have a more consistent delay time of 1 day. Whether it is worth trying to model this effect would depend on the problem context. Also, some of the outlying delay times may be worth investigating as possible errors. For instance, negative delay times do not make sense physically so may indicate problems in the data.

Having satisfied some basic checks, we can move on to analyse the dynamics in more detail using conceptual models.

8. Model development and calibration

This section introduces simple conceptual modelling of the catchment system. A 3-year calibration period was chosen to focus the model development: 1983-01-01 to 1986-01-01.

8.1. The CMD model

For a Soil Moisture Accounting model, we start with the IHACRES Catchment Moisture Deficit (CMD) model of Croke and Jakeman (2004). This accounts for evapotranspiration and changes in catchment storage in a mass balance equation:

$$M[t] = M[t-1] - P[t] + E_T[t] + U[t] \qquad (2)$$

where M represents catchment moisture deficit (CMD) in mm, constrained below by 0 (the nominal fully saturated level). P is catchment areal rainfall, $E_T$ is evapotranspiration, and U is drainage (effective rainfall), in units of mm per day.

Rainfall effectiveness (i.e. drainage proportion) is a simple instantaneous function of the CMD level, dropping to zero at M = d. In the default linear form of the model, the rainfall effectiveness dU/dP decreases linearly from the full runoff level at M = 0. In the power law form it is:

$$\frac{dU}{dP} = 1 - \min\left(1, \frac{M}{d}\right)^{h} \qquad (3)$$

where h is the shape parameter. The actual drainage each time step involves the integral of Equation (3):

$$M_f = \begin{cases}
d\left[\left(\frac{M_{t-1}}{d}\right)^{1-h} - \frac{P_t}{d}(1-h)\right]^{\frac{1}{1-h}} & \text{if } M_{t-1} \le d \\
d\left[1 - \frac{P_t - (M_{t-1} - d)}{d}(1-h)\right]^{\frac{1}{1-h}} & \text{if } d < M_{t-1} \le d + P_t \\
M_{t-1} - P_t & \text{if } M_{t-1} > d + P_t
\end{cases} \qquad (4)$$

Similar solutions for the linear and trigonometric forms are given in Croke and Jakeman (2004).

Evapotranspiration, as a proportion of the potential rate E[t], is also a simple function of the CMD level, with a threshold at M = fd:
$$E_T[t] = e\,E[t]\,\min\left(1, \exp\left(2\left(1 - \frac{M_f}{fd}\right)\right)\right) \qquad (5)$$

where $M_f$ is the CMD level M after precipitation and drainage have been accounted for.

Following Croke and Jakeman (2004), we fix the evapotranspiration coefficient e at a reasonable value for use with daily maximum temperature data: 0.166. The flow threshold d has been found not to be a very sensitive parameter, and was fixed at a value of 200 mm. This leaves the stress threshold f and shape parameters to be calibrated. Reasonable ranges were selected for these, shown in Table 3.

Table 2
Distribution of lag times following identified rainfall events (above 5 mm/day), conditioned on the magnitude of the resulting streamflow peak. Each range of magnitudes covers an equal number of events (197), and values shown are percentages in each range. The lowest third of events have peaks below 24 ML/day, and the highest third have peaks above 184 ML/day.

Lag (days)       -5   -4   -3   -2   -1    0    1    2    3    4    5   Sum
lowest third      0    0    0    1    0    1   46   45    3    2    0   100
middle third      0    1    0    0    0    4   66   28    0    0    0   100
highest third     0    0    0    0    0    5   85    8    0    0    0   100

Table 3
Parameters of the CMD Soil Moisture Accounting model, and values used in this study.

f       stress threshold/wilting point (as fraction of d). Range [0.01, 1]
shape   power in drainage equation. Values less than 1 select the linear form; a value of 1 selects the trigonometric form. Range [0, 100]
d       flow threshold (mm). Fixed at 200.
e       evapotranspiration coefficient. Fixed at 0.166.

8.2. Routing component

For the routing component we use an exponential form of the unit hydrograph with two stores in parallel.[3] This form has been found to perform well in a variety of studies (e.g. Jakeman et al., 1990). Each component is defined by a time constant $\tau$ and fractional volume v, or equivalently a recession rate $\alpha$ and peak response $\beta$. They are related as $\alpha = \exp(-1/\tau)$ and $\beta = v(1 - \alpha)$ for each component. When there are two components in parallel, these are conventionally called slow (s) and quick (q) flow components. The total simulated flow X at each time step t is the sum of the two:

$$\begin{aligned}
X_s[t] &= \alpha_s X_s[t-1] + \beta_s U[t] \\
X_q[t] &= \alpha_q X_q[t-1] + \beta_q U[t] \\
X[t] &= X_s[t] + X_q[t]
\end{aligned} \qquad (6)$$

Only one of the two fractional volume parameters needs to be specified; the other is the remainder, ensuring that the total volume is unchanged. Therefore this routing component has 3 free parameters, and the full model has 5 free parameters.

[3] This is using the expuh routing function, which currently handles up to three stores in parallel and/or series.

8.3. Simulation over parameter spaces

The widespread observation of equifinality (Beven, 2006), whereby parameter values cannot be uniquely identified on the basis of the available data, should make us cautious when calibrating models. It can be useful to visualise the objective function surface in parameter space corresponding to the calibration problem. This is a non-trivial problem and there are various possible approaches (Wagener and Kollat, 2007). Apart from the computational demands of random sampling, there are significant cognitive demands in attempting to visualise multi-dimensional functions.

Of particular interest is comparing different objective functions over the parameter space. We generated 2000 stratified random samples over pre-defined parameter ranges, and calculated fit statistics from each simulation: r.squared and r.sq.log, defined in Appendix A. For each statistic, the best 3 simulations were identified, and also what might be termed a 90% coverage set, following Blasone et al. (2008): the smallest set of the best-performing parameter values, such that 90% of the observed streamflow levels fall within the range of the ensemble simulation.

The 90% coverage sets and optimum values are compared between the two objective functions for each pair of parameters in Fig. 8. The results show a reasonable level of agreement between the two statistics, as one would hope to be the case. The parameters shape and tau_s can take on values over their full prior range, although they do constrain the parameter space through their interactions (e.g. high shape and low tau_s is unacceptable according to the r.sq.log statistic). There are differences between the objective functions too: the r.sq.log statistic favours longer time constants and a higher fractional volume of slow-flow ($v_s$). Regarding the SMA component, there appears to be an interaction between the two parameters (f and shape), and an ambiguity in where the optimum lies. While most of the optima are around shape = 0, f = 0.7, which is a linear form of the drainage equation, one result favoured by r.squared is shape = 32, f = 0.8, which is a power law form. In fact, with such a high power the drainage equation is almost a step function. This ambiguity is explored in the following sections.

Random sampling is extremely inefficient for exploring high probability density regions of the parameter space. For models with more parameters, a random sampling approach will not be able to find optimum parameter sets in a reasonable amount of time. Adaptive sampling schemes, such as Markov Chain Monte Carlo (MCMC) methods, have been shown to be essential in such contexts (e.g. Blasone et al., 2008). As such, the hydromad package has a function to estimate feasible parameter sets from the output of the DREAM MCMC algorithm, as well as from purely random sampling.

8.4. Calibration

In order to assess the model performance in detail it is useful to identify two or three models which capture aspects of our uncertainty about the parameter values and representation of processes.

The CMD model structure as described above was fitted to the calibration period data using two contrasting objective functions: the typical r.squared using raw data (Nash and Sutcliffe, 1970); and the same with log-transformed data (r.sq.log). A log transform would be appropriate when assuming multiplicative data errors, and naturally gives less weight to peak flows than with raw data. As above, the observed 10 percentile flow was used as an offset. In this case the PORT method of fitByOptim was used for calibration. The two resulting parameter sets capture the ambiguity in the SMA parameters f and shape noted in the previous section, as can be seen from the values given in Table 4.

One more model was selected to act as a reference. The intensity SMA model estimates runoff from rainfall on the corresponding
Fig. 8. Optimum points and regions of parameter space, projected onto each pair of parameters. Symbols show the best 3 simulations according to each statistic, and lines show
convex hulls around the corresponding 90% coverage sets. Small points represent all 2000 simulations. These simulations used a CMD model with 2-store routing applied to the
period 1983-01-01 to 1986-01-01 of the Queanbeyan River data.
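For readers who want to experiment with the CMD formulation of Section 8.1, one daily step can be sketched in Python as follows. This is not the hydromad implementation: the function name `cmd_step` is hypothetical, the drainage is obtained by integrating Eq. (3) numerically in small rainfall increments instead of using the closed form of Eq. (4), and the shape value h = 2 is an arbitrary illustrative choice. The values d = 200 and e = 0.166 are those fixed in the paper.

```python
import math


def cmd_step(m_prev, p, e_pot, d=200.0, f=0.7, e=0.166, h=2.0, n_sub=1000):
    """One daily step of a power-law CMD model; returns (M, U, ET).

    m_prev : moisture deficit at the previous step (mm)
    p      : rainfall (mm); e_pot : potential ET driver E[t]
    h      : shape parameter (illustrative value, an assumption)
    """
    m = m_prev
    dp = p / n_sub
    for _ in range(n_sub):
        # Eq. (3): dU/dP = 1 - min(1, M/d)^h, hence dM/dP = -min(1, M/d)^h.
        m -= dp * min(1.0, m / d) ** h
    m_f = max(m, 0.0)  # deficit after rainfall and drainage
    # Rainfall that did not reduce the deficit becomes drainage.
    u = p - (m_prev - m_f)
    # Eq. (5): ET falls away exponentially once M_f exceeds f * d.
    et = e * e_pot * min(1.0, math.exp(2.0 * (1.0 - m_f / (f * d))))
    # Eq. (2): M[t] = M[t-1] - P + ET + U, which equals m_f + et.
    return m_f + et, u, et


m_dry, u_dry, et_dry = cmd_step(300.0, 10.0, 20.0)   # deficit well above d
m_wet, u_wet, et_wet = cmd_step(0.0, 10.0, 20.0)     # saturated catchment
m_mid, u_mid, et_mid = cmd_step(100.0, 20.0, 0.0)    # partial drainage
```

A dry catchment (M above d + P) absorbs all rainfall, a saturated one passes it all on as drainage, and intermediate states split the rainfall according to Eq. (3).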
time step only, neglecting any consideration of antecedent conditions. In terms of the effects shown in Fig. 7, only the peak rainfall effect in the middle panel is included. This provides a useful reference, because any improvement over this can be largely attributed to the dynamic process representation of the CMD model. The intensity model is defined as:

$$\frac{U}{P} = c\,\min\left(1, \frac{P}{P_{max}}\right)^{g} \qquad (7)$$

This model, together with the same two-store routing component used with the CMD model, was calibrated to the r.squared objective function. Calibrated parameter values were g = 0.8, $P_{max}$ = 160, and c = 1. The unit hydrograph routing parameters are given in Table 4.

Table 4
Calibrated parameters for models calibrated to the period 1983-01-01 to 1986-01-01 for the Queanbeyan River catchment.

             f      shape   tau_s   tau_q   v_s    delay
cmd_raw      0.79   32      300     0.88    0.58   1
cmd_log      0.70   1       203     2.99    0.39   1
intensity                   12      0.86    0.61   1

Table 5
Fit statistics for three candidate models calibrated to the Queanbeyan River data.

(a) Calibration period 1983 to 1986
             r.squared   r.sq.log   e.rain5   e.rain5.log   e.q90   e.q90.all
cmd_raw      0.72        0.62       0.83      0.75          0.66    0.74
cmd_log      0.54        0.77       0.81      0.82          0.70    0.81
intensity    0.67        0.39       0.65      0.58          0.64    0.75

(b) Wet period 1970 to 1979
             r.squared   r.sq.log   e.rain5   e.rain5.log   e.q90   e.q90.all
cmd_raw      0.60        0.66       0.61      0.79          0.52    0.55
cmd_log      0.40        0.83       0.60      0.88          0.50    0.58
intensity    0.59        0.31       0.59      0.54          0.33    0.51

(c) Dry period 1990 to 1999
             r.squared   r.sq.log   e.rain5   e.rain5.log   e.q90   e.q90.all
cmd_raw      0.73        0.53       0.79      0.71          0.75    0.76
cmd_log      0.54        0.75       0.80      0.83          0.79    0.80
intensity    0.60        0.15       0.68      0.34          0.52    0.65

A sample of the simulated time series from the two calibrated CMD models, compared to the observed streamflow time series, is shown in Fig. 9. As the plot is on a log scale, it highlights the low flow dynamics; other aspects of the model fit are considered in Section 9. The most obvious feature of the plot is that the r.squared calibration is over-estimating most of the low flows and lacking any response to the smaller peaks. This reflects the skewness in the raw data and the consequent overpowering influence of peak values on the objective function. Somewhat surprisingly, many of the large peaks in the period shown are
[Fig. 9 series: observed, cmd_raw and cmd_log; flow on a log scale, 1983-07 to 1986-01.]
Fig. 9. Fitted streamflow time series from IHACRES CMD models, compared to the observed time series.
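The intensity reference model of Eq. (7) and the two-store routing of Eq. (6) are simple enough to sketch together in Python. This is an illustration, not the hydromad code: the function and argument names are hypothetical, the delay parameter of Table 4 is ignored, and the parameter values are the calibrated intensity-model values reported in the text and Table 4.

```python
import math


def intensity_sma(rain, power=0.8, p_max=160.0, scale=1.0):
    """Eq. (7): effective rainfall U = c * P * min(1, P / Pmax)^g."""
    return [scale * p * min(1.0, p / p_max) ** power for p in rain]


def expuh_two_store(u, tau_s, tau_q, v_s):
    """Eq. (6): two exponential stores in parallel (slow and quick flow)."""
    a_s, a_q = math.exp(-1.0 / tau_s), math.exp(-1.0 / tau_q)
    # The quick-flow volume is the remainder, so total volume is conserved.
    b_s, b_q = v_s * (1.0 - a_s), (1.0 - v_s) * (1.0 - a_q)
    x_s = x_q = 0.0
    flow = []
    for u_t in u:
        x_s = a_s * x_s + b_s * u_t
        x_q = a_q * x_q + b_q * u_t
        flow.append(x_s + x_q)
    return flow


# Response to a unit pulse of effective rainfall (the unit hydrograph).
impulse = [1.0] + [0.0] * 1999
unit_hydrograph = expuh_two_store(impulse, tau_s=12.0, tau_q=0.86, v_s=0.61)

# Effective rainfall for a made-up rainfall series (mm/day).
effective = intensity_sma([0.0, 16.0, 160.0])
```

The unit hydrograph sums to one, which reflects the statement in Section 8.2 that specifying one fractional volume and taking the other as the remainder leaves the total volume unchanged.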
also over-estimated by the r.squared case, though this is not generally the case, as we will see. Both models fail to reproduce the switching off of baseflow in 2003 and 2004 (flow falling to zero for extended periods; not shown), as there is no such mechanism in the model formulation.

The next section will investigate the performance of the three candidate models in detail, with a view to further model development.

9. Model performance assessment

The typical approach to model performance assessment is to calculate fit statistics. Fit statistics for the calibration period are given in Table 5a. The same fit statistics were also calculated over two verification periods (the relatively wet decade of the 1970s and the relatively dry decade of the 1990s) and are listed in Tables 5b and c. The statistics are defined in Appendix A.

The fit statistics show that, while the r.squared model does best in terms of its own objective function, its improvement over the intensity reference model appears to be modest in the wet period, but more pronounced in the dry period. In contrast, the r.sq.log model appears to do much better than the others in terms of its objective function, and also in terms of all the event-based statistics, even those without a log transformation of the data.

It is easier to see the pattern of the fit statistics when it is displayed graphically, and with finer time resolution. Fig. 10 shows the r.squared and r.sq.log statistics in each calendar year for the three models. An interesting feature that can be seen is that the cmd_log model (i.e. that calibrated to r.sq.log) actually performs better than the cmd_raw model in terms of r.squared in many years: specifically in dry years such as 1993, 1994 and 1996. In wet years the cmd_log model does less well, whereas the intensity model is often one of the best performers on this measure. The story is different on the measure of r.sq.log, where the performance is dominated by the corresponding cmd_log model.

The fit statistics we have considered so far have been based on comparing observed and simulated values at a daily time step. It is often of interest also to consider the model performance at a longer time scale, to reveal the size and direction of any systematic biases. Fig. 11 shows this in a plot of model residuals smoothed over a time window of around 1 year. Both raw and log scale residuals are
[Fig. 10 panels: r.squared, r.sq.log and runoff, by year from 1970 to 2000, for the cmd_raw, cmd_log and intensity models.]
Fig. 10. Fit statistics calculated in simulation over each calendar year for three candidate models calibrated to the Queanbeyan River data. The calibration period is shaded.
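The per-year breakdown shown in Fig. 10 amounts to grouping the daily series by calendar year and computing the statistic within each group. A minimal Python sketch of that grouping is given here; the function names are hypothetical and the four data points are made up purely to show the mechanics.

```python
from collections import defaultdict
from datetime import date


def nse(obs, mod):
    """Nash-Sutcliffe efficiency with the observed mean as reference."""
    mean_obs = sum(obs) / len(obs)
    model_err = sum((q - x) ** 2 for q, x in zip(obs, mod))
    ref_err = sum((q - mean_obs) ** 2 for q in obs)
    return 1.0 - model_err / ref_err


def nse_by_year(dates, obs, mod):
    """Compute the statistic separately within each calendar year."""
    groups = defaultdict(lambda: ([], []))
    for day, q, x in zip(dates, obs, mod):
        groups[day.year][0].append(q)
        groups[day.year][1].append(x)
    return {year: nse(q, x) for year, (q, x) in sorted(groups.items())}


dates = [date(1983, 12, 30), date(1983, 12, 31),
         date(1984, 1, 1), date(1984, 1, 2)]
obs = [1.0, 3.0, 2.0, 6.0]
mod = [1.0, 3.0, 3.0, 5.0]
yearly = nse_by_year(dates, obs, mod)
```

Because the reference mean is recomputed within each year, a yearly value measures skill relative to that year's own mean flow, so values are not directly comparable with a statistic computed over the whole record.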
[Fig. 11 panels: smoothed raw residuals, smoothed log residuals and observed streamflow, 1970 to 2000, for the cmd_raw, cmd_log and intensity models; the residual axes are annotated (underestimated) and (overestimated).]
Fig. 11. Time series of model residuals, both raw and log-transformed (with an offset). The daily residuals are smoothed over an effective bandwidth of 1 year with a triangular
kernel. This reveals longer-term biases while hiding any short-term errors. The calibration period is shaded.
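Smoothing with a triangular kernel, as used for the residuals in Fig. 11 and for the r.sq.smooth5 statistic of Appendix A, can be sketched in Python as below. The function names are hypothetical, and how hydromad defines the "effective bandwidth" of the kernel is not reproduced here; the `half_width` argument is an assumption of this sketch.

```python
def triangular_weights(half_width):
    """Normalised triangular kernel; half_width=3 gives (1,2,3,2,1)/9."""
    raw = [half_width - abs(k) for k in range(-half_width + 1, half_width)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth(values, half_width):
    """Centred weighted moving average; None where the window is incomplete."""
    weights = triangular_weights(half_width)
    reach = half_width - 1
    out = []
    for i in range(len(values)):
        if i < reach or i + reach >= len(values):
            out.append(None)
        else:
            window = values[i - reach:i + reach + 1]
            out.append(sum(w * v for w, v in zip(weights, window)))
    return out


weights5 = triangular_weights(3)
residuals = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 0.0]   # a single isolated error
smoothed = smooth(residuals, 3)
```

An isolated daily error is spread over neighbouring steps and shrinks in magnitude, while a persistent bias passes through unchanged, which is why the smoothed residuals highlight longer-term biases.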
considered, by analogy with Fig. 10. Indeed, the story is similar at this scale: the cmd_raw model does best on the raw scale in wet years, but does less well than cmd_log in dry years. All models have drastically underestimated the extreme response in years 1974 to 1975.

This kind of model performance assessment is quite detailed, and is able to show which models are preferred, over time and from different perspectives. However, it is ultimately unsatisfying because it does not provide much guidance in the task of diagnosing model structural problems and developing an improved model. For this, we can invoke an event-based analysis and consider the model residuals in relation to specific features of the data.

Building on the empirical data analysis of Fig. 7, we can assess model residuals, aggregated up to total flow in event windows, in relation to the same three covariates: antecedent flow level, peak rainfall, and temperature. If the model were explaining the data well, then there should not be a systematic relationship with these covariates. It is useful to include additional covariates, to provide more information about each event: an index of antecedent rainfall (moving average over preceding 30 days), the peak effective rainfall as simulated by each model, and the mean observed flow. Figs. 12 and 13 show the estimated relationships between model residuals and these covariates derived from simulation over the whole Queanbeyan River time series. Residuals were calculated as the
[Fig. 12 panels: log2 ante. flow, log2 max rain, mean temp. (deg. C), sqrt 30-day ante. rain, log2 max eff. rain (U) and log2 mean flow; y-axis: residual flow sums in event windows (mm), annotated (underestimated) and (overestimated); x-axis: event-aggregated covariate values; models: cmd_log, cmd_raw and intensity.]
Fig. 12. Marginal effects of variables on model residual flow sums in rain5 event windows. These are GAM smooths fitted independently to the residuals against each covariate. All
covariates are in units of mm/day (before transformation) except temperature which is in degrees Celsius.
[Fig. 13 panels: log2 ante. flow, log2 max rain, mean temp. (deg. C), sqrt 30-day ante. rain, log2 max eff. rain (U) and log2 mean flow; y-axis: residual sums of log flow in event windows, annotated (underestimated on log scale) and (overestimated on log scale); x-axis: event-aggregated covariate values; models: cmd_log, cmd_raw and intensity.]
Fig. 13. Marginal effects of variables on model log scale residual flow sums in rain5 event windows. These are GAM smooths fitted independently to the residuals against each
covariate. All covariates are in units of mm/day (before transformation) except temperature which is in degrees Celsius.
difference between total observed and total simulated flow in all rain5 event windows (defined in Section 5).

The results show that the cmd_raw model does best in accounting for antecedent conditions on the raw data scale, and even on the log scale under wet conditions but not dry conditions. Both CMD models provide a significant improvement over the intensity reference model, which does not account explicitly for antecedent conditions. On the other hand, as might be expected, the intensity model does best in accounting for high intensity rainfall (and effective rainfall), although it is still subject to an obvious systematic effect. The cmd_log model is clearly accounting for most effects better than the other models when log scale residuals are considered. It is still subject to systematic effects with respect to antecedent flow, mean temperature, and mean observed flow; all of these are likely to be correlated.

By examining and comparing systematic errors in the model residuals we can think about how to develop improved models. For instance, there is room for improvement in accounting for antecedent wetness. Also, we have shown that runoff from high rainfall intensity events is being underestimated, and that even a simple intensity model can lead to some improvement in that respect.

10. Conclusions and outlook

The hydromad package attempts to offer the modeller a simple, flexible and open software environment, integrated with the R system, for working with hydrological models. For rapid use it includes a set of predefined models, calibration algorithms, and fit statistics. We have shown how an event-based data analysis approach can diagnose model deficiencies and guide model development. It is useful to take a comparative approach in this regard, comparing model performance with one or more reference models.

The package has some support for uncertainty estimation based on parameter sampling. Markov Chain Monte Carlo methods may be used for efficient adaptive sampling, with either a formal or an informal likelihood function. The resulting ensemble of parameter sets may be simulated on new data to produce point-wise confidence bounds. The potential exists to implement the more efficient bootstrap-based methods of Selle and Hannah (2010).

As hydromad and R are free software, we hope that further methodological innovation and associated software development will come from the wider community.

Computational details

The results in this paper were obtained using R 2.11.1 with the packages hydromad 0.9-7, zoo 1.6-3 (Zeileis and Grothendieck, 2005), mgcv 1.6-2 (Wood, 2004), lattice 0.19-19 (Sarkar, 2010), and latticeExtra 0.6-15 (Sarkar and Andrews, 2010). R itself and all packages used are available from CRAN at http://CRAN.R-project.org/.

The code used to produce the results and figures in this paper is available from http://www.nfrac.org/felix/papers/.

Acknowledgements

Hak-Soo Kim discovered the interestingness of the Queanbeyan River catchment, and generously provided the estimated areal rainfall data.

Appendix A. Statistics

A set of pre-defined statistics is provided in the hydromad package and is listed below. These are not intended to be a comprehensive set, but rather for use as quick examples and to demonstrate how to define such functions.

Raw time step based statistics:

bias: bias in data units, $\sum (X - Q)$
rel.bias: bias as a fraction of the total observed flow, $\sum (X - Q) / \sum Q$
abs.err: the mean absolute error, $E|X - Q|$
RMSE: Root Mean Squared Error, $\sqrt{E(X - Q)^2}$
r.squared: R Squared (Nash-Sutcliffe Efficiency),
$1 - \sum (Q - X)^2 / \sum (Q - \bar{Q})^2$
r.sq.sqrt: R Squared using square-root transformed data: $1 - \sum |\sqrt{Q} - \sqrt{X}|^2 / \sum |\sqrt{Q} - \overline{\sqrt{Q}}|^2$
r.sq.log: R Squared using log-transformed data, with an offset: $1 - \sum |\log(Q + \epsilon) - \log(X + \epsilon)|^2 / \sum |\log(Q + \epsilon) - \overline{\log(Q + \epsilon)}|^2$. Here $\epsilon$ is the 10 percentile (i.e. lowest decile) of the non-zero values of Q.
r.sq.boxcox: R Squared using a Box-Cox transform (Box and Cox, 1964). The power $\lambda$ is chosen to fit Q to a normal distribution. When $\lambda = 0$ it is a log transform; otherwise it is $y_* = ((y + \epsilon)^{\lambda} - 1) / \lambda$. Here $\epsilon$ is the 10 percentile (i.e. lowest decile) of the non-zero values of Q.
r.sq.diff: R Squared using differences between successive time steps, i.e. rises and falls.
r.sq.smooth5: R Squared using data smoothed with a triangular kernel of width 5 time steps: c(1,2,3,2,1)/9.
persistence: R Squared where the reference model predicts each time step as the previous observed value. This statistic therefore represents a model’s performance compared to a naive one-time step forecast.
r.sq.seasonal: R Squared where the reference model is the mean in each calendar month, rather than the default which is the overall mean.
r.sq.vs.tf: R Squared where the reference model is a second-order transfer function (two stores in parallel) fitted directly to the rainfall data, i.e. assuming a constant runoff ratio. This indicates the marginal improvement of including the SMA model component.
r.sq.vs.tf.bc: R Squared using a Box-Cox transform where the reference model is a second-order transfer function (two stores in parallel) fitted directly to the rainfall data, i.e. assuming a constant runoff ratio. This indicates the marginal improvement of including the SMA model component.
X0: correlation of modelled flow with the model residuals.

e.q90.log: same as e.q90 but the event totals are log-transformed.
e.q90.bc: same as e.q90 but the event totals are Box-Cox-transformed.
e.q90.all: R Squared of flow totals in and between each event, defined by observed flow exceeding the 90 percentile level for at least 2 time steps, and continuing until the flow falls below the 90 percentile level for 4 time steps. Events must be separated by a further 5 time steps. An example of this event definition is shown in Fig. 3.
e.q90.all.log: same as e.q90.all but the event totals are log-transformed.
e.q90.all.bc: same as e.q90.all but the event totals are Box-Cox-transformed.
e.q90.min: the same as e.q90 but instead of flow totals in each event, the minimum flow in each event is extracted.
e.q90.min.log: the same as e.q90.min but flow minima are log-transformed (with the non-zero 10 percentile of these minima as an offset).

References

Andrews, F.T., Croke, B.F.W., Jeanes, K.W., 2010. Robust estimation of the total unit hydrograph. In: Swayne, D., Et (Eds.), Proceedings of the IEMSs Fifth Biennial Meeting: "Summit on Environmental Modelling and Software". International Environmental Modelling and Software Society, Ottawa, Ontario, Canada.
Ardia, D., Mullen, K., 2010. The ‘DEoptim’ Package: Differential Evolution Optimization in ‘R’. R package version 2.0-5.
Bai, Y., Wagener, T., Reed, P., 2009. A top-down framework for watershed model evaluation and selection under uncertainty. Environmental Modelling & Software 24, 901-916.
Bates, D., 2008. Comment on Leland Wilkinson, “The future of statistical computing”. Technometrics 50, 439-440.
Becker, R.A., Chambers, J.M., Wilks, A.R., 1988. The New S Language: a Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks/Cole, Pacific Grove, CA, USA.
Beven, K., 2006. A manifesto for the equifinality thesis. Journal of Hydrology 320, 18-36.
Beven, K., 2008. On doing better hydrological science. Hydrological Processes 22, 3549-3553.
X1: correlation of modelled flow with the model residuals from Blasone, R., Vrugt, J., Madsen, H., Rosbjerg, D., Robinson, B., Zyvoloski, G., 2008.
the previous time step. Generalized likelihood uncertainty estimation (glue) using adaptive markov
ARPE: Average Relative Parameter Error. Requires that a vari- chain monte carlo sampling. Advances in Water Resources 31, 630e648.
Boughton, W., 2004. The australian water balance model. Environmental Modelling
ance-covariance matrix was estimated during calibration. & Software 19, 943e956.
Box, G.E.P., Cox, D.R., 1964. An analysis of transformations. Journal of the Royal
Statistical Society, Series B (Methodological) 26, 211e252.
Aggregated and event-based statistics: Boyle, D.P., Gupta, H.V., Sorooshian, S., 2000. Toward improved calibration of
hydrologic models: combining the strengths of manual and automatic methods.
Water Resources Research 36, 3663e3674.
r.sq.monthly: R Squared with data aggregated into calendar Burnash, R.J.C., 1995. The NWS River Forecast System e Catchment Modeling. Water
months. Resources Publications, Highlands Ranch, Colo (Revised edition).
Clark, M.P., Slater, A.G., Rupp, D.E., Woods, R.A., Vrugt, J.A., Gupta, H.V., Wagener, T.,
e.rain5: R Squared of flow totals in each event, defined by
Hay, L.E., 2008. Framework for understanding structural errors (fuse):
rainfall exceeding 5 mm per time step, and continuing until the a modular framework to diagnose differences between hydrological models.
next such event. Each single event continues at least until Water Resources Research 44, W00B02þ.
Cleveland, R.B., Cleveland, W.S., McRae, J.E., Terpenning, I., 1990. Stl: a seasonal-
rainfall remains below 1 mm for 4 time steps. An example of
trend decomposition procedure based on loess. Journal of Official Statistics 6,
this event definition is shown in Fig. 3. 3e73.
e.rain5.log: same as e.rain5 but the event totals are log- Croke, B.F.W., 2006. A technique for deriving an average event unit hydrograph
transformed (with the non-zero 10 percentile of these totals from streamflow-only data for ephemeral quick-flow-dominant catchments.
Advances in Water Resources 29, 493e502.
as an offset). Croke, B.F.W. 2009. Representing uncertainty in objective functions: extension to
e.rain5.bc: same as e.rain5 but the event totals are Box-Cox- include the influence of serial correlation. In: Anderssen, R.S., Braddock, R.D.,
transformed (with the non-zero 10 percentile of these totals Newham, L.T.H. (Eds.), 18th World IMACS Congress and MODSIM09 Interna-
tional Congress on Modelling and Simulation, Modelling and Simulation Society
as an offset). of Australia and New Zealand and International Association for Mathematics
e.q90: R Squared of flow totals in each event, defined by and Computers in Simulation. pp. 3372e3378.
observed flow exceeding the 90 percentile level for at least 2 Croke, B.F.W., Andrews, F., Jakeman, A.J., Cuddy, S., Luddy, A. 2005. Redesign of the
ihacres rainfall-runoff model, in: 29th Hydrology and Water Resources
time steps, and continuing until the next such event. Each Symposium.
single event continues at least until flow falls below the 90 Croke, B.F.W., Jakeman, A.J., 2004. A catchment moisture deficit module for the
percentile level for 4 time steps, and must be separated from ihacres rainfall-runoff model. Environmental Modelling and Software 19, 1e5.
Dawson, C., Abrahart, R., See, L., 2007. Hydrotest: a web-based toolbox of evaluation
the next event by a further 5 time steps. An example of this
metrics for the standardised assessment of hydrological forecasts. Environ-
event definition is shown in Fig. 3. mental Modelling & Software 22, 1034e1052.
Dawson, C.W., Abrahart, R.J., See, L.M., 2010. Hydrotest: further development of a web resource for the standardised assessment of hydrological models. Environmental Modelling & Software 25, 1481–1482.
Duan, Q., Sorooshian, S., Gupta, V., 1992. Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research 28, 1015–1031.
Fenicia, F., Savenije, H.H.G., Matgen, P., Pfister, L., 2008. Understanding catchment behavior through stepwise model concept improvement. Water Resources Research 44, W01402+.
Fox, J., 2008. Editorial. The R Journal 8/2, 1–2.
Gentleman, R., Temple Lang, D., 2007. Statistical analyses and reproducible research. Journal of Computational and Graphical Statistics 16, 1–23.
Herron, N., Croke, B., 2009. Including the influence of groundwater exchanges in a lumped rainfall-runoff model. Mathematics and Computers in Simulation 79, 2689–2700.
Hyndman, R.J., 2010. Forecast: forecasting functions for time series. R package version 2.05.
Ihaka, R., Gentleman, R., 1996. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 5, 299–314.
Jakeman, A., Letcher, R., Norton, J., 2006. Ten iterative steps in development and evaluation of environmental models. Environmental Modelling & Software 21, 602–614.
Jakeman, A.J., Hornberger, G.M., 1993. How much complexity is warranted in a rainfall-runoff model? Water Resources Research 29, 2637–2649.
Jakeman, A.J., Littlewood, I.G., Whitehead, P.G., 1990. Computation of the instantaneous unit hydrograph and identifiable component flows with application to two small upland catchments. Journal of Hydrology 117, 275–300.
Katz, R.W., Parlange, M.B., Naveau, P., 2002. Statistics of extremes in hydrology. Advances in Water Resources 25, 1287–1304.
Kim, H.S., Croke, B.F.W., Jakeman, A.J., Chiew, F.H.S., Mueller, N., 2007. Towards separation of climate and land use effects on hydrology: data analysis of the Googong and Cotter catchments. In: Oxley, L., Kulasiri, D. (Eds.), MODSIM 2007 International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, pp. 74–80.
Kokkonen, T., Koivusalo, H., Jakeman, A., Norton, J., 2006. Construction of a degree-day snow model in the light of the “ten iterative steps in model development”. In: Voinov, A., Jakeman, A.J., Rizzoli, A.E. (Eds.), Proceedings of the IEMSs Third Biennial Meeting: “Summit on Environmental Modelling and Software”. International Environmental Modelling and Software Society, Burlington, USA.
Leavesley, G.H., Markstrom, S.L., Viger, R.J., 2006. USGS Modular Modeling System (MMS) – Precipitation-Runoff Modeling System (PRMS). CRC Press, pp. 159–177.
Littlewood, I.G., Croke, B.F.W., Jakeman, A.J., Sivapalan, M., 2003. The role of ‘top-down’ modelling for prediction in ungauged basins (PUB). Hydrological Processes 17, 1673–1679.
Ljung, L. (Ed.), 1999. System Identification: Theory for the User, second ed. Prentice Hall.
McIntyre, N., Al-Qurashi, A., 2009. Performance of ten rainfall–runoff models applied to an arid catchment in Oman. Environmental Modelling & Software 24, 726–738.
Nash, J., Sutcliffe, J., 1970. River flow forecasting through conceptual models part I - a discussion of principles. Journal of Hydrology 10, 282–290.
Norton, J., Chanat, J., 2005. Linear time-varying models to investigate complex distributed dynamics: a rainfall-runoff example. Mathematics and Computers in Simulation 69, 123–134.
Perrin, C., Michel, C., Andreassian, V., 2003. Improvement of a parsimonious model for streamflow simulation. Journal of Hydrology 279, 275–289.
Petris, G., Petrone, S., Campagnoli, P., 2009. Dynamic Linear Models with R. Use R!, Springer.
Price, K., Storn, R.M., Lampinen, J.A. (Eds.), 2005. Differential Evolution: a Practical Approach to Global Optimization (Natural Computing Series), One ed. Springer.
R Development Core Team, 2010. R: a Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Sarkar, D. (Ed.), 2008. Lattice: Multivariate Data Visualization with R (Use R), One ed. Springer.
Sarkar, D., 2010. lattice: Lattice Graphics. R package version 0.18-5.
Sarkar, D., Andrews, F., 2010. latticeExtra: Extra Graphical Utilities Based on Lattice. R package version 0.6-12.
Savenije, H.H.G., 2009. HESS opinions: the art of hydrology. Hydrology and Earth System Sciences 13, 157–161.
Selle, B., Hannah, M., 2010. A bootstrap approach to assess parameter uncertainty in simple catchment models. Environmental Modelling & Software 25, 919–926.
Sivapalan, M., Blöschl, G., Zhang, L., Vertessy, R., 2003. Downward approach to hydrological prediction. Hydrological Processes 17, 2101–2111.
Viviroli, D., Zappa, M., Gurtz, J., Weingartner, R., 2009. An introduction to the hydrological modelling system PREVAH and its pre- and post-processing-tools. Environmental Modelling & Software 24, 1209–1222.
Vrugt, J.A., ter Braak, C.J.F., Diks, C.G.H., Robinson, B.A., Hyman, J.M., Higdon, D., 2009. Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling. International Journal of Nonlinear Sciences and Numerical Simulation 10, 273–290.
Wagener, T., Boyle, D.P., Lees, M.J., Wheater, H.S., Gupta, H.V., Sorooshian, S., 2001. A framework for development and application of hydrological models. Hydrology and Earth System Sciences 5, 13–26.
Wagener, T., Kollat, J., 2007. Numerical and visual evaluation of hydrological and environmental models using the Monte Carlo analysis toolbox. Environmental Modelling & Software 22, 1021–1033.
Wagener, T., Reed, P., van Werkhoven, K., Tang, Y., Zhang, Z., 2009. Advances in the identification and evaluation of complex environmental systems models. Journal of Hydroinformatics 11.
Wheater, H.S., Jakeman, A.J., Beven, K.J., 1993. Progress and Directions in Rainfall-runoff Modelling. John Wiley, Chichester, pp. 101–132.
Wickham, H., 2009. ggplot2: Elegant Graphics for Data Analysis. Springer, New York.
Willems, P., 2009. A time series tool to support the multi-criteria performance evaluation of rainfall-runoff models. Environmental Modelling & Software 24, 311–321.
Williams, G.J., 2009. Rattle: a data mining GUI for R. The R Journal 1, 45–55.
Wood, S.N., 2004. Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association 99, 673–686.
Ye, W., Bates, B.C., Viney, N.R., Sivapalan, M., Jakeman, A.J., 1997. Performance of conceptual rainfall-runoff models in low-yielding ephemeral catchments. Water Resources Research 33, 153–166.
Young, P., 2003. Top-down and data-based mechanistic modelling of rainfall-flow dynamics at the catchment scale. Hydrological Processes 17, 2195–2217.
Young, P.C., 2008. The refined instrumental variable method. Journal Européen des Systèmes Automatisés 42, 149–179.
Young, P.C., Ratto, M., 2008. A unified approach to environmental systems modeling. Stochastic Environmental Research and Risk Assessment.
Zeileis, A., Grothendieck, G., 2005. zoo: S3 infrastructure for regular and irregular time series. Journal of Statistical Software 14, 1–27.
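
Supplementary sketch. The R Squared family of fit statistics defined in this paper shares one form: one minus the ratio of the model's squared error to the squared error of a reference model, optionally after transforming the data. The following is a minimal illustration of that form in plain Python, not the package's own R code. All function names are hypothetical, the interpolation rule used for the lowest decile is an assumption, and the Box-Cox power is passed in rather than fitted.

```python
import math
from statistics import mean


def r_squared(obs, mod, ref=None):
    """1 - sum((Q - X)^2) / sum((Q - ref)^2); ref defaults to the mean of Q."""
    if ref is None:
        ref = [mean(obs)] * len(obs)
    num = sum((q - x) ** 2 for q, x in zip(obs, mod))
    den = sum((q - r) ** 2 for q, r in zip(obs, ref))
    return 1.0 - num / den


def decile_offset(obs):
    """Lowest decile of the non-zero values of Q.

    Linear interpolation between order statistics is an assumption here.
    """
    nz = sorted(q for q in obs if q > 0)
    pos = 0.1 * (len(nz) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(nz) - 1)
    return nz[lo] + (pos - lo) * (nz[hi] - nz[lo])


def box_cox(y, lam, eps):
    """Box-Cox transform with offset eps; reduces to a log transform when lam == 0."""
    if lam == 0:
        return math.log(y + eps)
    return ((y + eps) ** lam - 1.0) / lam


def r_sq_log(obs, mod):
    """R Squared on log-transformed data, offset by the lowest non-zero decile of Q."""
    eps = decile_offset(obs)
    return r_squared([math.log(q + eps) for q in obs],
                     [math.log(x + eps) for x in mod])


def r_sq_diff(obs, mod):
    """R Squared on differences between successive time steps (rises and falls)."""
    def diff(v):
        return [b - a for a, b in zip(v, v[1:])]
    return r_squared(diff(obs), diff(mod))


def persistence(obs, mod):
    """R Squared where the reference forecast is the previous observed value."""
    return r_squared(obs[1:], mod[1:], ref=obs[:-1])


if __name__ == "__main__":
    observed = [1.0, 2.0, 4.0, 3.0, 0.0, 1.0]   # made-up flows, illustration only
    modelled = [1.0, 2.5, 3.5, 3.0, 0.5, 1.0]
    for stat in (r_squared, r_sq_log, r_sq_diff, persistence):
        print(stat.__name__, round(stat(observed, modelled), 4))
```

Passing the reference series as an argument keeps the seasonal and transfer-function variants within the same function: only `ref` changes, while the transformed variants change only the inputs.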
