
Int. J. Environ. Sci. Technol.

DOI 10.1007/s13762-015-0800-7

ORIGINAL PAPER

Simulation, evaluation and prediction modeling of river water quality properties (case study: Ireland Rivers)
E. S. Salami1 • M. Ehteshami2

Received: 24 May 2014 / Revised: 28 January 2015 / Accepted: 15 March 2015


© Islamic Azad University (IAU) 2015

Abstract In this analysis, three input parameters (temperature, pH and electrical conductivity) were chosen because they are easy and inexpensive to measure, and a package of six models was developed for estimating the concentrations of dissolved oxygen, DO percentage, biological oxygen demand, chloride, alkalinity and total hardness. A total of 3001 data sets (a 3001 × 8 data array) were used to train the models. The models were tested in order to verify their predictions, and the resulting R factor (the rate of precision) for the six models equals 0.93, 0.95, 0.77, 0.82, 0.85 and 0.92, respectively. This indicates that the package can estimate the concentrations of water quality parameters with accuracy close to reality. The river data were collected from 210 monitoring stations located all over Ireland. The data set covers different conditions, which makes the models applicable in many different places and conditions. All models were developed as feed-forward networks, with the Levenberg–Marquardt algorithm as the learning function and tansig(x) as the transfer function.

Keywords Artificial neural networks · Ireland Rivers · Modeling · Water characteristics · Water quality

✉ M. Ehteshami
[email protected]
1 KN Toosi University of Technology, Tehran, Iran
2 Environmental Engineering Department, KN Toosi University of Technology, P.O Box 1587-544-16, Tehran, Iran

Introduction

The situation of water resources in the world becomes more challenging every year. During the last century, water consumption grew at twice the rate of population increase (Stockholm International Water Institute and Elsevier 2012). In addition, the complexity of managing natural resources generally increases in parallel with human population growth (Varnell et al. 2008). Assessment of properties and processes related to running waters is a major issue in the management of aquatic environments (Schleiter et al. 1999). As a result, accurate determination of the concentration of nutrients and other substances in water bodies is an essential requirement for supporting effective management and legislation with regard to such environments (Donohue and Irvine 2008). Water management decisions are increasingly based on model studies (Scholten et al. 2007), while modeling tools are becoming progressively more sophisticated (McKnight et al. 2010).

Modeling of water quality parameters also has many benefits, the most important of which are explained here. The value of data modeling is demonstrated by the overall savings that it makes in terms of maintenance or development costs. When this issue is considered in broad terms across an organization's total budget, the amount of saving can be truly significant. The value of data modeling can also be seen at a more detailed level in terms of the savings it provides through development tasks related to a specific project. Additionally, its value can be determined by identifying specific benefits that data modeling provides and then quantifying those benefits in all projects. Finally, data models can be reused in whole or in part for multiple projects, which can result in significant savings for any organization (Haughey 2010). Models can also be used to regenerate missing data (Diamantopoulou et al. 2005).


It is better to make predictor models for the most important parameters, which in this case include:

1. Dissolved oxygen (DO), which is an important quality index for some types of water. However, it is difficult to simulate the DO concentration by traditional mathematical methods due to the different factors which affect different kinds of water (Lihua et al. 2008);
2. Biological oxygen demand (BOD), which is a major parameter used to determine the degree of pollution in effluents (Akilandeswari and Adline 2013);
3. Nitrate and ammonium ions, whose concentration in surface and ground water is extremely important to estimate. These values are usually determined at laboratories using sophisticated equipment, with a turnaround time varying from 2 h to 3 days. In many instances, the results are needed at the site as quickly as possible (Rich et al. 2006);
4. Alkalinity, which is an important factor in determining a stream's ability to neutralize acidic pollution resulting from rainfall or wastewater. It is also one of the best measures of the sensitivity of the stream to acidic inputs (US EPA 2013);
5. Hardness, which is also an important factor in aquaculture. Calcium and magnesium are the most common sources of water hardness (Wurts 2002).

In line with the above facts, DO, BOD, hardness and alkalinity are chosen as targets of our modeling study. For complete coverage, the additional parameter of chloride (Cl) concentration is added to the list above. Chloride ion is the predominant natural form of chlorine and is extremely soluble in water. Major sources of chloride in natural waters are sedimentary rocks, particularly evaporites. Igneous rocks contribute only a fraction of total chloride. Other sources are industrial and domestic wastewater (Pradhan and Pirasteh 2011). This study was conducted as a part of a master thesis research at KNTU, Environmental Eng. Dept. (2014).

Methods and materials

It was decided to make a simulation model with seven parameters. Relations between different physical properties of these parameters represent one of the most fascinating problems in modern science. Since this is a fundamental scientific problem, answering it will also provide answers to other practical needs, especially when one property is easier to measure than another (Sevostianov and Shrestha 2010). Next, a number of related parameters must be chosen as inputs for these models. It is better to choose input parameters that are the most economical and the easiest to assay. Temperature and pH are two of the easiest and least costly (almost without any cost) parameters to measure. Electrical conductivity (EC) is another parameter that can be measured quite easily and at almost no cost. It also has a good relationship with the output parameters. Many researchers have used EC as input data, for example Sevostianov and Shrestha (2010) and many others. The following table shows the names of a number of these researchers.

After finding the most appropriate inputs and outputs, it was decided to use artificial neural networks (ANN) to build the necessary models. Applications of ANN are increasing because they are capable of solving and modeling many kinds of complicated problems. They are also able to complete and cover all kinds of required parameter data series. Application of ANN is also increasing in resolving optimization problems (Koncsos 2010).

Neural networks

The artificial neural network (ANN), as its name implies, is a technique that simulates the functions of the human brain during the problem-solving process. Just as humans apply knowledge gained from experience to new problems or situations, the structure of a neural network can be used in powerful computation of complex nonlinear relationships (Kuo et al. 2004). MLF (multilayer feed-forward) networks trained with the back-propagation algorithm are the most popular type of networks (Svozil et al. 1997 and Koncsos 2010). For example, the following table contains information on some modeling papers. These papers used feed-forward networks to make their own models. A three-layer MLF network is also shown in Fig. 2.

To understand how a network processes data, the structure of a neuron (which is the smallest and most basic element of any type of ANN) should be known first. Figure 1 depicts a simple neuron.

Fig. 1 Simple neuron

In this schema, p is the matrix of input data multiplied by w, which is the matrix of weights. The calculated data (matrix) are summed with b (the threshold coefficient), which can be understood as a weight coefficient for the connection with a formally added neuron where a = 1 (so-called bias) (Svozil et al. 1997). The result of the sum will be a matrix called n. The output of the neuron will be the


"a" matrix, expressed as (Koncsos 2010; Svozil et al. 1997):

$$a = f(n) = f(b + wp) \qquad (1)$$

A network consists of a number of layers, each of which contains a number of neurons. There are three types of layers (Rounds 2002; Svozil et al. 1997; Menhaj 2008):

1. The input layer, which contains the input data and defines them for the network;
2. The hidden layer(s), where the process is carried out. The number of hidden layers and the number of neurons in each layer are variables;
3. The output layer, which represents the network results (a_i).

Figure 2 shows a four-layer feed-forward network. The number of hidden layer(s) and the number of neurons in each layer are chosen by the designer of the network and depend on the conditions of the data and design details.

Training means changing the weights (w_i) and biases (b_i) in order to get closer to the answers. There are two main types of training processes: supervised and unsupervised training. In supervised training [e.g., multilayer feed-forward (MLF) neural network], the neural network knows the desired output, and adjustment of weight coefficients is done in a way that the calculated and desired outputs are as close as possible. In unsupervised training, the desired output is not known (Svozil et al. 1997). This type of network is mostly used for division problems. In the supervised mode, the square of the difference between the network output (a_i) and the data output (t_i) is assumed as the main criterion for estimating the rate of learning (α). Mean-squared error (MSE) is calculated as (Ghaffari et al. 2006; Menhaj 2008):

$$\mathrm{mse} = \frac{1}{m}\sum_{i=1}^{m} e_i^{2} = \frac{1}{m}\sum_{i=1}^{m} (t_i - a_i)^{2} \qquad (2)$$

For the training to be carried out correctly, the process should be repeated until the required precision is reached. In the following procedure, every time the process repeats, the weights and biases change. Figure 3 shows w_ij, b_ij and the transfer function of each layer in a three-layer feed-forward network. The calibration process for the weights and biases is as follows (Svozil et al. 1997; Abraham 2005; Menhaj 2008):

$$w_{i,j}^{(l+1)} = w_{i,j}^{(l)} - \alpha \frac{\partial e(w,b)}{\partial w_{i,j}^{(l)}} \qquad (3)$$

$$b_{i,j}^{(l+1)} = b_{i,j}^{(l)} - \alpha \frac{\partial e(w,b)}{\partial b_{i,j}^{(l)}} \qquad (4)$$

$$\frac{\partial e(w,b)}{\partial w_{i,j}^{(l)}} = \left[\frac{1}{m}\sum_{i=1}^{m} \frac{\partial}{\partial w_{i,j}^{(l)}}\, e\left(w, b, x^{(i)}, y^{(i)}\right)\right] + \alpha\, w_{i,j}^{(l)} \qquad (5)$$

$$\frac{\partial e(w,b)}{\partial b_{i,j}^{(l)}} = \left[\frac{1}{m}\sum_{i=1}^{m} \frac{\partial}{\partial b_{i,j}^{(l)}}\, e\left(w, b, x^{(i)}, y^{(i)}\right)\right] + \alpha\, b_{i,j}^{(l)} \qquad (6)$$
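The neuron output of Eq. (1), the error of Eq. (2) and the update rule of Eqs. (3)–(6) can be illustrated for a single neuron with a hyperbolic tangent transfer function. This is only a minimal sketch under stated assumptions: it uses plain gradient descent rather than the Levenberg–Marquardt procedure used for the models in this study, the function names are hypothetical, and the coefficient of the added term in Eqs. (5)–(6) is exposed as a separate `decay` argument so that it can be set independently of the learning rate `alpha`.

```python
import math

def neuron_output(w, p, b):
    """Eq. (1): a = f(b + w*p), with f taken here as tanh."""
    n = b + sum(wj * pj for wj, pj in zip(w, p))
    return math.tanh(n)

def mse(targets, outputs):
    """Eq. (2): mean of the squared errors (t_i - a_i)^2."""
    m = len(targets)
    return sum((t - a) ** 2 for t, a in zip(targets, outputs)) / m

def gradient_step(w, b, inputs, targets, alpha, decay):
    """One update of a single tanh neuron in the spirit of Eqs. (3)-(6).

    alpha is the learning rate; decay is an assumed, separate coefficient
    for the term that Eqs. (5)-(6) add to the averaged gradient.
    """
    m = len(inputs)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for p, t in zip(inputs, targets):
        a = neuron_output(w, p, b)
        # Derivative of (t - a)^2 with respect to n, where a = tanh(n)
        delta = -2.0 * (t - a) * (1.0 - a * a)
        for j, pj in enumerate(p):
            grad_w[j] += delta * pj / m
        grad_b += delta / m
    # Eqs. (5)-(6): averaged gradient plus a term proportional to the parameter
    grad_w = [g + decay * wj for g, wj in zip(grad_w, w)]
    grad_b = grad_b + decay * b
    # Eqs. (3)-(4): move each parameter against its gradient
    new_w = [wj - alpha * g for wj, g in zip(w, grad_w)]
    new_b = b - alpha * grad_b
    return new_w, new_b

# Hypothetical toy data generated from a known neuron (not river data)
inputs = [[0.0], [0.5], [1.0], [-1.0]]
targets = [math.tanh(0.8 * p[0] + 0.1) for p in inputs]

w, b = [0.0], 0.0
error_before = mse(targets, [neuron_output(w, p, b) for p in inputs])
for _ in range(200):
    w, b = gradient_step(w, b, inputs, targets, alpha=0.1, decay=0.0)
error_after = mse(targets, [neuron_output(w, p, b) for p in inputs])
```

Repeating the step lowers the error on the toy data, which mirrors the statement above that the process is repeated until the required precision is reached.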

Fig. 2 Feed-forward network with five layers

Fig. 3 Three-layer feed-forward network

As shown in Table 1, most of the works done in the field of modeling assumed a number of data as inputs. Some clues about choosing the input data (in relation to choosing the least expensive and the most accessible parameters as


Table 1 Information on some modeling studies

References | Input(s) | Output(s)
Patki et al. (2013) | Alkalinity, hardness, TS and MPN | WQI
Rak (2013) | Physicochemical parameters of interim water | Turbidity
Chitsazan et al. (2013) | Rain data, mean monthly temperature, relative humidity, discharge of irrigation canal, groundwater recharge from the plain boundary | Groundwater depth
Nejadkoorki and Baroutian (2011) | Meteorological and gaseous pollutants from different air quality monitoring stations | Maximum PM10 concentration
Gustavo Andres Cuesta Cordoba Ing (2011) | Temperature, pH, flow, pipe material, diameter, and age of pipes | Free chlorine
Zhang et al. (2010) | Temperature, BOD, NH3–N, COD | DO
Panda Rabindra et al. (2010) | Water-level date and time | Water-level data
Anctil et al. (2009) | 12 parameters, such as Q, P and F | Daily nitrate-nitrogen and suspended sediment fluxes
Kim and Gilley (2008) | Runoff, electrical conductivity (EC) and pH | DP and NH4–N
Jalili Ghazi Zade and Noori (2008) | Weekly amount of solid waste | Generated solid waste
Nadiri (2007) | Groundwater level | Groundwater level
Diamantopoulou et al. (2005) | Temperature, flow, EC, HCO3, SO4, Na, Cl, Ca and DO | Nitrate
Rounds (2002) | Air temperature, solar radiation, rainfall and stream flow | DO

inputs) are already mentioned. Useful tips about the structure of the network's training algorithm are provided as follows:

1. The selected input parameter should be related to the output (including the result and the goal of modeling). A closer relationship between the input and output data helps to minimize the error. The reverse is also true: if parameters that are not related to the output are chosen as inputs, they disturb the training process and do not help the modeling.
2. Using a higher number of related data as input makes the training longer and requires more neurons. However, it can also make the model more accurate.

Table 2 shows the parameters that we have chosen as inputs for each model. The MATLAB program can be used to do the calculations (Chitsazan et al. 2013; Chu et al. 2013). All needed formulas are incorporated in this program. At first, one must define the design parameters of the network. It should be noted that excessive use of hidden neurons will cause over-fitting, which means that the neural network may overestimate the complexity of the target problem. It also greatly degrades the generalization capability of the model, which can lead to significant deviation in predictions. For this reason, determining the proper number of hidden neurons to prevent over-fitting is critical in function approximation using neural networks. There are various approaches to building the network in a constructive or destructive way, but the most common methods to determine whether a certain number of hidden neurons is optimal are cross-validation and early stopping (Setiono 2001). To get the best result, the number of layers and neurons should be chosen in proportion to the complexity of the problem and the number of input and output parameters (Abraham 2005).

Table 2 Units and limits of input data

W.Q. parameters | Unit | Limits
T | °C | ≤3
pH | – | 6.5–8.75
EC | µS/cm at 25 °C | 50–900

Data resources

The river time series data used in this study were received from 210 monitoring stations all over Ireland. These data are available on the website of the Ireland EPA (www.Water Quality Environmental Protection Agency, Ireland.htm under the name of Raw River Data for SE 2012).1 The data chosen for our work came from 3001 groups of the existing data set, while each data set contains 10 parameters. Fig. 4 shows data such as T, pH, EC, DO, BOD, Cl, nitrate, alkalinity and TDH that are used in the current study.

1 www.water.epa.gov/type/rsl/monitoring/vms510.cfm.
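Restricting a raw data array to the limits of Table 2 before training can be sketched as follows. This is an illustration only, not the procedure used by the authors: the record layout, the function names and the sample values are hypothetical, and only the pH and EC ranges from Table 2 are applied.

```python
def within_limits(record, limits):
    """Return True when every limited parameter of `record` lies inside its range."""
    for name, (low, high) in limits.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True

def filter_records(records, limits):
    """Keep only the records that satisfy all limits."""
    return [r for r in records if within_limits(r, limits)]

# pH and EC ranges as listed in Table 2 (EC in µS/cm at 25 °C)
INPUT_LIMITS = {"pH": (6.5, 8.75), "EC": (50.0, 900.0)}

# Hypothetical sample records (not taken from the Irish data set)
samples = [
    {"pH": 7.4, "EC": 310.0},
    {"pH": 9.1, "EC": 280.0},    # pH above the upper limit
    {"pH": 7.0, "EC": 1200.0},   # EC above the upper limit
    {"pH": 7.8},                 # EC missing, so the record is dropped
]
kept = filter_records(samples, INPUT_LIMITS)
```

Records with a missing value are dropped here by design; a different treatment of gaps would need its own rule.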


Fig. 4 T, pH, EC, DO%, BOD, N, Cl, TDS and alkalinity data used in the study


Results and discussion

As mentioned before, the most available parameters are chosen as the input data, and they are shown in Table 2. To increase the precision of our models, the limits of each data series should be observed. For example, for nitrate, the existing data lie between 0 and 0.17 mg/l, but only 70 items of data are between 0.04 and 0.17 mg/l; hence, they are removed to prevent their misleading effect on the training process, since they could otherwise reduce the precision of the model. Therefore, not all of the existing data are used for each model, and different limits are applied as shown in Table 2.

In this study, seven models are developed, each of which estimates one of the water quality parameters. All models are trained by the feed-forward back-propagation algorithm. The learning function for all models is the Levenberg–Marquardt (LM) algorithm, and the transfer function for all models is the tansig(x) function. The number of layers and neurons (in each layer) is shown in Table 3. To verify the current model(s), all data which were taken into account for training and developing the model(s) are used. Furthermore, in order to confirm the precision of each model, three criteria (MSE, MAE and R factor) are evaluated. They are defined as follows:

$$E_i = D_i - M_i \qquad (7)$$

where D_i is the i-th real data value and M_i is the i-th estimated data value.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} E_i^{2} \qquad (8)$$

Table 3 Network properties

W.Q. parameters | Unit | Limits | Used data (n) | Layers | Neurons in layers | MSE | MAE | R
DO | mg/l | 7–15 | 2930 | 5 | 2-4-1-1 | 0.92 | 0.71 | 0.93
DO percentage | % | 70–130 | 2917 | 4 | 32-32-16 | 47.56 | 5.15 | 0.95
BOD | mg/l | 1–3 | 1455 | 3 | 32-25 | 0.22 | 0.37 | 0.77
Cl | mg/l | 5–40 | 2924 | 4 | 8-32-8 | 19.95 | 3.09 | 0.82
Alkalinity | mg/l | 20–350 | 2768 | 4 | 16-32-32 | 542.9 | 15.4 | 0.85
Total hardness | mg/l | 20–400 | 2819 | 4 | 32-32-16 | 363.3 | 11.16 | 0.92
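The three criteria reported in the MSE, MAE and R columns of Table 3 follow Eqs. (7)–(10) and can be computed from paired real and estimated values. The sketch below is an illustration with hypothetical function and variable names; it takes the R factor with the signed ratio E_i/D_i as the equation is printed and assumes that no real value D_i is zero.

```python
def evaluation_criteria(real, estimated):
    """Return (MSE, MAE, R) for paired real and estimated values."""
    n = len(real)
    errors = [d - m for d, m in zip(real, estimated)]        # Eq. (7)
    mse = sum(e * e for e in errors) / n                      # Eq. (8)
    mae = sum(abs(e) for e in errors) / n                     # Eq. (9)
    r = 1.0 - sum(e / d for e, d in zip(errors, real)) / n    # Eq. (10), signed ratio
    return mse, mae, r

# Hypothetical values for illustration (not results from this study)
real_values = [10.0, 10.0]
model_values = [9.0, 11.0]
mse_value, mae_value, r_value = evaluation_criteria(real_values, model_values)
```

In this toy case the two errors have opposite signs, so the signed ratios in Eq. (10) cancel and R equals 1 even though MSE and MAE are both 1; reading the three criteria together is therefore more informative than reading R alone.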

Fig. 5 Package results with pilot data comparison; solid line represents data value, and dash line represents model prediction value


$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|E_i\right| \qquad (9)$$

$$R = 1 - \frac{1}{n}\sum_{i=1}^{n} \frac{E_i}{D_i} \qquad (10)$$

The number of data used for modeling each water quality parameter and the corresponding limits are shown in Table 3. The only exception is that the developed model for DO concentration only uses T as the input data. As Table 3 shows, the R factor of these models ranges from 0.77 (BOD) to 0.95 (DO percentage), which indicates that the models are reliable. Moreover, because a data set from 210 monitoring stations is used for development of the current models, these models can be adapted to various conditions and used under different conditions too. As noted, the three parameters T, pH and EC are used as input data because they can be easily measured. Therefore, these models are very practical and can be used in any field related to surface water quality assessment. Each model is tested with 50 randomly selected data sets to show the precision of the package, and the results are shown in Fig. 5. The simulation results (Fig. 5) show reliable and high correlation for the proposed ANN models.

Conclusion

It is known that ANN models can be used in many practical and scientific subjects. In this work, the focus of attention is on the ANN model and its ability to simulate river water quality data. The results show useful applications of ANN modeling as: a reliable replacement for the salinity test; control of equipment and operators; a proper tool for estimating missing data; calibration of measurement tools; prediction of quality data; sensitivity analyses on the data generated by the model for scientific applications; and use under conditions with experimental difficulties.

Acknowledgments The authors are grateful to Dr Sohrab Soori for editorial and revision assistance. They are also thankful to "Water Quality Environmental Protection Agency, Ireland," for providing data sets.

References

Abraham A (2005) Artificial neural networks. Oklahoma State University, Stillwater, pp 901–908
Akilandeswari S, Adline MH (2013) Prediction of BOD values in engineering work industrial effluent by Anfis modeling. Int J Res Pure Appl Phys 3(2):7–9
Anctil F, Filion M, Tournebize J (2009) A neural network experiment on the simulation of daily nitrate-nitrogen and suspended sediment fluxes from a small agricultural catchment. Ecol Model 220:879–887
Chitsazan M, Rahmani R, Neyamadpour A (2013) Groundwater level simulation using artificial neural network: a case study from Aghili plain, urban area of Gotvand, south-west Iran. JGeope 3(1):35–46
Chu HB, Lu WX, Zhang L (2013) Application of artificial neural network in environmental water quality assessment. J Agric Sci Technol 15:343–356
Diamantopoulou MJ, Antonopoulos VZ, Papamichail DM (2005) The use of a neural network technique for the prediction of water quality parameters of Axios River in Northern Greece. Eur Water 11(12):55–62
Donohue I, Irvine K (2008) Quantifying variability within water samples: the need for adequate subsampling. Water Res 42:476–482
Ghaffari A, Abdollahi H, Khoshayand MR, Bozchalooi IS, Dadgar A, Rafiee-Tehrani M (2006) Performance comparison of neural network training algorithms in modeling of bimodal drug delivery. Int J Pharm 327:126–138
Gustavo Andres Cuesta Cordoba Ing (2011) Using of artificial neural network for evaluation and prediction of some drinking water quality parameters within a water distribution system. Water management and water structures, Juniorstav, pp 1–11
Haughey I (2010) The return on investment (ROI) of data modeling. CA, Erwin, March, pp 1–18
Jalili Ghazi Zade M, Noori R (2008) Prediction of municipal solid waste generation by use of artificial neural network: a case study of Mashhad. Environ Res 2(1):13–22
Kim M, Gilley JE (2008) Artificial Neural Network estimation of soil erosion and nutrient concentrations in runoff from land application areas. Comput Electron Agric 64:268–275
Koncsos T (2010) The application of neural networks for solving complex optimization problems in modeling. In: Conference of Junior Researchers in Civil Engineering pp 97–102
Kuo YM, Liu CW, Lin KH (2004) Evaluation of the ability of an artificial neural network model to assess the variation of groundwater quality in an area of black foot disease in Taiwan. Water Res 38:148–158
Lihua C, Shengquan M, Li LI (2008) A model to evaluate do of river based on artificial neural network and style book. J Hainan Normal Univ Nat Sci 21(4):372–376
McKnight S, Funder SG, Rasmussen JJ, Finkel M, Binning PJ, Bjerg PL (2010) An integrated model for assessing the risk of TCE groundwater contamination to human receptors and surface water ecosystems. Ecol Eng 36:1126–1137
Menhaj MB (2008) Fundamental of neural network, vol 1. Industrial Amir Kabir University, Tehran
Nadiri A (2007) Predicting groundwater level surrounding Tabriz city. Msd. Thesis, Tabriz University
Nejadkoorki F, Baroutian S (2011) Forecasting extreme PM10 concentrations using artificial neural networks. J Environ Res 6(1):277–284
Panda Rabindra K, Pramanik N, Bala B (2010) Simulation of river stage using artificial neural network and MIKE 11 hydrodynamic model. Comput Geosci 36:735–745
Patki VK, Shirihari S, Manu B (2013) Water quality prediction in distribution system using Cascade feed forward neural network. Int J Adv Technol Civil Eng, ISSN: 2231–5721, 2(1):84–91
Pradhan B, Pirasteh S (2011) Hydro-chemical analysis of the ground water of the basaltic catchments: upper bhatsai region, Maharashtra. Open Hydrol J 5:51–57
Rak A (2013) Water turbidity modelling during water treatment processes using artificial neural networks. Int J Water Sci 2(3):1–10


Rich D, Washo BD, Paladini A (2006) Rapid field test for nitrate and ammonia in reclaimed water. Everglades Res Educ Center 2:2006
Rounds SA (2002) Development of a neural network model for dissolved oxygen in the Tualatin River, Oregon. In: Second Federal Interagency hydrologic modeling conference, Las Vegas, Nevada, July 29–August 1, pp 1–13
Schleiter IM, Borchardt D, Wagner R, Dapper T, Schmidt KD, Schmidt HH, Werner H (1999) Modeling water quality, bioindication and population dynamics in lotic ecosystems using neural networks. Ecol Model 120:271–286
Scholten H, Kassahun A, Refsgaard JC, Kargas T, Gavardinas C, Beulens AJM (2007) A methodology to support multidisciplinary model-based water management. Environ Model Softw 22:743–759
Setiono R (2001) Feed-forward neural network construction using cross validation. Neural Comput 13(12):2865–2877
Sevostianov I, Shrestha M (2010) Cross-property connections between overall electric conductivity and fluid permeability of a random porous media with conducting skeleton. Int J Eng Sci 48:1702–1708
Stockholm International Water Institute and Elsevier (2012) The water and food nexus: trends and development of the research landscape
Svozil D, Kvasnička V, Pospichal J (1997) Introduction to multi-layer feed-forward neural networks. Chemometr Intell Lab Syst 39:43–62
United States Environment Protection Agency (2013) Total Alkalinity. Retrieved 6 Mar 2013
Varnell LM, Evans DA, Bilkovic DM, Olney JE (2008) Estuarine surface water allocation: a case study on the interactive role of science in support of management. Environ Sci Policy 11:602–612
Wurts WA (2002) Alkalinity and hardness in production ponds. World Aquac 33:16–17
Zhang Z, Wang X, Ou Y (2010) Water simulation method based on BPNN response and analytic geometry. Proc Environ Sci 2:446–453
