where
i, j = indices,
n = number of containers.
under certain conditions. Some of the many Nagios features include [29]:
- Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP) and of host resources (processor load, disk and memory usage, running processes, log files, etc.), as well as monitoring of environmental factors such as temperature;
- A simple plug-in design that allows users to easily develop their own host and service checks, and the ability to define a network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable;
- Contact notifications when service or host problems occur and get resolved (via email or another user-defined method);
- A Web interface for viewing the current network status, notification and problem history, etc.
To implement this application, we have used over five hundred locations. Figures 6 and 7 present the locations monitored for this large enterprise; the beneficiary's name has been omitted for advertising reasons. We have installed Nagios
i, j, and k are operators for the three imaginary parts. The conjugate of a quaternion number, $q^*$, is defined as

$q^* = q_a - i\,q_b - j\,q_c - k\,q_d$.   (2)

A more detailed description of quaternion operations can be found in [25].
Table I shows the proposed mapping method between the
four characters and the corresponding quaternion numbers.
Note that all the real parts are zero, while the imaginary parts
can be considered as the coordinates of four vertices of a
regular tetrahedron in 3-D space.
That is, the 3-D coordinates (x, y, z) of the vertices of a regular tetrahedron are assigned to the imaginary parts b, c, and d of the quaternion numbers, and the real part a is set to zero. Then, a character sequence is mapped to a quaternion number sequence. For example, a character sequence $s_c[n] = \{A, A, T, A, G, C, G, T\}$ is mapped to the quaternion number sequence $s_q[n] = \{q_A, q_A, q_T, q_A, q_G, q_C, q_G, q_T\}$.
TABLE I: THE MAPPING TABLE FROM A, C, T, AND G TO CORRESPONDING
QUATERNION NUMBERS.
After the mapping procedure described above, a quaternion number sequence representing a DNA sequence is obtained:

$s_q[n] = \{q_1, q_2, q_3, \ldots, q_n\}$.   (3)
Then, an accumulating process is applied to obtain another quaternion number series

$\bar{s}_q[n] = \{\bar{q}_1, \bar{q}_2, \bar{q}_3, \ldots, \bar{q}_n\}$,   (4)

where

$\bar{q}_l = \sum_{n=1}^{l} q_n$.   (5)
By extracting the imaginary parts of the series $\bar{s}_q[n]$, the 3-D coordinates for each accumulated quaternion number in the series can be obtained. Line segments are used to connect these points in order, and a 3-D trajectory can thus be obtained for DNA sequence visualization [26]-[28].
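The mapping and accumulation steps of Eqs. (2)-(5) are straightforward to prototype. The sketch below is a minimal Python illustration; since the exact Table I entries are not legible in this copy, it assumes one plausible assignment in which the imaginary parts (b, c, d) are the vertices of a regular tetrahedron inscribed in the unit sphere (a choice that reproduces the +1 and -1/3 correlation values used later).

```python
import numpy as np

# One plausible nucleotide-to-quaternion assignment (the exact Table I
# values are not legible in this copy): real part a = 0, imaginary parts
# (b, c, d) = vertices of a regular tetrahedron on the unit sphere.
S3 = 1.0 / np.sqrt(3.0)
MAP = {
    'A': np.array([0.0,  S3,  S3,  S3]),   # stored as (a, b, c, d)
    'T': np.array([0.0,  S3, -S3, -S3]),
    'C': np.array([0.0, -S3,  S3, -S3]),
    'G': np.array([0.0, -S3, -S3,  S3]),
}

def to_quaternions(seq):
    """Map a character sequence to a quaternion sequence (Eq. (3))."""
    return np.array([MAP[ch] for ch in seq])

def trajectory(seq):
    """Accumulate the quaternion sequence (Eqs. (4)-(5)) and return the
    3-D coordinates (imaginary parts) of the running sums."""
    q = to_quaternions(seq)
    return np.cumsum(q, axis=0)[:, 1:4]   # drop the (zero) real part

if __name__ == '__main__':
    for p in trajectory("AATAGCGT"):      # successive 3-D trajectory vertices
        print(np.round(p, 3))
```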
B. Quaternion Correlation
Once the quaternion number sequences of two DNA sequences have been obtained, the cross-correlation operation is performed for the global comparison. The cross-correlation of two quaternion number sequences $s_{q_1}[n]$ and $s_{q_2}[n]$ of lengths M and N, respectively, is defined as

$r_{1,2}[\tau] = \sum_{n} s_{q_1}[n]\, s_{q_2}^{*}[n+\tau]$,   (6)

where $\tau$ is the index of the correlation function and $0 \le \tau \le N+M-1$.
In the correlation operation, the conjugate operation is applied to one of the two quaternion numbers being multiplied. Therefore, the cross-correlation of two identical symbols contributes +1 to the result, while that of two different symbols contributes -1/3. If two sequences of lengths N and M (N > M) are correlated, the real part of the correlation result, $v_{\mathrm{Re}}$, for a z bp overlap ($0 < z \le M$) is

$v_{\mathrm{Re}} = p - (z-p)/3$,   (7)

where p is the matching count and z - p is the mismatching count. For example, an overlap of z = 8 with p = 6 matches gives $v_{\mathrm{Re}} = 6 - 2/3 = 16/3$.
Equation (6) shows that the cross-correlation result for a specific $\tau$ value is the sum of the products of the original sequence and the shifted, conjugated sequence. The products at a given n are non-zero when the two sequences overlap and zero when they do not.
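As a concrete illustration of Eqs. (2), (6) and (7), the following sketch evaluates the quaternion cross-correlation directly from the Hamilton product. The nucleotide-to-quaternion map is the same hypothetical regular-tetrahedron assignment used in the earlier sketch (the exact Table I values are not legible here).

```python
import numpy as np

S3 = 1.0 / np.sqrt(3.0)
MAP = {'A': np.array([0.0,  S3,  S3,  S3]), 'T': np.array([0.0,  S3, -S3, -S3]),
       'C': np.array([0.0, -S3,  S3, -S3]), 'G': np.array([0.0, -S3, -S3,  S3])}

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def qconj(q):
    """Quaternion conjugate (Eq. (2))."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def qcorr(seq1, seq2):
    """Quaternion cross-correlation (Eq. (6)) over all N+M-1 shifts.
    The real part of each product is +1 for a match and -1/3 for a
    mismatch, so the real part of r[tau] equals p - (z-p)/3 (Eq. (7))."""
    s1 = [MAP[ch] for ch in seq1]
    s2 = [MAP[ch] for ch in seq2]
    M, N = len(s1), len(s2)
    r = np.zeros((M + N - 1, 4))
    for tau in range(M + N - 1):
        shift = tau - (N - 1)             # relative shift of s2 against s1
        for n in range(M):
            if 0 <= n - shift < N:        # overlapping samples only
                r[tau] += qmul(s1[n], qconj(s2[n - shift]))
    return r

if __name__ == '__main__':
    r = qcorr("AATAGCGT", "TAGC")
    print(np.round(r[:, 0], 2))           # real parts: peak at best alignment
```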
When the two sequences overlap, the product of two quaternion numbers reflects whether they are the same or not. That is, during the cross-correlation operation, the local alignment proceeds under a given $\tau$:

$r_{1,2}[\tau] = \sum_{n=m}^{m+z-1} s_{q_1}[n]\, s_{q_2}^{*}[n+\tau], \quad \text{for } z \ge 1$,   (8)

where m denotes the starting position of the overlap region under the given $\tau$ value. Since there are (N+M-1) possible $\tau$ values in the cross-correlation result, all the possible local alignments between two sequences can be obtained. Therefore,
the local alignment results can be shown in a 2-D array of size (N+M-1) x (N+M-1), in which each entry denotes the matching status of two nucleotides from the two DNA sequences. A grayscale image $f_{\mathrm{Re}}(x, y)$ of size (N+M-1) x (N+M-1) corresponding to this 2-D array can be generated based on the following rule:

$f_{\mathrm{Re}}(x,y) = \begin{cases} 0, & \text{if } \mathrm{Re}\{s_{q_1}[n]\, s_{q_2}^{*}[n+\tau]\} = 1 \quad \text{(overlap \& match)} \\ 255, & \text{if } \mathrm{Re}\{s_{q_1}[n]\, s_{q_2}^{*}[n+\tau]\} = -\tfrac{1}{3} \quad \text{(overlap but mismatch)} \\ 128, & \text{if } \mathrm{Re}\{s_{q_1}[n]\, s_{q_2}^{*}[n+\tau]\} = 0 \quad \text{(non-overlap)} \end{cases}$   (9)

where the indices n and $\tau$ correspond to the pixel coordinates (x, y).
Here Re{.} denotes the real part of the hypercomplex value in the braces and 0 < x, y <= N+M-1. If connected pixels in the horizontal direction constitute a black line in the image, a local match between the two sequences exists. Therefore, the local matching information between two DNA sequences can be obtained from the quaternion correlation result in addition to the global matching information.
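The image-generation rule of Eq. (9) can be prototyped without explicit quaternion arithmetic, because under the tetrahedral mapping $\mathrm{Re}\{s_{q_1}[n]\, s_{q_2}^{*}[n+\tau]\}$ is +1 exactly when the two symbols are equal and -1/3 otherwise. The sketch below exploits this equivalence; the row/column layout (shift versus position) is an assumption about how the 2-D array is arranged, since the text does not spell it out.

```python
import numpy as np

def alignment_image(seq1, seq2):
    """Grayscale local-alignment pattern following the rule of Eq. (9):
    0 (black) where the overlapped nucleotides match, 255 (white) where
    they mismatch, 128 (gray) where the shifted sequences do not overlap.
    Rows correspond to the shift tau, columns to the position n."""
    M, N = len(seq1), len(seq2)
    size = M + N - 1
    img = np.full((size, size), 128, dtype=np.uint8)
    for tau in range(size):
        shift = tau - (N - 1)
        for n in range(M):
            m = n - shift
            if 0 <= m < N:
                img[tau, n] = 0 if seq1[n] == seq2[m] else 255
    return img

if __name__ == '__main__':
    img = alignment_image("AATAGCGT", "TAGC")
    print(img)   # look for horizontal runs of 0 marking local matches
```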
TABLE II. THE IMAGINARY PARTS OF THE MULTIPLIED VALUES OF EVERY TWO DIFFERENT QUATERNION NUMBERS.

[The table gives the i, j, and k components for the products A x T, A x C, A x G, T x C, T x G, C x G and T x A, C x A, G x A, C x T, G x T, G x C; the numeric entries are illegible in this copy.]
C. Mismatching Analysis
The real part of the correlation result reflects the matching information, while the imaginary parts reflect the mismatching information in sequence comparison. Therefore, the matching and mismatching counts can be investigated from the real and imaginary parts of the correlation result.
Let $q_3$ and $q_4$ denote the two products of the quaternion numbers $q_1$ and $q_2$; that is, $q_3 = q_1 q_2$ and $q_4 = q_2 q_1$, where $q_n = q_{a_n} + i\,q_{b_n} + j\,q_{c_n} + k\,q_{d_n}$, $n = 1, 2, 3, 4$. According to the rules of quaternion multiplication, the following results can be obtained:
$q_{a_3} = -(q_{b_1} q_{b_2} + q_{c_1} q_{c_2} + q_{d_1} q_{d_2})$,   (10)

$q_{b_3} = q_{c_1} q_{d_2} - q_{d_1} q_{c_2}$,   (11)

$q_{c_3} = q_{d_1} q_{b_2} - q_{b_1} q_{d_2}$,   (12)

$q_{d_3} = q_{b_1} q_{c_2} - q_{c_1} q_{b_2}$,   (13)

and

$q_{a_4} = q_{a_3}, \quad q_{b_4} = -q_{b_3}, \quad q_{c_4} = -q_{c_3}, \quad q_{d_4} = -q_{d_3}$.   (14)
Figure 1. The colors corresponding to the different combinations of the multiplied quaternion numbers.
In addition to the real part of the calculated quaternion number, the remaining imaginary parts can provide further mismatching information between the two sequences. A color image in which the R, G, and B components correspond to the three imaginary parts i, j, and k of the quaternion number can be generated.

First, the possible values of the three imaginary parts are determined and listed in Table II. Second, for each component, the finite values are normalized into monotone pixel values between 0 and 255. There are six, five, and three possible values for the i, j, and k components, respectively.
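The normalization step can be sketched as follows; the i-component values listed below are placeholders standing in for the illegible Table II entries, not the paper's actual values.

```python
import numpy as np

def normalize_component(values, possible):
    """Map each entry of `values` to a monotone pixel value in 0..255,
    based on its rank among the finite set of possible values."""
    levels = np.sort(np.unique(possible))
    pixels = np.linspace(0, 255, len(levels)).astype(np.uint8)
    lut = {v: p for v, p in zip(levels, pixels)}
    return np.vectorize(lambda v: lut[v])(values)

# hypothetical i-component values (Table II is illegible in this copy)
possible_i = [-2/3, -1/3, 0.0, 1/3, 2/3, 1.0]
print(normalize_component(np.array([0.0, 2/3, -2/3]), possible_i))
```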
$r_{1,2}[\tau] = \sum_{l=0}^{n-1} x_1[l]\, x_2^{*}[l+\tau] = \mathrm{FT}^{-1}\{X_1[k]\, X_2^{*}[k]\}$,   (15)

where $X_1[k] = \mathrm{FT}\{x_1[n]\}$ and $X_2[k] = \mathrm{FT}\{x_2[n]\}$ are the FTs of the two sequences $x_1[n]$ and $x_2[n]$, respectively. In Eq. (15), the time complexity depends on the FT. Because the time complexity of the FT is $O(N^2)$ and the time complexity of the multiplication is $O(N)$, the time complexity of the correlation operation is $O(N^2)$. With the fast FT algorithm, the time complexity can be improved to $O(N \log_2 N)$.
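The speed-up described above can be demonstrated with ordinary complex sequences (the quaternion case additionally needs the QFT machinery of Eq. (16)). A minimal NumPy sketch of FFT-based correlation, exploiting the identity of Eq. (15) that the FT of the correlation is X1 times the conjugate of X2, checked against direct correlation:

```python
import numpy as np

def xcorr_fft(x1, x2):
    """Linear cross-correlation r[tau] = sum_n x1[n+tau] * conj(x2[n]),
    tau = -(N-1)..(M-1), computed in O(N log N) with zero-padded FFTs."""
    M, N = len(x1), len(x2)
    nfft = 1 << (M + N - 2).bit_length()      # power of two >= M+N-1
    r = np.fft.ifft(np.fft.fft(x1, nfft) * np.conj(np.fft.fft(x2, nfft)))
    neg = r[nfft - (N - 1):] if N > 1 else r[:0]   # negative lags
    return np.concatenate((neg, r[:M]))            # lags -(N-1)..M-1

if __name__ == '__main__':
    a = np.array([1.0, 2.0, 3.0, 4.0])
    b = np.array([3.0, 4.0])
    # reference O(MN) direct correlation
    print(np.allclose(np.real(xcorr_fft(a, b)), np.correlate(a, b, 'full')))
```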
Let $\mathrm{FT}_Q$ and $\mathrm{FT}_Q^{-1}$ denote the quaternion Fourier transform (QFT) and the inverse QFT, respectively. From the hypercomplex Wiener-Khintchine theorem [23], Eq. (15) becomes

$r_{1,2}[\tau] = \mathrm{FT}_Q^{-1}\{X_1[k]\, X_{2\parallel}^{*}[k] + X_1[-k]\, X_{2\perp}^{*}[k]\}$,   (16)

where $X_1[k] = \mathrm{FT}_Q\{x_1[n]\}$, $X_2[k] = \mathrm{FT}_Q\{x_2[n]\}$, and $X_{2\parallel}[k]$ and $X_{2\perp}[k]$ denote the components of $X_2[k]$ parallel and perpendicular to the axis of the transform.

... = 867, which corresponds to the peak correlation result shown in Fig. 7(b). In addition to the cross-correlation values, which represent the global matching information of two sequences,
the local matching information can be observed and measured in the parallelogram. To demonstrate the capability of the proposed quaternion correlation method, Figs. 7(c), 7(e), and 7(g) show the alignment results when a 70 bp deletion, insertion, and substitution, respectively, occur in one of the two sequences. For the cases of deletion and insertion, the horizontal line breaks into two parts, which are shifted horizontally by 70 bp. For the case of substitution, the part of the line corresponding to the substituted nucleotides disappears. Figs. 7(d), 7(f), and 7(h) show that the corresponding correlation peaks appear accordingly. In addition to the peak values, the local matching positions can also be directly observed from the 2-D pattern. Compared with the 1-D correlation result, the 2-D pattern obviously provides more information on the local matching result.
Figure 7. (a) Local matching results obtained from the cross-correlation operation of two quaternion sequences; (b) corresponding 1-D cross-correlation result of the 2-D pattern shown in (a); (c) 70 bp deletion in one of the sequences; (d) corresponding 1-D cross-correlation result of the 2-D pattern shown in (c); (e) 70 bp insertion in one of the sequences; (f) corresponding 1-D cross-correlation result of the 2-D pattern shown in (e); (g) 70 bp substitution in one of the sequences; (h) corresponding 1-D cross-correlation result of the 2-D pattern shown in (g).
Finally, Fig. 8 shows the 2-D color pattern in which the mismatching information can be directly observed. According to Fig. 8, four kinds of mismatching (AT, TA, CG, and GC) constitute the major part. More information can be observed by examining the detailed parts of this pattern.
Figure 8. The 2-D pattern corresponding to the imaginary parts of quaternion
correlation results of the two sequences for the Human TGFA and Mouse
DHFR genes.
IV. CONCLUSION
In this study, quaternion correlation based on the quaternion number system is proposed for DNA sequence representation and alignment. From the cross-correlation result of quaternion-number sequences, two DNA sequences can be compared in a pair-wise mode. The peak value of the real part of the correlation result corresponds to the globally best-matching position of two similar sequences. On the other hand, the 2-D image obtained from the product terms in the cross-correlation operation can provide more information on the local alignment of two DNA sequences. Deletions or insertions occurring in the sequences can be discriminated by analyzing the correlation results. Moreover, a color 2-D image can also be generated to visualize the mismatching conditions of two DNA sequences. The simulation results show that the proposed method has promising potential in bioinformatics. Future work will focus on extracting more information and relationships between two sequences from the generated 2-D pattern.
ACKNOWLEDGMENT
This work was partly supported by the National Science Council, Taiwan, under contract number NSC 100-2628-E-224-002-MY2.
REFERENCES
[1] C. Zhang and A.K. Wong, "A genetic algorithm for multiple molecular sequence alignment," Comput. Appl. Biosci., vol. 13, pp. 565-581, 1997.
[2] W. Choe, O.K. Ersoy, and M. Bina, "Neural network schemes for detecting rare events in human genomic DNA," Bioinformatics, vol. 16, pp. 1062-1072, 2000.
[3] L. Hirschman, J.C. Park, J. Tsujii, L. Wong, and C.H. Wu, "Accomplishments and challenges in literature data mining for biology," Bioinformatics, vol. 18, pp. 1553-1561, 2002.
[4] P.D. Cristea, "Large scale features in DNA genomic signals," Signal Processing, vol. 83, no. 4, pp. 871-888, 2003.
[5] http://www.expasy.org/links.html
[6] http://www.ensembl.org/index.html
[7] http://www.ncbi.nlm.nih.gov/Entrez/
[8] D. Anastassiou, "Genomic signal processing," Signal Processing Magazine, vol. 18, pp. 8-20, 2001.
[9] J. Astola, E. Dougherty, I. Shmulevich, and I. Tabus, "Genomic signal processing," Signal Processing, vol. 83, no. 4, pp. 691-694, 2003.
[10] A.V. Oppenheim, R.W. Schafer, and J.R. Buck, Discrete-Time Signal Processing, 2nd Edition, Chapter 1, Prentice Hall International, Inc., 1999.
[11] S.-Y. Hsieh, C.-W. Huang, and H.-H. Chou, "A DNA-based graph encoding scheme with its applications to graph isomorphism problems," Applied Mathematics and Computation, vol. 203, no. 2, pp. 502-512, September 2008.
[12] S.-Y. Hsieh and M.-Y. Chen, "A DNA-based solution to the graph isomorphism problem using Adleman-Lipton model with stickers," Applied Mathematics and Computation, vol. 197, no. 2, pp. 672-686, April 2008.
[13] W. Wang and D.H. Johnson, "Computing linear transforms of symbolic signals," IEEE Transactions on Signal Processing, vol. 50, pp. 628-634, 2002.
[14] S. Tiwari, S. Ramachandran, A. Bhattacharya, S. Bhattacharya, and R. Ramaswamy, "Prediction of probable genes by Fourier analysis of genomic sequences," Computer Applications in the Biosciences, vol. 13, pp. 263-270, 1997.
[15] D. Anastassiou, "Frequency-domain analysis of biomolecular sequences," Bioinformatics, vol. 16, no. 12, pp. 1073-1081, 2000.
[16] Y. Magarshak, "Quaternion representation of RNA sequences and tertiary structures," Biosystems, vol. 30, no. 1-3, pp. 21-29, 1993.
[17] T. Bulow and G. Sommer, "Hypercomplex signals - a novel extension of the analytic signal to the multidimensional case," IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2844-2852, 2001.
[18] J.-J. Shu and L.S. Ouw, "Pairwise alignment of the DNA sequence using hypercomplex number representation," Bulletin of Mathematical Biology, vol. 66, no. 5, pp. 1423-1438, 2004.
[19] Y. Li and J.-J. Shu, "Cross-correlation of DNA sequences using hypercomplex number encoding," 2006 International Conference on Biomedical and Pharmaceutical Engineering (ICBPE 2006), pp. 453-458, 11-14 Dec. 2006.
[20] J.-J. Shu and Y. Li, "Hypercomplex cross-correlation of DNA sequences," Journal of Biological Systems, vol. 18, no. 4, pp. 711-725, 2010.
[21] I.L. Kantor and A.S. Solodovnikov, Hypercomplex Numbers: An Elementary Introduction to Algebras, New York: Springer-Verlag, 1989.
[22] S.J. Sangwine and T.A. Ell, "Color image filters based on hypercomplex convolution," IEE Proceedings - Vision, Image and Signal Processing, vol. 147, no. 2, pp. 89-93, 2000.
[23] T.A. Ell and S.J. Sangwine, "Hypercomplex Wiener-Khintchine theorem with application to color image correlation," IEEE International Conference on Image Processing, vol. 2, pp. 792-795, 2000.
[24] K. Ueda and S.-I. Takahashi, "Digital filters with hypercomplex coefficients," 1993 IEEE International Symposium on Circuits and Systems, vol. 1, pp. 479-482, 1993.
[25] http://www.wikipedia.org/
[26] H.T. Chang, N.W. Lo, W.C. Lu, and C.J. Kuo, "Visualization of DNA sequences by use of three-dimensional trajectory," The First Asia-Pacific Bioinformatics Conference, vol. 19, pp. 81-85, Australia, Feb. 2003.
[27] H.T. Chang, "DNA sequence visualization," Chapter 4 in Advanced Data Mining Technologies in Bioinformatics, pp. 63-84, edited by H.-H. Hsu, Idea Group, Inc., 2006.
[28] N.-W. Lo, H.T. Chang, S.W. Xiao, and C.J. Kuo, "Global visualization of DNA sequences by use of three-dimensional trajectories," Journal of Information Science and Engineering, vol. 23, no. 6, pp. 1723-1736, Nov. 2007.
[29] S.-C. Pei, J.J. Ding, and J.H. Chang, "Efficient implementation of quaternion Fourier transform, convolution, and correlation by 2-D complex FFT," IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2783-2797, Nov. 2001.
[30] http://www.ncbi.nlm.nih.gov/
[31] http://www.ncbi.nlm.nih.gov/BLAST/
AUTHORS PROFILE
Hsuan T. Chang received his B.S. degree in Electronic Engineering from National Taiwan University of Science and Technology, Taiwan, in 1991, and his M.S. and Ph.D. degrees in Electrical Engineering (EE) from National Chung Cheng University (NCCU), Taiwan, in 1993 and 1997, respectively. He was a visiting researcher at the Laboratory for Excellence in Optical Data Processing, Department of Electrical and Computer Engineering, Carnegie Mellon University, from 1995 to 1996. Dr. Chang was an assistant professor in the Department of Electronic Engineering at Chien-Kuo University of Technology, Changhua, Taiwan, from 1997 to 1999, an assistant professor in the Department of Information Management, Chaoyang University of Technology, Wufeng, Taiwan, from 1999 to 2001, and an assistant professor and associate professor in the EE Department of National Yunlin University of Science and Technology (YunTech), Douliu, Taiwan, from 2001 to 2002 and 2003 to 2006, respectively. He served as Chairman of the Graduate Institute of Communications Engineering of YunTech from 2008 to 2011. He currently is a full professor in the EE Department of YunTech, Chairman of the Graduate School of Engineering Technology and Science, and Deputy Dean of the College of Engineering, YunTech. He was also an adjunct assistant professor in the Graduate Institute of Communications Engineering of NCCU from 2000 to 2003. Dr. Chang was a visiting scholar in the Institute of Information Science, Academia Sinica, Taiwan, and in the EE Department, University of Washington, Seattle, USA, from 2003/7 to 2003/9 and from 2007/8 to 2008/3, respectively. Dr. Chang's interests include image/video analysis, optical information processing/computing, medical image processing, and human-computer interfaces. He has published more than 180 journal and conference papers in the above research areas. He was the recipient of a visiting research fellowship from Academia Sinica, Taiwan, in 2003, and the excellent research award for young faculty at NYUST in 2005. He also received the Outstanding Paper Award from the 2009 MATLAB/Simulink Tech Forum Call for Papers, Taiwan. He has served as a reviewer for several international journals, served as the conference chair of the 2005 Workshop on Consumer Electronics and Signal Processing held in Taiwan, and has been an invited speaker, session chair, and program committee member at various domestic and international conferences. Dr. Chang is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), a Senior Member of the Optical Society of America (OSA), and a member of the International Society for Optical Engineering (SPIE), International Who's Who (IWW), the Taiwanese Association of Consumer Electronics (TACE), the Asia-Pacific Signal and Information Processing Association (APSIPA), and The Chinese Image Processing and Pattern Recognition (IPPR) Society.
Chung J. Kuo received his BS and MS degrees in Power Mechanical Engineering from National Tsing Hua University, Taiwan, in 1982 and 1984, respectively, and his PhD degree in Electrical Engineering (EE) from Michigan State University (MSU) in 1990. He joined the EE Department of National Chung Cheng University (NCCU) in 1990 as an associate professor and became a full professor in 1996. He was the chairman of the Graduate Institute of Communications Engineering of NCCU between 1999 and 2002. Dr. Kuo was a visiting scientist at the Opto-Electronics & Systems Lab, Industrial Technology Research Institute, in 1991 and at the IBM T.J. Watson Research Center from 1997 to 1998, and a consultant to several international/local companies. He was the Director of the R&D Center of the Components Business Group (CPBG), Delta Electronics, Inc., from 2003 to 2004. In 2004, Dr. Kuo became the Senior Director of the Magnetics and Microwave Business Unit of CPBG, Delta Electronics, Inc. Dr. Kuo currently is a consultant to two private companies. Dr. Kuo's interests are in image/video signal processing, VLSI signal processing, and photonics. He is the co-director of the Signal and Media (SAM) Labs, NCCU. Dr. Kuo received the Distinguished Research Award from National Chung Cheng University in 1998, an Overseas Research Fellowship from the National Science Council (NSC) in 1997, the Outstanding Research Award from the College of Engineering, NCCU, in 1997, a Medal of Honor from NCCU in 1995, Research Awards from the NSC 11 consecutive times, an EE Fellowship from MSU in 1989, and the Outstanding Academic Achievement Award from MSU in
1987. He was a guest editor for three special sections of Optical Engineering and of 3D Holographic Imaging (to be published by John Wiley and Sons), and an invited speaker and program committee chairman/member for several international/local conferences. He also served as an Associate Editor of IEEE Signal Processing Magazine and President of the SPIE Taiwan Chapter (1998-2000). Dr. Kuo is a senior member of IEEE, a member of Phi Kappa Phi, Phi Beta Delta, OSA, and SPIE, and is listed in Who's Who in the World.
Neng-Wen Lo received his bachelor's degree in Physics from Tunghai University in 1986 and his Ph.D. degree in Biochemistry from the State University of New York at Buffalo, USA, in 1997. He was a research fellow at the Oncology Center of Johns Hopkins School of Medicine in Maryland, USA, from 1997 to 1999. In 1999, he returned to Taiwan and joined the Department of Animal Science and Biotechnology at Tunghai University. He is currently an Associate Professor and Head of the Agriculture Extension Center, College of Agriculture. Dr. Lo's research interests are in Computational Biology and Reproduction Biology. He has published more than 40 journal and conference papers. He was the chief editor of the Tunghai Journal in 2003 and has served as a referee for several journals. Dr. Lo is a founding member of the Bioinformatics Society in Taiwan. He is also a member of the Society for the Study of Reproduction and a permanent member of the Chinese Society of Animal Science.
Wei-Z. Lv: Biography is not available.
Evaluation of Regressive Analysis Based Sea Surface
Temperature Estimation Accuracy with NCEP/GDAS
Data
Kohei Arai
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Abstract- In order to evaluate the skin sea surface temperature (SSST) estimation accuracy with MODIS data, 84 MODIS scenes together with match-up data of NCEP/GDAS are used. Through regressive analysis, it is found that an RMSE of 0.305 to 0.417 K can be achieved. Furthermore, it is also found that band 29 is effective for atmospheric correction (30.6 to 38.8% of estimation accuracy improvement). If a single coefficient set for the regressive equation is used for all the cases, the SSST estimation accuracy is around 1.969 K, so specific coefficient sets for the five different cases have to be used.

Keywords- Thermal infrared radiometer; Skin sea surface temperature; Regressive analysis; Split window method; NCEP/GDAS; MODTRAN; Terra and AQUA/MODIS.
I. INTRODUCTION
The required skin sea surface temperature estimation accuracy is better than 0.25 K for radiation energy budget analysis, global warming studies, and so on [1]. Skin sea surface temperature (SSST) is defined as the temperature radiation from the sea surface (approximately less than 20 μm in depth from the surface). It is distinct from the mixed layer sea surface temperature (MSST), which is based on the temperature radiation from the sea surface down to about 10 m in depth, and is also different from the bulk sea surface temperature (BSST), which is based on the temperature radiation from just below the skin [2]. In order to estimate SSST, (a) atmospheric influences, which are mainly due to water vapor followed by aerosols for the atmospheric window channels [3], (b) cloud contamination, and (c) emissivity changes, mainly due to whitecaps or foams, followed by limb darkening due to changes of path length in accordance with scanning angle changes, should be corrected [4]-[8].
One of the atmospheric corrections is the split window method, which is represented by the Multi-Channel Sea Surface Temperature (MCSST) estimation method [9]. In the MCSST method, the 10 μm atmospheric window is split into more than two channels. The atmospheric influences differ among the split channels, so it is possible to estimate the atmospheric influence by using the difference. Through a regressive analysis between the SSST estimated from the acquired channels of satellite-onboard thermal infrared radiometer (TIR) data and the corresponding match-up truth data, such as buoy or shipment data (bulk temperature) with some errors, all the required coefficients for the regressive equation are determined. Thus, SSST is estimated with the regressive equation when newly acquired TIR data are put into the regressive equation. On the other hand, a method for improving SSST estimation accuracy by means of linearized inversion of the radiative transfer equation (RTE) has been proposed [10], and a method for solving the RTE more accurately based on an iterative method has also been proposed [11],[12]. The RTE is expressed as a Fredholm-type integral equation with a variety of parameters. Such an RTE can be solved with linear and/or non-linear inversion methods. The integral equation can be linearized, and then the RTE can be solved by using a linear inversion method; it can also be solved iteratively. The former and the latter are called linearized inversion and non-linear iteration, respectively.
In accordance with MODTRAN [13], the following 5 cases of ocean areas (latitudes) and seasons are selected: (a) Sub-Arctic Summer, (b) Sub-Arctic Winter, (c) Mid-Latitude Summer, (d) Mid-Latitude Winter, (e) Tropic. The regressive analysis is made by using a match-up data set of MODIS data and NCEP/GDAS (Global Data Assimilation System [14]; 1-degree mesh data of air temperature, relative humidity, ozone, cloud, and precipitable water for the sphere with altitudes ranging from 1000 to 100 hPa) data. The regressive error in terms of Root Mean Square Error (RMSE) and the regressive coefficients are calculated.

Firstly, the major specifications of MODIS are introduced together with atmospheric characteristics such as transparency, water vapor profile, and aerosol profile derived from MODTRAN. Secondly, the method for the regressive analysis is described, followed by the procedure for the preparation of the match-up data derived from NCEP/GDAS and MODIS data together with cloud masking. Thirdly, the results of the regressive analysis are shown, followed by the results of a comparative study on the regressive equations. Finally, major conclusions are discussed.
II. PROPOSED METHOD
A. MODIS onboard Terra and Aqua satellites
MODIS/TIR onboard the Terra and AQUA satellites is a moderate-spatial-resolution (IFOV = 1 km) thermal infrared radiometer with a swath width of 2400 km; it consists of 3 radiometer channels covering the wavelength regions shown in Table I.
B. Atmospheric Model Used
The atmospheric transmittance over this wavelength region for the typical atmospheric models of MODTRAN 4.0 (Tropic, Mid-Latitude Summer and Winter, Sub-Arctic Summer and Winter) is shown in Figure 1.
TABLE I. WAVELENGTH COVERAGE OF MODIS

Primary Use | Band | Bandwidth (μm) | Spectral Radiance (at 300 K) | NEΔT (K)
Cloud Properties | 29 | 8.400 - 8.700 | 9.58 | 0.05
Surface/Cloud Temperature | 31 | 10.780 - 11.280 | 9.55 | 0.05
Surface/Cloud Temperature | 32 | 11.770 - 12.270 | 8.94 | 0.05
Figure 1. Transmittance for the typical atmospheric models.
Figure 2. The vertical profile of the water vapor for the typical atmospheric models.
Figure 3. The vertical profile of the aerosol extinction coefficient for the typical atmospheric models.
In this calculation, the aerosol type and meteorological range are set to rural and 23 km, respectively, while the observation angle is set at a zenith angle of 0 degrees, namely, a nadir-looking situation. The vertical profiles of the water vapor content in the atmosphere (in units of atm·cm/km) and of the aerosol extinction coefficient (in units of km⁻¹) are shown in Figs. 2 and 3, respectively.
The Tropic shows the lowest transmittance, followed by Mid-Latitude Summer, Sub-Arctic Summer, Mid-Latitude Winter and Sub-Arctic Winter, while the Tropic shows the largest water vapor content in the atmosphere, followed by Mid-Latitude Summer, Sub-Arctic Summer, Mid-Latitude Winter and Sub-Arctic Winter.
The most dominant factor in the atmospheric influence in this wavelength region is water vapor, followed by aerosol, so the vertical profiles have to be clarified. Relatively good MODIS data for the Sub-Arctic Winter could not be found, so (1) Tropic, (2) Mid-Latitude Summer, (3) Mid-Latitude Winter and (4) Sub-Arctic Summer were selected for the analysis.
C. Regressive Analysis
The following simple linear regressive equation between the NCEP/GDAS-derived SSST and the physical temperature estimated with the MODIS Level 1B product is assumed:

$\mathrm{SSST} = C_1 + C_2 T_{30} + C_3 (T_{30} - T_{31}) + C_4 (\sec\theta - 1)(T_{30} - T_{31}) + C_5 (T_{30} - T_{29})(\sec\theta - 1)(T_{30} - T_{29})$,   (1)

where $T_i$ is the brightness temperature of channel i and $\theta$ is the solar zenith angle. Through a regressive analysis, the $C_i$ for each case are estimated, as well as the regression error in terms of the Root Mean Square Error (RMSE), which corresponds to the SSST estimation error. In this connection, the swath width of MODIS is 60 km while the grid size of NCEP/GDAS is 1 degree, so the averaged $T_i$ over the full scene of MODIS (2400 km) is calculated and put into equation (1) together with the linearly interpolated NCEP/GDAS data at the scene center of the MODIS data, which becomes SSST in equation (1). In accordance with ATBD 25 (the SST algorithm [15]), SST is estimated with the following equation:

$\mathrm{SST} = a_0 + a_1 T_1 + a_2 (T_1 - T_2) T_b + a_3 (\sec\theta - 1)$,   (2)
where $T_b$ is the MCSST [9]. The proposed regressive equation uses the band 29 brightness temperature, which is influenced by water vapor; it is very effective for atmospheric correction.
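Determining the coefficients C1 to C5 of Eq. (1) is an ordinary linear least-squares problem. The following sketch illustrates the fit with synthetic stand-in data (the real analysis uses scene-averaged MODIS brightness temperatures and interpolated NCEP/GDAS SSST as described above); all numbers here are placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 84                                     # number of match-up scenes
T29 = rng.uniform(270, 300, n)             # band brightness temperatures [K]
T30 = rng.uniform(270, 300, n)
T31 = T30 - rng.uniform(0.0, 2.0, n)
theta = np.deg2rad(rng.uniform(0, 40, n))  # zenith angle
s = 1.0 / np.cos(theta) - 1.0              # sec(theta) - 1

# Design matrix corresponding to the terms of Eq. (1)
X = np.column_stack([np.ones(n), T30, T30 - T31, s * (T30 - T31),
                     (T30 - T29) * s * (T30 - T29)])
true_C = np.array([5.0, 0.98, 1.5, 0.8, 0.02])     # placeholder coefficients
sst = X @ true_C + rng.normal(0.0, 0.3, n)         # "observed" SSST with noise

C, *_ = np.linalg.lstsq(X, sst, rcond=None)        # least-squares fit
rmse = np.sqrt(np.mean((X @ C - sst) ** 2))
print("coefficients:", np.round(C, 3), " RMSE [K]:", round(rmse, 3))
```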
III. EXPERIMENTS
The relations between the MODIS-derived SSST and the NCEP/GDAS SSST for the typical atmospheric models of Tropic, Mid-Latitude Summer and Winter, and Sub-Arctic Summer are shown in Figs. 4 to 7. In the experiments, 37, 18, 15 and 14 MODIS data sets acquired from 2000 to 2001 were used, together with the match-up NCEP/GDAS data.
Figure 4. The relation between MODIS-derived SSST and NCEP/GDAS SSST for the Tropic ocean area.
Figure 5. The relation between MODIS-derived SSST and NCEP/GDAS SSST for the Mid-Latitude Summer ocean area and season.
Figure 6. The relation between MODIS-derived SSST and NCEP/GDAS SSST for the Mid-Latitude Winter ocean area and season.
Figure 7. The relation between MODIS-derived SSST and NCEP/GDAS SSST for the Sub-Arctic Summer ocean area and season.
For all the ocean areas and seasons, and in particular for the Tropic ocean area, it may be concluded that the MODIS-derived SSST is lower than the NCEP/GDAS-derived SSST for the relatively high SSST portion, while the MODIS-derived SSST is higher than that from NCEP/GDAS for the relatively low SSST portion. For the Tropic ocean area, SSST ranges just from 298 K to 303 K, while the ranges for Mid-Latitude Summer and Winter as well as Sub-Arctic Summer are 281-302 K, 280-299 K and 272-287 K, respectively.

It is also found that the effectiveness of band 29 in improving the RMSE is greatest for the Tropic model, followed by Mid-Latitude Summer, Sub-Arctic Summer and Mid-Latitude Winter. This order is the same as the order of the water vapor and aerosol content in the atmosphere shown in Figs. 2 and 3.
IV. CONCLUSION
It is found that the SSST estimation accuracy with MODIS data is better than 0.417 K for all the cases defined above. A comparison of the SSST estimation accuracy among the cases is attempted. Through regressive analysis, it is found that an RMSE of 0.305 to 0.417 K can be achieved.
Furthermore, it is also found that the RMSE degradation caused by the exclusion of band 29 is greatest for the Tropic model, followed by Mid-Latitude Summer, Sub-Arctic Summer and Mid-Latitude Winter. This order is the same as the order of the water vapor and aerosol content in the atmosphere. The RMSE of the regressive equation without band 29 ranges from 0.498 to 6.01 K. Thus, it is concluded that band 29 is effective for atmospheric correction (water vapor absorption). The effect corresponds to a 30.6 to 38.8% improvement in SSST estimation accuracy.

If a single set of regressive coefficients is used for all the cases, then the SSST estimation accuracy is around 1.969 K, so the regressive equations for the five specific cases, Tropic, Mid-Latitude Summer and Winter, and Sub-Arctic Summer and Winter, have to be used.
REFERENCES
[1] NASA, Science and mission requirements Working Report, EOS Science Steering Committee Report, Vol. I, 1990.
[2] Barton, I.J., Satellite-derived sea surface temperatures: current status, Journal of Geophysical Research, Vol. 100, pp. 8777-8790, 1995; and personal correspondence at the 37th COSPAR Congress, Warsaw, Poland, July 2000.
[3] Scott, N.A. and A. Chedin, A fast line-by-line method for atmospheric absorption computations: the automatized atmospheric absorption atlas, Journal of Applied Meteorology, Vol. 20, pp. 802-812, 1981.
[4] Hollinger, J.P., Passive microwave measurements of sea surface roughness, IEEE Trans. on Geoscience and Remote Sensing, Vol. GE-9, No. 3, 1971.
[5] Cox, C. and W.H. Munk, Some problems in optical oceanography, Journal of Marine Research, Vol. 14, pp. 68-78, 1955.
[6] Harris, A.R., O. Brown and M. Mason, The effect of wind speed on sea surface temperature retrieval from space, Geophysical Research Letters, Vol. 21, No. 16, pp. 1715-1718, 1994.
[7] Masuda, K., T. Takashima and T. Takayama, Emissivity of pure and sea waters for the model sea surface in the infrared window regions, Remote Sensing of Environment, Vol. 24, pp. 313-329, 1988.
[8] Watts, P.D., M.R. Allen and T.J. Nightingale, Wind speed effects on sea surface emission and reflection for the along track scanning radiometer, Journal of Atmospheric and Oceanic Technology, Vol. 13, pp. 126-141, 1996.
[9] McClain, E.P., W.G. Pichel and C.C. Walton, Comparative performance of AVHRR-based multi-channel sea surface temperatures, Journal of Geophysical Research, Vol. 90, No. C6, pp. 11587-11601, 1985.
[10] Arai, K., K. Kobayashi and M. Moriyama, Method to improve SST estimation accuracy by means of linearized inversion of the radiative transfer equation, Journal of the Remote Sensing Society of Japan, Vol. 18, No. 3, pp. 272-279, 1998.
[11] Arai, K., Y. Terayama, M. Miyamoto and M. Moriyama, SST estimation using thermal infrared radiometer data by means of an iterative method, Journal of the Japan Society of Photogrammetry and Remote Sensing, Vol. 39, No. 1, pp. 15-20, 2000.
[12] Arai, K., Influence due to aerosols on SST estimation accuracy for the non-linear iterative method, Journal of the Remote Sensing Society of Japan, Vol. 21, No. 4, pp. 358-365, 2001.
[13] US Air Force Geophysical Laboratory, MODTRAN 3 User Instructions, GL-TR-89-0122, 1996.
[14] NOAA/NGDC, ftp://ftp.ngdc.noaa.gov/pub/NCEP/GDAS/
[15] NASA/GSFC, http://picasso.oce.orst.edu/ORSOO/MODIS/code/ATBDsum.html#SST
AUTHORS PROFILE
Kohei Arai received his BS, MS and PhD degrees in 1972, 1974 and 1982, respectively. He was with The Institute for Industrial Science and Technology of the University of Tokyo from April 1974 to December 1978, and was with the National Space Development Agency of Japan from January 1979 to March 1990. From 1985 to 1987, he was with the Canada Centre for Remote Sensing as a Post-Doctoral Fellow of the National Science and Engineering Research Council of Canada. He moved to Saga University as a Professor in the Department of Information Science in April 1990. He was a councilor for the Aeronautics and Space related Technology Committee of the Ministry of Science and Technology from 1998 to 2000. He was a councilor of Saga University for 2002 and 2003. He was also an executive councilor of the Remote Sensing Society of Japan from 2003 to 2005. He has been an Adjunct Professor of the University of Arizona, USA, since 1998. He has also been Vice Chairman of Commission A of ICSU/COSPAR since 2008. He has written 30 books and published 322 journal papers.
The Modelling Process of a Paper Folding Problem in GeoGebra 3D

Muharrem Aktumen
Department of Mathematics Education
Ahi Evran University
Kirsehir, Turkey

Bekir Kursat Doruk
Department of Mathematics Education
Ahi Evran University
Kirsehir, Turkey

Tolga Kabaca
Department of Mathematics Education
Pamukkale University
Denizli, Turkey
Abstract- In this research, a problem situation requiring the ability to think in three dimensions was developed by the researchers. As the purpose of this paper is to produce a modeling task suggestion, the problem was visualized and analyzed in the GeoGebra 3D environment. The visual solution was then also supported by an algebraic approach. Thus, the capability of creating the relationship between geometric and algebraic representations in GeoGebra was also presented in the 3D sense.

Keywords- Modelling; GeoGebra 3D; Paper Folding.
I. INTRODUCTION
There are several studies on modeling real-life situations in GeoGebra [2, 3, 4, 5]. In this research, a problem situation has been suggested and the modeling process in GeoGebra has been explained. As Zbiek and Conner pointed out, modeling contributes to understanding previously known mathematical concepts thoroughly by demonstrating the applicability of mathematical thought to real life, to learning new mathematical concepts, to establishing interdisciplinary relations, and to both the conceptual and operational development of students engaged in modeling processes [6]. Furthermore, the algebraic and geometric representations need to be connected in two ways [1,7]. That is, the modeling should present the advantage of understanding how algebraic facts affect the observed situations.
The problem can be described as follows, according to Figures 1 and 2. Let us call the intersection point of the segment [KL] and the line passing through the point D parallel to the segment [OK] B, and the intersection point of the segment [OK] and the line passing through the point C parallel to the segment [KL] A. Thus, the rectangle DABC and the triangles ΔDOA, ΔAKB, ΔBLC and ΔCMD can be obtained, such that the points D and A can be moved dynamically on the segments on which they are located.¹

¹ A short summary of this article was submitted to The International GeoGebra Institute Conference 2012, 21-23 September 2012, and supported by Ahi Evran University.
Figure 1: Statement of pre-folding
The problem was stated in the research as: "For which locations of the dynamic points A and B do the segments $[O_t L_t]$ and $[K_t M_t]$ intersect?" (Figure 2). The mathematical concepts related to the solution can be summarized as the algebraic approach, line equations, slope, points and vectors in three-dimensional space, and the scalar triple product. As a short result, it can be stated that at least one of the points A and B must be at the midpoint of the segment on which it is located, after visualization in the GeoGebra 5.0 environment. It is expected that this kind of real-life activity is attractive for students.
Figure 2: Statement of post-folding
II. UNDERSTANDING THE MODEL VISUALLY
The basic structure of the problem can be constructed as follows, where the values of a, b, $x_0$ and $y_0$ are defined with the slider tool. The points O(0, 0), K(a, 0), L(a, b), and M(0, b) are the corners of the rectangle. The points A($x_0$, 0), C($x_0$, b), D(0, $y_0$) and B(a, $y_0$) are dynamic points which are located on the sides of the rectangle and controlled dynamically by the sliders.
Figure 3: Problem statement
III. ANALYZING THE MODEL ALGEBRAICALLY
After the triangles ΔDOA, ΔAKB, ΔBLC and ΔCMD are folded on the sides DA, AB, BC and DC, respectively, the coordinates of the points $O_d$, $K_d$, $L_d$ and $M_d$ must be determined.
Finding the coordinates of the point $O_d$:

The intersection point of the line $d_{DA}$ and the line $d_{OO_d}$ allows us to determine the coordinates of the point $O_d$. Since $m_{d_{DA}} = -y_0/x_0$, by using the fact that the product of the slopes of two perpendicular lines is -1, it can be obtained that $m_{d_{OO_d}} = x_0/y_0$.
Both line equations can be constructed by using a point which belongs to the line and its slope.

The equation of the line $d_{DA}$ can be calculated as follows: $d_{DA}: y = -\frac{y_0}{x_0}(x - x_0)$. After revision of the equation, the following form can be obtained:

$x_0 y + y_0 x = x_0 y_0$.   (1)
The equation of the line $d_{OO_d}$ can be calculated as follows: $d_{OO_d}: y = \frac{x_0}{y_0} x$. After revision of the equation, the following form can be obtained:

$y_0 y - x_0 x = 0$.   (2)
By finding the common solution of equations (1) and (2), the coordinates of $O_d$ can be obtained as follows:

$O_d\left(\dfrac{x_0 y_0^2}{x_0^2 + y_0^2}, \dfrac{x_0^2 y_0}{x_0^2 + y_0^2}\right)$.   (3)
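Since $O_d$ is simply the foot of the perpendicular dropped from O onto the crease line DA, Eq. (3) can be verified symbolically. A small SymPy sketch (SymPy assumed available):

```python
import sympy as sp

# Symbolic check of Eq. (3): O_d is the foot of the perpendicular dropped
# from O(0,0) onto the fold line through D(0, y0) and A(x0, 0).
x, y, x0, y0 = sp.symbols('x y x0 y0', positive=True)

line_DA = sp.Eq(x0 * y + y0 * x, x0 * y0)   # Eq. (1)
line_OOd = sp.Eq(y0 * y - x0 * x, 0)        # Eq. (2), perpendicular through O

sol = sp.solve([line_DA, line_OOd], [x, y])
print(sp.simplify(sol[x]))   # x0*y0**2/(x0**2 + y0**2)
print(sp.simplify(sol[y]))   # x0**2*y0/(x0**2 + y0**2)
```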
Finding the coordinates of the point $K_d$:

The intersection point of the line $d_{AB}$ and the line $d_{KK_d}$ allows us to determine the coordinates of the point $K_d$. Here $m_{d_{AB}} = \frac{y_0}{a - x_0}$, so the perpendicular through K has slope $m_{d_{KK_d}} = -\frac{a - x_0}{y_0}$.
Both line equations can be constructed by using a point which belongs to the line and its slope.

The equation of the line $d_{AB}$ can be calculated as follows: $d_{AB}: y = \frac{y_0}{a - x_0}(x - x_0)$. After revision of the equation, the following form can be obtained:

$y(a - x_0) = y_0 x - x_0 y_0$.   (4)
The equation of the line $d_{KK_d}$ can be calculated as follows: $d_{KK_d}: y = -\frac{a - x_0}{y_0}(x - a)$. After revision of the equation, the following form can be obtained:

$y_0 y + (a - x_0)x = a(a - x_0)$.   (5)
By finding the common solution of equations (4) and (5), the coordinates of $K_d$ can be obtained as follows:

$K_d\left(\dfrac{x_0 y_0^2 + a(a - x_0)^2}{y_0^2 + (a - x_0)^2}, \dfrac{y_0 (a - x_0)^2}{y_0^2 + (a - x_0)^2}\right)$.   (6)
Finding the coordinates of the point $L_d$:

The intersection point of the line $d_{BC}$ and the line $d_{LL_d}$ allows us to determine the coordinates of the point $L_d$. Here $m_{d_{BC}} = \frac{y_0 - b}{a - x_0}$.

Both line equations can be constructed by using a point which belongs to the line and its slope.

The equation of the line $d_{BC}$ can be calculated as follows:
$d_{BC}: y - y_0 = \frac{y_0 - b}{a - x_0}(x - a)$. After revision of the equation, the following form can be obtained:

$(a - x_0)y + (b - y_0)x = a(b - y_0) + y_0(a - x_0)$.   (7)
The equation of the line $d_{LL_d}$ can be calculated as follows: $d_{LL_d}: y - b = \frac{a - x_0}{b - y_0}(x - a)$. After revision of the equation, the following form can be obtained:

$(b - y_0)y - (a - x_0)x = b(b - y_0) - a(a - x_0)$.   (8)
By finding the common solution of equations (7) and (8), the coordinates of $L_d$ can be obtained as follows:

$L_d\left(\dfrac{a(b - y_0)^2 + (a - x_0)\left[a(a - x_0) - (b - y_0)^2\right]}{(a - x_0)^2 + (b - y_0)^2}, \dfrac{y_0 (a - x_0)^2 + b(b - y_0)^2}{(a - x_0)^2 + (b - y_0)^2}\right)$.   (9)
Finding the coordinates of the point $M_d$:

The intersection point of the line $d_{DC}$ and the line $d_{MM_d}$ allows us to determine the coordinates of the point $M_d$. Here $m_{d_{DC}} = \frac{b - y_0}{x_0}$.

Both line equations can be constructed by using a point which belongs to the line and its slope.

The equation of the line $d_{DC}$ can be calculated as follows: $d_{DC}: y - y_0 = \frac{b - y_0}{x_0}\, x$. After revision of the equation, the following form can be obtained:

$x_0 y - (b - y_0)x = x_0 y_0$.   (10)
The equation of the line $d_{MM_d}$ can be calculated as follows: $d_{MM_d}: y - b = -\frac{x_0}{b - y_0}\, x$, i.e. $(b - y_0)(y - b) = -x_0 x$. After revision of the equation, the following form can be obtained:

$(b - y_0)y + x_0 x = b(b - y_0)$.   (11)
By finding the common solution of equations (10) and (11), the coordinates of $M_d$ can be obtained as follows:

$M_d\left(\dfrac{x_0 (b - y_0)^2}{x_0^2 + (b - y_0)^2}, \dfrac{x_0^2 y_0 + b(b - y_0)^2}{x_0^2 + (b - y_0)^2}\right)$.   (12)
When the triangles ΔDOA, ΔAKB, ΔBLC and ΔCMD are folded perpendicular to the floor on the sides DA, AB, BC and DC, respectively, the corner points O, K, L and M move to their new places. Let us call these revised points $O_t$, $K_t$, $L_t$ and $M_t$, respectively. These points lie in three-dimensional space, and the coordinates of $O_d$, $K_d$, $L_d$ and $M_d$ are their first two components. The third component of each point is the length of $OO_d$, $KK_d$, $LL_d$ and $MM_d$, respectively (Figure 2).
Figure 4: Comparing the two- and three-dimensional positions of the folded paper.
These lengths can be calculated by using the points O, K, L and M and the equations of the lines $d_{DA}$, $d_{AB}$, $d_{BC}$ and $d_{DC}$. After the proper calculations, the lengths are found as follows:

$|OO_d| = \dfrac{x_0 y_0}{\sqrt{x_0^2 + y_0^2}}$,

$|KK_d| = \dfrac{(a - x_0)\, y_0}{\sqrt{(a - x_0)^2 + y_0^2}}$,

$|LL_d| = \dfrac{(a - x_0)(b - y_0)}{\sqrt{(a - x_0)^2 + (b - y_0)^2}}$,

$|MM_d| = \dfrac{x_0 (b - y_0)}{\sqrt{x_0^2 + (b - y_0)^2}}$.
So, the ordered triples giving the components of the points $O_t$, $K_t$, $L_t$ and $M_t$ can be stated as follows:

$O_t\left(\dfrac{x_0 y_0^2}{x_0^2 + y_0^2}, \dfrac{x_0^2 y_0}{x_0^2 + y_0^2}, \dfrac{x_0 y_0}{\sqrt{x_0^2 + y_0^2}}\right)$

$K_t\left(\dfrac{x_0 y_0^2 + a(a - x_0)^2}{y_0^2 + (a - x_0)^2}, \dfrac{y_0 (a - x_0)^2}{y_0^2 + (a - x_0)^2}, \dfrac{(a - x_0)\, y_0}{\sqrt{(a - x_0)^2 + y_0^2}}\right)$
The components of $L_t = (L_{t_x}, L_{t_y}, L_{t_z})$ are

$L_{t_x} = \dfrac{a(b - y_0)^2 + (a - x_0)\left[a(a - x_0) - (b - y_0)^2\right]}{(a - x_0)^2 + (b - y_0)^2}$,

$L_{t_y} = \dfrac{y_0 (a - x_0)^2 + b(b - y_0)^2}{(a - x_0)^2 + (b - y_0)^2}$,

$L_{t_z} = \dfrac{(a - x_0)(b - y_0)}{\sqrt{(a - x_0)^2 + (b - y_0)^2}}$.
$M_t\left(\dfrac{x_0 (b - y_0)^2}{x_0^2 + (b - y_0)^2}, \dfrac{x_0^2 y_0 + b(b - y_0)^2}{x_0^2 + (b - y_0)^2}, \dfrac{x_0 (b - y_0)}{\sqrt{x_0^2 + (b - y_0)^2}}\right)$
Now we have the points $O_t$, $K_t$, $L_t$ and $M_t$ in space. The mathematical question in our main problem can be stated as: what is the condition for the line segments $O_t L_t$ and $K_t M_t$ to intersect? The basic answer is that the vectors $\vec{O_t L_t}$ and $\vec{K_t M_t}$ must be coplanar and not parallel. Given the positions of the points $O_t$, $K_t$, $L_t$ and $M_t$ in space, we can be sure that these vectors are not parallel and have intersection points when they are projected onto the floor. Let us check that the vectors $\vec{O_t L_t}$ and $\vec{K_t M_t}$ are coplanar when they intersect in space.

For this check, the fact that the scalar triple product of three coplanar vectors must be zero can be used. We have two vectors for this operation. By additionally choosing the vector $\vec{O_t K_t}$, we can calculate the scalar triple product.
The components of the vectors $\vec{O_t L_t} = L_t - O_t$, $\vec{K_t M_t} = M_t - K_t$ and $\vec{O_t K_t} = K_t - O_t$ are obtained by componentwise subtraction of the coordinates of the points $O_t$, $K_t$, $L_t$ and $M_t$ given above.
The scalar triple product of the vectors can be calculated by the determinant whose rows are the components of these three vectors; setting it to zero gives the coplanarity condition:

$\vec{O_t L_t} \cdot \left(\vec{K_t M_t} \times \vec{O_t K_t}\right) = \begin{vmatrix} \vec{O_t L_t} \\ \vec{K_t M_t} \\ \vec{O_t K_t} \end{vmatrix} = 0$.
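The coplanarity condition can also be explored numerically instead of expanding the determinant symbolically. The sketch below recomputes the folded corner points directly from the fold geometry (foot of the perpendicular plus fold height) and evaluates the triple product; with the hypothetical values a = 4, b = 3 it vanishes when A is the midpoint of [OK] (x0 = a/2) and is generally nonzero otherwise, consistent with the observation in the conclusion.

```python
import numpy as np

def foot(P, A, B):
    """Foot of the perpendicular from point P onto line AB (2-D)."""
    A, B, P = (np.asarray(v, dtype=float) for v in (A, B, P))
    t = np.dot(P - A, B - A) / np.dot(B - A, B - A)
    return A + t * (B - A)

def folded(P, A, B):
    """3-D position of corner P after folding its triangle up over the
    crease AB: the foot of the perpendicular, lifted by the fold height."""
    f = foot(P, A, B)
    return np.array([f[0], f[1], np.linalg.norm(np.asarray(P, float) - f)])

def triple_product(a, b, x0, y0):
    """Scalar triple product of OtLt, KtMt and OtKt for the rectangle
    O(0,0), K(a,0), L(a,b), M(0,b) with crease points A(x0,0), B(a,y0),
    C(x0,b), D(0,y0); zero means the folded segments are coplanar."""
    O, K, L, M = (0, 0), (a, 0), (a, b), (0, b)
    A, B, C, D = (x0, 0), (a, y0), (x0, b), (0, y0)
    Ot, Kt = folded(O, D, A), folded(K, A, B)
    Lt, Mt = folded(L, B, C), folded(M, C, D)
    return np.dot(Lt - Ot, np.cross(Mt - Kt, Kt - Ot))

# A at the midpoint of [OK] (x0 = a/2): the product vanishes
print(round(triple_product(a=4.0, b=3.0, x0=2.0, y0=1.0), 10))
# Neither A nor B at a midpoint: generally nonzero
print(round(triple_product(a=4.0, b=3.0, x0=1.5, y0=1.0), 10))
```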
IV. CONCLUSION
In the GeoGebra 5.0 Beta release platform, it has easily been seen that the scalar triple product is zero when the intersecting position of the vectors $\vec{O_t L_t}$ and $\vec{K_t M_t}$ is captured. In this way, students may understand the relationship between their previously known mathematical knowledge and a real-life situation [6].

This paper also offers another opportunity for understanding the relationship between algebraic and geometric representations [7], although it still needs to be proved. We observed that at least one of the points A and B must be the midpoint of the segment to which it belongs. This fact can be proved by solving the proper equations obtained from the scalar triple product.
REFERENCES
[1] Akkoç, H. (2006). Fonksiyon kavramının çoklu temsillerinin çağrıştırdığı kavram görüntüleri [Concept images evoked by multiple representations of the function concept]. H.Ü. Eğitim Fakültesi Dergisi (H.U. Journal of Education), 30, 1-10.
[2] Aktümen, M. & Kabaca, T. (2012). Exploring the mathematical model of the thumbaround motion by GeoGebra. Technology, Knowledge and Learning. DOI: 10.1007/s10758-012-9194-5
[3] Aktümen, M., Baltaci, S. & Yildiz, A. (2011). Calculating the surface area of the water in a rolling cylinder and visualization as two and three dimensional by means of GeoGebra. International Journal of Computer Applications (www.ijcaonline.org/archives/volume25/number1/3170-4022)
[4] Kabaca, T. & Aktümen, M. (2010). Using GeoGebra as an expressive modeling tool: discovering the anatomy of the cycloid's parametric equation. International Journal of GeoGebra: The New Language For The Third Millennium, Zigotto Printing and Publishing House, Galati, Romania, Vol. 1, No. 1, 63-81, ISSN: 2068-3227.
[5] Aktümen, M., Horzum, T. & Ceylan, T. (2010). Önünde engel bulunan bir kalemin ucunun izinin parametrik denkleminin hesaplanması ve GeoGebra ile görselleştirme [Calculating the parametric equation of the trace of the tip of a pen with an obstacle in front of it, and visualization with GeoGebra]. 9. Matematik Sempozyumu Sergi ve Şenlikleri, Karadeniz Teknik Üniversitesi, 20-22 Ekim 2010, Trabzon.
[6] Zbiek, R. M. & Conner, A. (2006). Beyond motivation: exploring mathematical modeling as a context for deepening students' understandings of curricular mathematics. Educational Studies in Mathematics, 63(1), 89-112.
[7] Özmantar, M. F., Bingölbali, E. & Akkoç, H. (2008). Matematiksel kavram yanılgıları ve çözüm önerileri [Mathematical misconceptions and suggested solutions]. Pegem Akademi, Ankara.
Sensitivity Analysis of Fourier Transformation
Spectrometer: FTS Against Observation Noise on
Retrievals of Carbon Dioxide and Methane
Kohei Arai
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Takuya Fukamachi
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Hiroshi Okumura
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Shuji Kawakami
JAXA, Japan
Tsukuba City, Japan
Hirofumi Ohyama
JAXA, Japan
Tsukuba City, Japan
Abstract- A sensitivity analysis of the Fourier Transformation Spectrometer (FTS) against observation noise on retrievals of carbon dioxide and methane is conducted. Through experiments with real observed data and additive noise, it is found that the allowable noise on FTS observation data is less than 2.1x10⁻⁵ if the estimation accuracy of total-column carbon dioxide and methane is to be better than 1%.

Keywords- FTS; carbon dioxide; methane; sensitivity analysis; error analysis.
I. INTRODUCTION
The Greenhouse gases Observing SATellite (GOSAT) carries TANSO CAI, a mission instrument for the observation of clouds and aerosol particles, and TANSO FTS¹, a Fourier Transformation Spectrometer² mission instrument for retrieving carbon dioxide and methane [1]. In order to verify the retrieval accuracy of the two mission instruments, a ground-based laser radar and a ground-based FTS are installed; the former is for TANSO CAI and the latter is for the FTS, respectively. Another purpose of the ground-based laser radar and the ground-based FTS is to check sensor specifications for future mission instruments to be carried onboard future satellites with extended missions. Although estimation methods for carbon dioxide and methane are well discussed [2]-[6], an estimation method which takes into account measurement noise has not been analyzed yet. Therefore, an error analysis of the effect of additive noise on estimation accuracy is conducted.

In order to clarify the requirement on observation noise added to the ground-based FTS observation data, a sensitivity analysis of the ground-based FTS against observation noise on retrievals of carbon dioxide and methane is conducted. Experiments are carried out with additive noise on real acquired data of the ground-based FTS. Through retrievals of the total column of carbon dioxide and methane with the noise-added ground-based FTS signals, the retrieval accuracy is evaluated. Then, the allowable noise on the ground-based FTS which achieves the required retrieval accuracy (1%) is deduced.

The following section describes the proposed sensitivity analysis, followed by some experiments. Then, concluding remarks with some discussion follow.

¹ http://www.jaxa.jp/projects/sat/gosat/index_j.html
² http://ja.wikipedia.org/wiki/%E3%83%9E%E3%82%A4%E3%82%B1%E3%83%AB%E3%82%BD%E3%83%B3%E5%B9%B2%E6%B8%89%E8%A8%88
II. PROPOSED SENSITIVITY ANALYSIS
A. Ground-based FTS
Figure 1 shows the schematic configuration of the ground-based FTS, which originates from the Michelson interferometer. Light from the light source is divided into two directions, left and forward, at the beam splitter (half mirror). The left light is reflected at the fixed mirror and returns to the half mirror, while the forward light is reflected at the moving mirror and returns to the half mirror. Then, interference occurs between the left and forward lights. The interference light is then detected by the detector. The outlook of the ground-based FTS is shown in Figure 2.
Figure 3(a) shows an example of the interferogram³ (the interference light detected by the detector of the ground-based FTS). By applying a Fourier transformation to the interferogram, the observed Fourier spectrum is calculated, as shown in Figure 3(b). When the ground-based FTS observes the atmosphere, the observed Fourier spectrum includes absorptions due to atmospheric molecules and aerosol particles. By comparison with the spectrum derived from a radiative transfer code with atmospheric parameters, atmospheric molecules and aerosol particles are estimated.
Figure 1 Michelson Interference Measurement Instrument
Figure 2 Outlook of the FTS used
³ http://en.wikipedia.org/wiki/Interferometry
(a) Interferogram; (b) Fourier spectrum.
Figure 3. Examples of the interferogram and Fourier spectrum when the FTS observes the atmosphere.
B. Principle for Carbon Dioxide and Methane Retrievals with
TANSO FTS Data
Figure 4 shows the principle of the retrieval method for atmospheric constituents using GOSAT/TANSO data. Figure 4(a) shows the Top of the Atmosphere (TOA) radiance in the wavelength range from 500 to 2500 nm (visible to shortwave infrared wavelength regions). There are three major absorption bands, due to oxygen (760-770 nm), carbon dioxide and methane (1600-1700 nm), and water vapor and carbon dioxide (1950-2050 nm), as shown in Figures 4(b), (c), and (d), respectively. These bands are the GOSAT/TANSO spectral bands, Band 1 to 3, respectively. In addition to these, there is another spectral band with a wide spectrum, Band 4, as shown in Figure 4(e), which covers from the visible to the thermal infrared regions.
(a) TOA radiance, (b) Band 1, (c) Band 2, (d) Band 3, (e) Band 4
Figure 4 Example of TOA radiance and absorption bands as well as spectral
bands of the GOSAT/TANSO instrument
III. EXPERIMENTS
A. Ground-based FTS Data Used
The ground-based FTS data used for the experiments were
acquired on November 14 and December 19, 2011. Figure 5
shows the interferograms derived from the acquired
ground-based FTS data.
Figure 5 Example of interferograms used for experiments.
B. Experimental Method
Observation noise is included in the observed
interferograms. In addition to the existing noise, several levels
of additional noise, generated by a Mersenne Twister random
number generator with zero mean and several standard
deviations, are added to the interferograms as shown
in Figure 6.
Figure 6 Method for adding the noises to the acquired interferograms
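A minimal sketch of this noise-adding step is given below; the array name ifg and the scaling of the noise standard deviation to the maximum signal level are assumptions based on the description above. NumPy's RandomState is backed by the Mersenne Twister (MT19937) generator.

import numpy as np

rng = np.random.RandomState(seed=0)   # Mersenne Twister (MT19937) generator

def add_noise(ifg, rel_sigma):
    # Zero-mean Gaussian noise whose standard deviation is rel_sigma
    # times the maximum signal level of the interferogram
    sigma = rel_sigma * np.abs(ifg).max()
    return ifg + rng.normal(0.0, sigma, size=ifg.shape)

# Noise levels used in the experiments (relative standard deviations)
levels = [0.0, 2.5e-5, 5e-5, 7.5e-5, 1e-4, 2e-4, 5e-4, 1e-3]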
C. Experimental Results (Noise Added Interferograms and
Fourier Spectra)
Figure 7 shows the noise-added interferograms and the Fourier
spectra derived from them. The added noise ranges from 0 to
1x10^-3; at maximum, noise with a standard deviation (zero mean)
of 1/1000 of the single-channel signal level is added. The
vertical axis shows the signal level and the horizontal axis shows
the optical path length difference for Figures 7 (a), (c), (e),
(g), (i), (k), and (m), while the vertical axis shows the Fourier
spectrum (amplitude) and the horizontal axis shows the
frequency (or wave number) for the rest of Figure 7.
As shown in Figure 7, the Fourier spectra obviously degrade
as the noise increases. Although the additive noise is not
clearly seen, it is slightly recognizable through a comparison
between Figures 7 (b) and (o).
(a) Interferogram (0)
(b) Fourier spectrum (0)
(c) Interferogram (2.5x10^-5)
(d) Fourier spectrum (2.5x10^-5)
(e) Interferogram (5x10^-5)
(f) Fourier spectrum (5x10^-5)
(g) Interferogram (7.5x10^-5)
(h) Fourier spectrum (7.5x10^-5)
(i) Interferogram (1x10^-4)
(j) Fourier spectrum (1x10^-4)
(k) Interferogram (2x10^-4)
(l) Fourier spectrum (2x10^-4)
(m) Interferogram (5x10^-4)
(n) Fourier spectrum (5x10^-4)
(o) Interferogram (1x10^-3)
(p) Fourier spectrum (1x10^-3)
Figure 7 Noise-added interferograms and Fourier spectra derived from the
interferograms
D. Experimental Results (Retrieval Error)
Figure 8 (a) shows the methane retrieval results. The horizontal
axis shows the standard deviation of the additive noise and the
vertical axis shows the retrieved methane amount in units of
ppm (parts per million). Figure 8 (b) shows the retrieval error
(retrieved methane amount from the noise-added interferogram
minus the retrieved methane amount from the noise-free
interferogram). Meanwhile, Figure 8 (c) shows the carbon
dioxide retrieval results. The horizontal axis shows the standard
deviation of the additive noise and the vertical axis shows the
retrieved carbon dioxide amount in units of ppm. Figure 8 (d)
shows the retrieval error (retrieved carbon dioxide amount from
the noise-added interferogram minus the retrieved carbon dioxide
amount from the noise-free interferogram). The GFIT retrieval
software code is used for both estimations of the total column
carbon dioxide and methane contents in the atmosphere [6].
(a)Retrieved methane amount
(b)Retrieved error
(c)Retrieved carbon dioxide amount
(d)Retrieved error
Figure 8 Retrieved carbon dioxide and methane amounts with noise-added and
noise-free interferograms, together with the retrieval errors.
From these figures, it is concluded as follows. The allowable
retrieval errors for methane and carbon dioxide are 0.02 ppm
and 4 ppm, respectively. Therefore, the acceptable noise level
on the FTS interferogram is less than 2.1x10^-5.
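The allowable noise level can be read off the error curves by interpolation. The following Python sketch illustrates the procedure with placeholder values; the error arrays below are hypothetical stand-ins for the curves plotted in Figure 8, not the measured results.

import numpy as np

noise_std = np.array([0.0, 2.5e-5, 5e-5, 7.5e-5, 1e-4, 2e-4, 5e-4, 1e-3])
ch4_err = np.array([0.0, 0.02, 0.05, 0.08, 0.11, 0.25, 0.70, 1.50])   # ppm, hypothetical
co2_err = np.array([0.0, 5.0, 11.0, 17.0, 23.0, 50.0, 140.0, 300.0])  # ppm, hypothetical

# Largest noise keeping the error within 0.02 ppm (CH4) and 4 ppm (CO2)
allow_ch4 = np.interp(0.02, ch4_err, noise_std)
allow_co2 = np.interp(4.0, co2_err, noise_std)
print(min(allow_ch4, allow_co2))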
IV. CONCLUSION
Sensitivity analysis of the Fourier Transformation
Spectrometer: FTS against observation noise on retrievals of
carbon dioxide and methane is conducted. Through
experiments with real observed data and additive noise, it is
found that the allowable noise on the FTS observation data is
less than 2.1x10^-5 if the estimation accuracy of the total
column carbon dioxide and methane is to be better than 1%
(the allowable retrieval errors for methane and carbon dioxide
are 0.02 ppm and 4 ppm, respectively; these correspond to a
1% error for both methane and carbon dioxide retrievals).
REFERENCES
[1] http://repository.tksc.jaxa.jp/dr/prc/japan/contents/AA0065136000/65136000.pdf?IS_STYLE=jpn (Accessed on September 14, 2012)
[2] Clough, S. A., et al. [2006], IEEE Trans. Geosci. Remote Sens., 44, 1308-1323.
[3] Hase, F., et al. [2004], J. Quant. Spectrosc. Radiat. Transfer, 87(1), 25-52.
[4] Rodgers, C. D. [2000], Inverse Methods for Atmospheric Sounding: Theory and Practice.
[5] Tikhonov, A. [1963], Dokl. Acad. Nauk SSSR, 151, 501-504.
[6] Wunch, D., et al. [2011], The Total Carbon Column Observing Network (TCCON), Phil. Trans. R. Soc. A, 369, 2087-2112.
AUTHORS PROFILE
Kohei Arai, He received BS, MS and PhD degrees in 1972, 1974 and 1982,
respectively. He was with The Institute for Industrial Science, and Technology
of the University of Tokyo from 1974 to 1978 also was with National Space
Development Agency of Japan (current JAXA) from 1979 to 1990. During
from 1985 to 1987, he was with Canada Centre for Remote Sensing as a Post
Doctoral Fellow of National Science and Engineering Research Council of
Canada. He was appointed professor at Department of Information Science,
Saga University in 1990. He was appointed councilor for the Aeronautics and
Space related to the Technology Committee of the Ministry of Science and
Technology during from 1998 to 2000. He was also appointed councilor of
Saga University from 2002 and 2003 followed by an executive councilor of
the Remote Sensing Society of Japan for 2003 to 2005. He is an adjunct
professor of University of Arizona, USA since 1998. He also was
appointed vice chairman of the Commission A of ICSU/COSPAR in 2008.
He wrote 30 books and published 332 journal papers.
Sensitivity Analysis for Water Vapor Profile
Estimation with Infrared: IR Sounder Data Based on
Inversion
Kohei Arai
1
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Abstract Sensitivity analysis for water vapor profile estimation
with Infrared: IR sounder data based on inversion is carried out.
Through a simulation study, it is found that the influence of the
ground surface relative humidity estimation error on the water
vapor vertical profile retrievals is greater than that of the sea
surface temperature estimation error.
Keywords- IR sounder; error budget analysis; MODTRAN; air
temperature; relative humidity.
I. INTRODUCTION
Air-temperature and water vapor profiles are typically
estimated with Infrared Sounder data [1]. One of the problems
in retrieving vertical profiles is the retrieval accuracy. In
particular, the estimation accuracy of air-temperature and water
vapor at tropopause altitude is not good enough, because there
are gradient changes of the air-temperature and water vapor
profiles in the tropopause, so that the observed radiance at the
specific channels does not change with altitude.
In order to estimate air-temperature and water vapor, a least
squares based method is typically used. In the process, the Root
Mean Square: RMS difference between the observed radiance and
the radiance calculated with the designated physical parameters
is minimized. Then the designated physical parameters,
including air-temperature and water vapor, at the minimum
RMS difference are the solutions.
Typically, the Newton-Raphson method, which gives one of
the local minima, is used for the minimization of the RMS
difference. Newton-Raphson needs the first and second order
derivatives, the Jacobian and Hessian, around the current
solution. It is not easy to formulate these derivatives
analytically. The proposed method is based on the
Levenberg-Marquardt: LM non-linear least squares method. It
uses numerically calculated first and second order derivatives
instead of analytically based derivatives. Namely, these
derivatives can be calculated with radiative transfer model based
radiance calculations. Around the current solution in the
solution space, directional derivatives are calculated with the
radiative transfer model.
1 http://en.wikipedia.org/wiki/Tropopause
2 http://en.wikipedia.org/wiki/Newton's_method
3 http://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm
The proposed method is validated for air-temperature and
water vapor profile retrievals with Infrared: IR sounder data
derived from AQUA/AIRS [2]-[7]. A comparison of retrieval
accuracy between the Newton-Raphson method and the proposed
method based on the LM method [8] is made in order to
demonstrate the effectiveness of the proposed method in terms
of estimation accuracy, in particular for the altitude of the
tropopause [9]. Global Data Assimilation System: GDAS
assimilation model derived 1 degree mesh data are used as
truth data for the air-temperature and water vapor profiles. The
experimental data show that the proposed method is superior
to the conventional Newton-Raphson method.
Atmospheric sounding can be improved by using the high
spectral resolution of sounders such as AIRS and IASI, instead of
the HIRS and TOVS used in the past three and a half decades
[10]-[13]. These sensors have a large number of channels and
carry a large amount of atmospheric sounding information in the
measurement data. But for the retrieval of the air temperature
profile, it is neither practical nor advantageous to use all
spectral points. Therefore it is important for this work to
eliminate those channels whose information does not add to the
final retrieval accuracy and, even beforehand for the sake of
efficiency, those channels potentially contaminated by solar
radiation or significantly affected by other gases (not required
for temperature profiling).
A sensitivity analysis of water vapor profile retrieval against
surface relative humidity and sea surface temperature is
carried out. A multi-channel retrieval method is assumed for the
water vapor profile estimation. Then the influences of surface
relative humidity and sea surface temperature on the water vapor
profile retrieval accuracy are investigated.
The following section describes the background of the
research as well as the assumptions for the sensitivity analysis,
followed by the detailed method for the sensitivity analysis.
Secondly, experiments are described with simulation results,
followed by the conclusion and some discussions.
4 http://en.wikipedia.org/wiki/Atmospheric_Infrared_Sounder
5 http://www.mmm.ucar.edu/mm5/mm5v3/data/gdas.html
II. RESEARCH BACKGROUND AND METHOD FOR
SENSITIVITY ANALYSIS
A. Infrared: IR Sounder
Sounder instruments on Earth observation satellites are for
ozone, water vapor and air temperature vertical profile
retrievals with very narrow bandwidth wavelength channels
which measure radiation from the Earth. There are many
absorption bands due to atmospheric molecules in the infrared
and microwave wavelength regions. Atmospheric optical depths
vary with altitude above sea level. Therefore, the vertical
profile of the density of the atmospheric molecules can be
estimated and, as a result, ozone, water vapor, and air
temperature profiles can be retrieved.
Table 1 shows the major purposes, absorption molecules, and
the most appropriate wavelength or frequency regions.
Table 1 shows major purposes, absorption molecules, and
the most appropriate wavelength or frequency regions.
TABLE I. MAJOR PURPOSES, ABSORPTION MOLECULES, AND THE
MOST APPROPRIATE WAVELENGTH OR FREQUENCY REGIONS FOR SOUNDER
Wavelength_Region  Wavelength_Frequency  Absorption_Molecule  Major_Purpose
Infrared           6.3um                 H2O                  Water_vapor
Infrared           9.6um                 O3                   Ozone
Infrared           15um                  CO2                  Air_Temperature
Microwave          22.235GHz             H2O                  Water_vapor
Microwave          60GHz                 O2                   Air_Temperature
B. Profile Retrieval Method
The IR sounder data derived radiance, R_0i, and the radiance,
R_i, estimated with an atmospheric model having air-temperature
and relative humidity as parameters, are compared through the
residual square error S,

S = Σ_{i=0}^{n} (R_i - R_0i)^2 (1)

Thus the geophysical parameters at which the difference of
radiance reaches the minimum are estimated. The widely used
and sufficiently accurate atmospheric software code MODTRAN is
used in the proposed method. The solution update equation of
the Newton-Raphson method is expressed by equation (2),

X_{n+1} = X_n - H^-1 J(X_n) (2)
where H denotes the Hessian matrix, which consists of the second
order derivatives (of the residual square error S, represented by
equation (1), with respect to the geophysical parameters,
air-temperature and relative humidity). Also, J denotes the
Jacobian, which consists of the first derivatives of S with
respect to the geophysical parameters, x.
6 http://en.wikipedia.org/wiki/MODTRAN
7 http://en.wikipedia.org/wiki/Hessian_matrix
8 http://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant
On the other hand, the solution update equation of the LM method
is expressed by equation (3),

X_{n+1} = X_n + (J^T J + λI)^-1 J^T (R_i - R_0i) (3)

where λ denotes the damping parameter.
The first derivative is represented by equation (4) while the
second order derivative is expressed by equation (5),
respectively,

∂S/∂x_i = -2 Σ_{k=1}^{n} (R_k - R_0k) ∂R_0k/∂x_i (4)

∂^2 S/∂x_i ∂x_j = 2 Σ_{k=1}^{n} [ (∂R_0k/∂x_i)(∂R_0k/∂x_j) - (R_k - R_0k) ∂^2 R_0k/∂x_i ∂x_j ] (5)

On the other hand, the first and second order derivatives of
R with respect to x are expressed by equations (6) and (7),
respectively: the first derivative ∂R_0k/∂x_i is evaluated
numerically with MODTRAN (6), while the second order derivative
is approximated as

∂^2 R_0k/∂x_i ∂x_j ≈ (∂R_0k/∂x_i)(∂R_0k/∂x_j) (7)

In the proposed method, these derivatives are calculated
numerically. A 2% change of relative humidity is taken into
account for the calculation of the derivatives of R and S, while
a 0.5 K change of air-temperature is also considered for the
calculation of the derivatives of R and S. Thus the geophysical
parameters at which the difference of radiance reaches the
minimum are estimated.
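The following Python sketch combines the LM update of equation (3) with the numerical derivatives just described: one update step with a forward-difference Jacobian. The function forward stands for an assumed radiative transfer wrapper (e.g. around MODTRAN); the parameter order and the damping value are also assumptions.

import numpy as np

def lm_step(x, r_obs, forward, lam=1e-2, steps=(0.5, 2.0)):
    # x: [air temperature (K), relative humidity (%)]
    # r_obs: observed radiances R_i; forward(x): calculated radiances R_0i
    # steps: the 0.5 K and 2 % perturbations used in the proposed method
    x = np.asarray(x, dtype=float)
    r0 = forward(x)
    J = np.empty((r0.size, x.size))
    for j, h in enumerate(steps):          # numerical Jacobian dR_0/dx
        dx = np.zeros_like(x)
        dx[j] = h
        J[:, j] = (forward(x + dx) - r0) / h
    lhs = J.T @ J + lam * np.eye(x.size)   # (J^T J + lambda I)
    return x + np.linalg.solve(lhs, J.T @ (r_obs - r0))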
C. Weighting Function
Based on the radiative transfer model, the upwelling radiance
from the atmosphere is greater than the ground surface radiance,
in accordance with the absorption molecule density, in the case
that the atmosphere contains plenty of absorption molecules.
The weighting function, which represents the contributions to
the upwelling radiance at all altitudes, can be determined by
the optical depth. Namely,

K(ν, z) = ∂tr(ν, z)/∂z, tr(ν, z) = e^{-τ(ν, z)}, τ(ν, z) = ∫_z^∞ k(ν) ρ(z') dz' (8)

where K denotes the weighting function, tr denotes transparency,
τ denotes optical depth, and p denotes atmospheric pressure while
z denotes altitude. Also, ρ denotes molecule density. For
instance, the weighting functions for the 6.3, 7.3, and 7.5 um
wavelength H2O absorption channels are shown in Figure 1.
Figure 1 Weighting function for 6.3, 7.3 and 7.5 um
D. Method for Sensitivity Analysis
Assuming MODTRAN is truth, the water vapor profile retrieval
error is evaluated by adding some errors to the surface air
temperature and relative humidity. Then it is possible to
clarify the relations between the water vapor profile retrieval
error and the errors on the surface air temperature and relative
humidity. In the process, the steepest descent method of
non-linear optimization for a single variable is used for the
determination of the optimum solution of the water vapor
profile. Equations (9) and (10) show the steepest descent
method,

S = (I_toa - I_A)^2 (9)

RH_{n+1} = RH_n + α (dI/dRH)_n (10)

where I_toa denotes the observed radiance at the top of the
atmosphere, I_A denotes the calculated radiance, and α denotes
the step size.
Thus the retrieval error can be evaluated.
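A minimal sketch of this steepest descent loop is shown below; the callable forward stands for an assumed MODTRAN-like forward model I(RH), and the step size, the derivative increment and the iteration count are assumptions.

def retrieve_rh(rh0, i_toa, forward, alpha=0.1, d_rh=2.0, n_iter=50):
    # rh0: initial relative humidity (%); i_toa: observed TOA radiance
    rh = rh0
    for _ in range(n_iter):
        i_now = forward(rh)
        di_drh = (forward(rh + d_rh) - i_now) / d_rh   # numerical dI/dRH
        # Move RH so that the squared residual (I_toa - I)^2 decreases
        rh = rh + alpha * (i_toa - i_now) * di_drh
    return rh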
III. EXPERIMENTS
A. Experimental Parameters
All the parameters required for the experiments are listed in
Table 2. The residual errors of the water vapor estimation for
surface temperature and relative humidity are shown in Figures 2
and 3, respectively. On the other hand, the retrieved water vapor
profile is shown in Figure 4. Errors of 5 and 10 (%) are added to
the default relative humidity in this case.
TABLE II. PARAMETERS FOR EXPERIMENTS
Atmospheric_Model              Mid.Latitude_Summer
Wavelength                     6.7, 7.3, 7.5 um
Default_Relative_Humidity      0%, 10%, 30%
Default_Air_Temperature        0, 1, 3 K
Altitude_at_Top_of_Atmosphere  100 km
Figure 2 Sensitivity of surface air temperature
Figure 3 Sensitivity of relative humidity
Figure 4 Estimated water vapor profile
B. Summarized Experimental Results
Through the comparison between Figures 2 and 3, it is found
that the influence due to the error on the relative humidity is
greater than that due to the surface air temperature. This is
reasonable, because relative humidity is more closely linked to
the water vapor profile than surface air temperature.
The estimated water vapor profile is also reasonable. In the
troposphere, humidity decreases smoothly. At around the
tropopause, there is a sharp drop of humidity. This is because
the troposphere and stratosphere interact only weakly with each
other. The estimated humidity profile shows that the 10% error
added to the default humidity becomes smaller or greater
depending on the altitude. It also shows the difficulty of water
vapor profile retrieval at around the tropopause.
IV. CONCLUSION
Sensitivity analysis for water vapor profile estimation with
Infrared: IR sounder data based on inversion is carried out.
Through a simulation study, it is found that the influence of
the ground surface relative humidity estimation error on the
water vapor vertical profile retrievals is greater than that of
the sea surface temperature estimation error.
It is found that the influence due to the error on the relative
humidity is greater than that due to the surface air temperature.
This is reasonable, because relative humidity is more closely
linked to the water vapor profile than surface air temperature.
ACKNOWLEDGMENT
The author would like to thank Mr. Wang King and Dr. Xing
Ming Liang for their efforts in the experiments.
REFERENCES
[1] Kohei Arai, Lecture Note on Remote Sensing, Morikita-Shuppan
publishing Co. Ltd, 2004.
[2] NASA/JPL, "AIRS Overview". NASA/JPL.
http://airs.jpl.nasa.gov/overview/overview/.
[3] NASA "Aqua and the A-Train". NASA.
http://www.nasa.gov/mission_pages/aqua/.
[4] NASA/GSFC "NASA Goddard Earth Sciences Data and Information
Services Center". NASA/GSFC.
http://disc.gsfc.nasa.gov/AIRS/data_products.shtml.
[5] NASA/JPL "How AIRS Works". NASA/JPL.
http://airs.jpl.nasa.gov/technology/how_AIRS_works.
[6] NASA/JPL "NASA/NOAA Announce Major Weather Forecasting
Advancement". NASA/JPL.
http://jpl.nasa.gov/news/news.cfm?release=2005-137.
[7] NASA/JPL "New NASA AIRS Data to Aid Weather, Climate
Research". NASA/JPL.
http://www.jpl.nasa.gov/news/features.cfm?feature=1424.
[8] Kohei Arai and Naohisa Nakamizo, Water vapor and air-temperature
profile estimation with AIRS data based on Levenberg-Marquardt,
Abstract of the 50th COSPAR (Committee on Space Research/ICSU)
Congress, A 3.1-0086-08,995, Montreal, Canada, July, 2008.
[9] Kohei Arai and XingMing Liang, Sensitivity analysis for air temperature
profile estimation method around the tropopause using simulated
AQUA/AIRS data, Advances in Space Research, 43, 3, 845-851, 2009.
[10] Jeffrey, A.L., Elisabeth, W., and Gottfried, K., Temperature and
humidity retrieval from simulated Infrared Atmospheric Sounding
Interferometer(IASI) measurements, J. Geop. Res., 107, 1-11, 2002.
[11] Rodgers, C.D., Information content and optimization of high spectral
resolution measurements. In Optical Spectroscopic Techniques and
Instrumentation for Atmospheric and Space Research , vol. 2830, pp.
136-147, Int. Soc. For Optical Eng., Bellingham, Wash.,1996.
[12] Rodgers, C.D., Inverse method for atmospheres: theory and practice,
World Sci., Singapore, 2000.
[13] Liou, K.N., An introduction to atmospheric radiation. Elsevier Science,
USA, 2002.
AUTHORS PROFILE
Kohei Arai, He received BS, MS and PhD degrees in 1972, 1974 and 1982,
respectively. He was with The Institute for Industrial Science and Technology
of the University of Tokyo from April 1974 to December 1978 also was with
National Space Development Agency of Japan from January, 1979 to March,
1990. During from 1985 to 1987, he was with Canada Centre for Remote
Sensing as a Post Doctoral Fellow of National Science and Engineering
Research Council of Canada. He moved to Saga University as a Professor in
Department of Information Science on April 1990. He was a councilor for the
Aeronautics and Space related to the Technology Committee of the Ministry
of Science and Technology during from 1998 to 2000. He was a councilor of
Saga University for 2002 and 2003. He also was an executive councilor for
the Remote Sensing Society of Japan for 2003 to 2005. He is an Adjunct
Professor of University of Arizona, USA since 1998. He also is Vice
Chairman of the Commission A of ICSU/COSPAR since 2008. He wrote
30 books and published 322 journal papers.
Wavelet Based Change Detection for Four
Dimensional Assimilation Data in Space and Time
Domains
Kohei Arai
1
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Abstract A method for time change detection of four dimensional
assimilation data by means of wavelet analysis is proposed,
together with spatial change detection with satellite imagery data.
The method is validated with assimilation data and satellite
imagery data of the same scale. Experimental results show that the
proposed method works well visually.
Keywords- Assimilation; change detection; wavelet analysis.
I. INTRODUCTION
Data assimilation is useful for the numerical prediction of
global issues including global warming, weather forecasting,
and climate change [1]-[3]. In particular, four dimensional
assimilation, i.e., three dimensional space plus the time
dimension, is useful. There are some problems with the current
four dimensional assimilation, as follows:
(1) Prediction accuracy is not good enough,
(2) Precision of input data is not good enough,
(3) Solutions are not so stable,
(4) Boundary conditions cannot be given properly,
(5) Change detection performance is not good enough.
In order to overcome the aforementioned problems (3) and
(5), wavelet analysis is introduced in this paper. Namely,
wavelet analysis based prediction is attempted for getting
stable solutions. Also, wavelet analysis based space and time
change detections are attempted as well.
One example is shown in this paper. Time series of three
dimensional air temperature and relative humidity are created
with an assimilation model using earth observation satellite
data derived profiles. These are called state variables. Space
and time changes in the state variables are detected in terms of
the change locations and change amounts. Then the extremely hot
summer in Japan in 2010 is predicted. It was mainly caused by
the winding of the jet stream in the northern hemisphere and the
location and magnitude of the high pressure system situated in
the Pacific Ocean area. Such a result can be obtained with the
proposed method.
The following section describes the research background
together with the proposed wavelet based method, followed by the
experiments conducted. Then a conclusion with some discussions
follows.
II. PROPOSED METHOD
A. Four Dimensional Assimilation
A three dimensional box type of space is assumed for each
district in the assimilation. In this space, the conserved
variables, mass, energy and angular momentum, are maintained and
balanced. The time change of such a conserved variable X,
∂X/∂t, equals the sum of the variables supplied from the top
(Ftop), from the surface (Fsfc), and from the horizontal
directions (Fside), as shown in Figure 1.
Figure 1 Illustrative view of the assimilation model
The space is divided into meshes in the vertical and horizontal
directions as shown in Figure 2. In the meshes, the variables are
maintained and balanced. For instance, the input energy to a mesh
is exactly equal to the output energy from the mesh. Thus partial
differential equations can be formulated. Because the partial
derivatives in space determine the partial derivatives in time,
prediction can be done. Therefore, assimilation can be done.
This is called the assimilation model.
Figure 2 Assimilation model with meshed space and time function of variables.
The variables, the input data for the assimilation model, are
obtained from earth observation satellite data, in particular
infrared sounder as well as microwave sounder data. Tables 1
and 2 show examples of the input data (variables) for the
assimilation model derived from the specific sensor data.
Table 1 Example of the variables, input data for assimilation model and data
sources mainly from the satellite data.
Table 2 Example of sensor name and satellite name of the data sources for
assimilation model.
The variables form a time series of spatially aligned three
dimensional geophysical data, as shown in Figure 3. Layered two
dimensional geophysical data, atmospheric pressure, air
temperature, relative humidity, etc., are aligned along the time
direction. Using the assimilation model with satellite data
derived geophysical data as input data, such four dimensional
geophysical data are created.
Figure 3 Four dimensional assimilation data
B. Wavelet Analysis Based Problem Solving Method and
Change Detections in Space and Time Domains
One dimensional wavelet transformation can be expressed
as in equation (1),

F = W f (1)

where f denotes geophysical data in the space or time domain
while W denotes the wavelet transformation matrix. Since
[G_ijk]^t = G_kji, the three dimensional wavelet transformation
can be written as follows,

F = [W_k [W_j [W_i G_ijk]^t]^t]^t (2)

Not only the three dimensional wavelet transformation, but
also the n dimensional wavelet transformation can be expressed
as follows,

F = [W_n [W_{n-1} [ ... [W_1 G_{12...n}]^t]^t ... ]^t (3)
This wavelet transformation is called decomposition.
One of the specific features of the wavelet transformation
is that the original geophysical data G can be reconstructed
perfectly from the calculated wavelet frequency components F.
This reconstruction process is called the inverse wavelet
transformation.
Also, wavelet analysis is useful for solving integral
equations as well as partial differential equations [ ].
Therefore, the wavelet analysis based method can be used for
solving the partial differential equations in the assimilation
model.
Furthermore, wavelet analysis based Multi Resolution
Analysis: MRA is used for change detection in the space and
time domains. Using MRA, the wavelet frequency components
are calculated. If the reconstruction is applied to the frequency
components without the low frequency component, then the spatial
and time changes are extracted, because the reconstruction is
made with the high frequency components only.
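A short sketch of this reconstruction-without-low-frequency idea, using the PyWavelets (pywt) package, is given below; treating the data as a NumPy array and using the Haar basis are assumptions.

import numpy as np
import pywt

def changes_without_lowpass(data, wavelet="haar"):
    # Decompose (n-dimensional), zero the lowest-frequency
    # (approximation) component, and reconstruct: only the
    # high-frequency changes remain.
    coeffs = pywt.wavedecn(data, wavelet)
    coeffs[0] = np.zeros_like(coeffs[0])   # drop the LL...L component
    return pywt.waverecn(coeffs, wavelet)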
C. Proposed MRA Based Change Detection Method
Change detection performance can be evaluated as follows,

J1 = sqrt( (1/N) Σ_{i=1}^{N} (G_i - G'_i)^2 ) (4)

where J1 denotes the cost function which represents the change
detection performance, G is the original image, G' is the
reconstructed image, and N is the number of pixels. Namely, the
Root Mean Square: RMS difference between the original and the
reconstructed images is a good measure for the evaluation of
change detection performance. The procedure of the proposed
method is as follows (a minimal sketch of this procedure in
Python follows the list):
(1) The wavelet transformation is applied to the original image,
(2) The wavelet frequency components (coefficients) are sorted
in accordance with the coefficient values,
(3) The first n coefficients from the minimum are removed,
(4) The image is reconstructed with the rest of the coefficients,
(5) The optimum n at which the change of the RMS difference
saturates is determined.
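The sketch below implements steps (1)-(5) with PyWavelets; the coefficient handling via coeffs_to_array and the saturation test are implementation assumptions.

import numpy as np
import pywt

def rms(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

def reconstruct_without_n(image, n, wavelet="haar"):
    # Steps (1)-(4): transform, sort coefficients by magnitude,
    # zero the n smallest, reconstruct from the remainder
    coeffs = pywt.wavedecn(image, wavelet)
    arr, slices = pywt.coeffs_to_array(coeffs)
    flat = arr.ravel()
    flat[np.argsort(np.abs(flat))[:n]] = 0.0
    rec = pywt.waverecn(
        pywt.array_to_coeffs(arr, slices, output_format="wavedecn"), wavelet)
    return rec[tuple(slice(s) for s in image.shape)]

def optimum_n(image, candidates, tol=1e-3):
    # Step (5): pick the n at which the RMS difference saturates
    errs = [rms(image, reconstruct_without_n(image, n)) for n in candidates]
    for n, e0, e1 in zip(candidates[1:], errs, errs[1:]):
        if abs(e1 - e0) < tol:
            return n
    return candidates[-1]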
III. EXPERIMENTS
A. Change Detection Using Subtraction
One example of change detection in the space and time domains
is shown: cloud movement analysis with Geostationary
Meteorological Satellite: GMS imagery data. Figure 4 shows (a)
the original time series of GMS imagery data, (b) the binarized
images, and (c) the changes detected in the space and time
domains by using subtraction between adjacent binarized images.
Change detection and object movement analysis can also be done
with an optical flow model. Thus the cloud movement can be
analyzed. It is sensitive to the thresholding processes; namely,
the detected changes depend on the binarization results.
B. Change Detection of Air Temperature and Relative
Humidity Profiles in Space and Time Domains Using the
Aforementioned Proposed MRA Based Method
One example of change detection based on the proposed
wavelet MRA based method is shown. Figure 5 shows the
monthly averages of the relative humidity profiles acquired in
February and August 1992. The profile is formed with 8 layers
in the altitude direction, z, and with 1 degree resolution in
the horizontal directions, x and y. Also, the relative humidity
ranges from 0.005 to 20 g/Kg.
Change detection in the time domain can be done with the
reconstruction using all wavelet frequency components except
the LLLL component, where LLLL denotes the low frequency
components of the x, y, z, and time directions. The resultant
image shows the detected changes in the space and time domains,
as shown in Figure 5 (c).
(a)Original (b)Binalized (c)Changes
Figure 4 Example of change detection in space and time domains
However, the resultant image does not seem to show good
performance in terms of the detected changes. Then
reconstruction with all wavelet frequency components except the
LLLL component, together with the first n wavelet coefficients
removed, is tried. In this connection, 100% - α% of the wavelet
coefficients are removed; α corresponds to the data compression
ratio.
(a)Relative humidity in August 1992
(b)Relative humidity in February 1992
(c)Detected changes with all frequency components except LLLL component
Figure 5 Detected changes by means of the MRA based method, reconstruction
of original image without LLLL component
C. Evaluation of Data Compression Ratio
As mentioned in the previous section, change detection
performance is highly correlated with the RMS difference
between the original and the reconstructed images. Therefore,
the change detection performance of the proposed method is
evaluated by calculating the RMS difference as a function of
the data compression ratio.
An almost homogeneous ocean area is selected for the
evaluation. In the Pacific Ocean area shown in Figure 6, the
relative humidity is relatively homogeneous. The relation
between the RMS difference and the data compression ratio is
shown in Figure 7. Although the RMS error for a data
compression ratio of 100% (which means no data compression is
applied) is obviously zero, it grows as the data compression
ratio decreases. For instance, the RMS error for a data
compression ratio of 10% is 0.4. This implies that an RMS
error of 0.4 has to be accepted for 1/10 data compression.
Figure 6 Intensive study area for RMS difference evaluations
Figure 7 Relation between RMS difference and data compression ratio
IV. CONCLUSION
A method for time change detection of four dimensional
assimilation data by means of wavelet analysis is proposed,
together with spatial change detection with satellite imagery
data. The method is validated with assimilation data and
satellite imagery data of the same scale. Experimental results
show that the proposed method works well visually. Also, the
relation between the RMS error and the data compression ratio
is clarified.
ACKNOWLEDGMENT
The author would like to thank Dr. Kaname Seto for his
effort to conduct experiments.
REFERENCES
[1] R. Daley, Atmospheric data analysis, Cambridge University Press, 1991.
[2] Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc (1997), Unified Notation
for Data Assimilation: Operational, Sequential and Variational, Journal
of the Meteorological Society of Japan, vol. 75, No. 1B, pp. 181-189.
[3] John M. Lewis, S. Lakshmivarahan, Sudarshan Dhall, "Dynamic Data
Assimilation: A Least Squares Approach", Encyclopedia of Mathematics
and its Applications 104, Cambridge University Press, 2006.
AUTHORS PROFILE
Kohei Arai, He received BS, MS and PhD degrees in 1972, 1974 and 1982,
respectively. He was with The Institute for Industrial Science, and Technology
of the University of Tokyo from 1974 to 1978 also was with National Space
Development Agency of Japan (current JAXA) from 1979 to 1990. During
from 1985 to 1987, he was with Canada Centre for Remote Sensing as a Post
Doctoral Fellow of National Science and Engineering Research Council of
Canada. He was appointed professor at Department of Information Science,
Saga University in 1990. He was appointed councilor for the Aeronautics and
Space related to the Technology Committee of the Ministry of Science and
Technology during from 1998 to 2000. He was also appointed councilor of
Saga University from 2002 and 2003 followed by an executive councilor of
the Remote Sensing Society of Japan for 2003 to 2005. He is an adjunct
professor of University of Arizona, USA since 1998. He also was appointed
vice chairman of the Commission A of ICSU/COSPAR in 2008. He wrote
30 books and published 332 journal papers.
Method for Image Source Separation by Means of
Independent Component Analysis: ICA, Maximum
Entropy Method: MEM, and Wavelet Based Method:
WBM
Kohei Arai
1
Graduate School of Science and Engineering
Saga University
Saga City, Japan
Abstract A method for image source separation based on
Independent Component Analysis: ICA, Maximum Entropy
Method: MEM, and Wavelet Based Method: WBM is proposed.
Experimental results show that image separation can be done
from combined different images by using the proposed method
with an acceptable residual error.
Keywords- Blind separation; image separation; cocktail party
effect; ICA; MEM; wavelet analysis.
I. INTRODUCTION
There are some technologies and systems which allow
separating a specific source from mixed data acquired in noisy
circumstances. Related application studies have been conducted,
in particular, for TV meeting environments, recognition systems,
digital hearing aid systems, etc. In particular, the microphone
array system as well as the Independent Component Analysis (ICA)
based approach [1] is focused on. A microphone array allows
enhancing a target source from the mixed sources, suppressing
noise and taking into account the phase differences among the
sources, which correspond to the distances between the
microphones and the locations of the sources. There are
delay-sum [2] and adaptation [3] types of array microphone
systems. These types of array microphones allow directing the
beam towards the desired direction of the target of interest.
ICA is a method which allows blind separation based on the
assumption that the sources are mutually isolated. The ICA
based method configures a reconstruction filter maximizing the
Kullback-Leibler divergence [4] for the separation of the target
sources in concern from the acquired data [5], [6].
A blind separation method with the entropy maximization
rule has been proposed [7]. In order to improve the separability
among the possible sources, the high frequency component
derived from the wavelet based Multi Resolution Analysis (MRA)
[8], [9] is used in the entropy maximization rule utilized
method. Due to the fact that the MRA based separability
improvement is not good enough, further improvement is
required.
The blind separation method proposed here is based on the
MRA based separability improvement. A single level of MRA is
not good enough for the characteristic enhancement of each
source. Therefore, the level is considered as a parameter for
the MRA utilized blind separation [10]. An appropriate level is
found for the improvement of separability, and then blind
separation is applied. It is found that the proposed method can
achieve a 4 to 8.8% separability improvement for the cases where
the number of speakers is 2, 4 and 8 [11].
The following section describes the proposed method
followed by some experiments with the mixed images. Then
conclusions and some discussions follow.
II. PROPOSED METHOD
A. Image Mixing Model
Assume the original image x_i(t) is a mixture of several
images, for instance s_1(t) and s_2(t), as shown in equation (1),

x_i(t) = a_i1 s_1(t) + a_i2 s_2(t) (1)

where a_i1 and a_i2 are weighting factors, the mixing ratio. It
is also assumed that s_1(t) and s_2(t) are mutually independent.
Although this is an example of the image mixing model for just
two images, it is possible to expand the number of images
which are to be mixed together.
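A two-image instance of equation (1) can be written in a few lines of Python; the image names and ratios below are only illustrative (they mirror the Lena/Barbara mixtures used later in the experiments).

import numpy as np

def mix(s1, s2, ratio):
    # Equation (1) with a_i1 = ratio and a_i2 = 1 - ratio
    return ratio * np.asarray(s1, float) + (1.0 - ratio) * np.asarray(s2, float)

# e.g., x1 = mix(lena, barbara, 0.9)   # 90 % Lena + 10 % Barbara
#       x2 = mix(lena, barbara, 0.1)   # 10 % Lena + 90 % Barbara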
B. Maximum Entropy Method
The Maximum Entropy Method: MEM is used in the learning
process for maximizing the combined entropy of a two-layered
neural network,

y_i = g( Σ_{k=1}^{2} w_ik x_k - θ_i ) (2)

g(v) = (1 - e^{-v}) / (1 + e^{-v}) (3)

where x and y denote the input and output signals or images,
while w denotes the weighting coefficients of the two-layered
neural network and θ denotes a threshold. g(v) is called the
sigmoid function.
The combined entropy can be expressed with equation (4).
H(y) = ln(J) - ln(p(x)) (4)

where J denotes the Jacobian shown in equation (5),

J = det( [ ∂y_1/∂x_1, ∂y_1/∂x_2 ; ∂y_2/∂x_1, ∂y_2/∂x_2 ] ) (5)

Because the second term of equation (4) is constant, the
following equation (6) has to be maximized,

I = ln(J) (6)
C. Steepest Descending Method
In order to maximize equation (6), the Steepest Descending
Method: SDM is used. The updating equations for w and θ can be
expressed with equations (7) and (8),

w_ik^new = w_ik^old + η ∂I/∂w_ik (7)

θ_i^new = θ_i^old + η ∂I/∂θ_i (8)

More precisely, the updating equations can be re-written as
equations (9) and (10),

W^new = W^old + η( [W^T]^-1 - 2 y x^T ) (9)

θ^new = θ^old - 2η y (10)
Thus the original images are separated from the combined
image with the proposed method.
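The following Python sketch implements the update rules (9) and (10) for the two-source case; the learning rate, the iteration count, and the batch averaging over all pixels are assumptions not fixed by the text above.

import numpy as np

def g(v):
    # Sigmoid of equation (3)
    return (1.0 - np.exp(-v)) / (1.0 + np.exp(-v))

def separate(X, eta=1e-3, n_iter=200):
    # X: mixed observations, shape (2, n_pixels); each row a flattened image
    W = np.eye(2)
    theta = np.zeros((2, 1))
    n = X.shape[1]
    for _ in range(n_iter):
        Y = g(W @ X - theta)
        # Equation (9): W_new = W_old + eta([W^T]^-1 - 2 y x^T)
        W = W + eta * (np.linalg.inv(W.T) - 2.0 * (Y @ X.T) / n)
        # Equation (10): theta_new = theta_old - 2 eta y
        theta = theta - 2.0 * eta * Y.mean(axis=1, keepdims=True)
    return W @ X - theta   # separated source estimates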
D. Discrete Wavelet Transformation
In order to separate the original images from the given
combined images, the Discrete Wavelet Transformation: DWT is
used. DWT allows decomposing images into high and low wavelet
frequency components orthogonally. There are many orthogonal
basis functions for DWT. One of these basis functions is the
Haar function. Figure 1 shows an example of the original image
of Lena in the SIDBA standard image database, and the
decomposed image through DWT. In the decomposed image, the LL
image, the low frequency component in the horizontal direction
and low frequency component in the vertical direction, is
situated at the top left corner, while the LH, HL, and HH
components are situated at the top right, the bottom left, and
the bottom right corners, respectively.
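A single-level Haar decomposition of this kind can be obtained with PyWavelets as sketched below; the mapping of the returned detail arrays onto the LH/HL/HH naming follows the layout described above and may differ between conventions.

import pywt

def decompose(image):
    # Single-level 2-D Haar DWT: approximation (LL) plus three
    # detail sub-bands (horizontal, vertical, diagonal)
    LL, (LH, HL, HH) = pywt.dwt2(image, "haar")
    return LL, LH, HL, HH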
The histogram of the low frequency component is shown in
Figure 2 (a) while that of the high frequency component is
shown in Figure 2 (b). These histograms are for the decomposed
Lena image. Histogram (a) is for the LL component while
histogram (b) is for the HH component. The histogram of the
high frequency component looks similar to a normal distribution
and is mainly concentrated in the wavelet frequency component
range from 0 to 10. On the other hand, the histogram of the LL
component does not look like a normal distribution.
(a)Original image
(b)Decomposed image
Figure 1 Example of original and decomposed images with DWT.
These histogram characteristics are specific to each original
image and differ from each other depending on the nature of the
original images. Therefore, it is possible to separate the
original images using the differences between the histograms of
the decomposed images derived through DWT.
(a)Low Frequency Component
(b)High Frequency Component
Figure 2 Histograms of high and low frequency components of the
decomposed image after DWT
III. EXPERIMENTS
A. Original Images Used
Figure 3 shows the two original images, Lena and Barbara, from
the standard image database SIDBA.
(a)Lena
(b)Barbara
Figure 3 Two original images used for experiments
Examples of the mixed images are shown in Figure 4. The
mixing ratios of Figure 4 (a) and (b) are different. The mixing
ratio of Figure 4 (a) is 90% of the Lena image and 10% of the
Barbara image, while that of Figure 4 (b) is 10% of the Lena
image and 90% of the Barbara image. As shown in Figure 4, image
defects are found in the mixed (combined) images.
Figure 5 shows the DWT applied images of the combined
images of Figure 4.
(a)Combined image with 90% of Lena and 10% of Barbara
(b)Combined image with 90% of Barbara and 10% of Lena
Figure 4 Combined images with the different mixing ratios
(a)DWT image of Figure 3 (a) of combined image
(b)DWT image of Figure 3 (b) of combined image
Figure 5 DWT images of Figure 4.
B. Separated Images with the Different Mixing Ratios
Mixed images of the Lena and Barbara images with different
mixing ratios, 50%, 30%, and 10%, are created. Using the
proposed method, separation of each original image is
attempted. The resultant images are shown in Figures 6, 7, and 8
for the mixing ratios of 50%, 30%, and 10%, respectively. As
shown in these figures, the image separation performance
depends on the mixing ratio. It is difficult to separate the
images when the mixing ratio is around 50%. Meanwhile, the
image separation performance gets better as the mixing ratio
decreases. In particular, the separated images for the mixing
ratio of 10% are almost perfect and look extremely similar to
the original images.
(a)
(b)
Figure 6 Separated images in the case of which mixing ratio is 50%.
(a)
(b)
Figure 7 Separated images in the case of which mixing ratio is 30%.
(a)
(b)
Figure 8 Separated images in the case of which mixing ratio is 10%.
IV. CONCLUSION
A method for image source separation based on Independent
Component Analysis: ICA, Maximum Entropy Method: MEM,
and Wavelet Based Method: WBM is proposed. Experimental
results show that image separation can be done from combined
different images by using the proposed method with an
acceptable residual error.
It is found that the image separation performance depends
on the mixing ratio. It is difficult to separate the images when
the mixing ratio is around 50%. Meanwhile, the image separation
performance gets better as the mixing ratio decreases. In
particular, the separated images for the mixing ratio of 10% are
almost perfect and look extremely similar to the original
images.
ACKNOWLEDGMENT
The author would like to thank Takeshi Yoshida for his
effort to the experiments.
REFERENCES
[1] A. Hyvärinen, J. Karhunen, E. Oja, Independent Component Analysis,
John Wiley and Sons, 2001.
[2] H. Nomura, Y. Kaneda, N. Kojima, Near Field Types of Microphone
Array, Journal of the Acoustical Society of Japan, 53, 2, 110-116, 1997.
[3] Y. Kaneda, Adaptive Microphone Array, Journal of the Institute of
Electronics, Information and Communication Engineers, J71-B-II, 11,
742-748, 1992.
[4] M. Takagi and H. Shimoda Edt. K.Arai et al., Image Analysis Handbook,
Tokyo-Daigaku-Shuppankai Pub. Co. Ltd., 1991.
[5] C. Jutten, J. Herault, Blind separation of sources, Part I: An adaptive
algorithm based on neuromimetic architecture, Signal Processing, 24, 1-10, 1991.
[6] S. Amari, N. Murata, Independent Component Analysis - A New Method for
Multi-Variate Data Analysis, Science Pub. Co. Ltd., 2002.
[7] S. Morishita, S. Miyano Edt., K. Nijima, Extraction of hidden image
signals by means of blind separation and wavelets, Kyoritsu Shuppan
Pub. Co. Ltd., 201-206, 2000.
[8] K.Arai, Fundamental Theory on Wavelet Analysis, Morikita Shuppan
Pub. Co. Ltd., 2000.
[9] K.Arai, Self Learning on Wavelet Analysis, Kindai-Kagakusha Shuppan
Co. Ltd., 2003.
[10] K. Arai, T. Yoshida, Speaker separation based on blind separation
method with wavelet transformations, Journal of the Visualization
Society of Japan, 26, Suppl.1, 171-174, 2006.
[11] K.Arai, Blind separation, IJRRCS
AUTHORS PROFILE
Kohei Arai, He received BS, MS and PhD degrees in 1972, 1974 and 1982,
respectively. He was with The Institute for Industrial Science and Technology
of the University of Tokyo from April 1974 to December 1978 also was with
National Space Development Agency of Japan from January, 1979 to March,
1990. During from 1985 to 1987, he was with Canada Centre for Remote
Sensing as a Post Doctoral Fellow of National Science and Engineering
Research Council of Canada.
He moved to Saga University as a Professor in Department of Information
Science on April 1990. He was a councilor for the Aeronautics and Space
related to the Technology Committee of the Ministry of Science and
Technology during from 1998 to 2000. He was a councilor of Saga University
for 2002 and 2003. He also was an executive councilor for the Remote
Sensing Society of Japan for 2003 to 2005. He is an Adjunct Professor of
University of Arizona, USA since 1998. He also is Vice Chairman of the
Commission A of ICSU/COSPAR since 2008. He wrote 30 books and
published 322 journal papers.
Multifinger Feature Level Fusion Based Fingerprint
Identification
Praveen N
Research Fellow, Audio and Image Research Laboratory,
Department of Electronics, Cochin University of Science and
Technology, Cochin, Kerala, India
Tessamma Thomas
Professor, Department of Electronics, Cochin University of
Science and Technology, Cochin, Kerala, India
Abstract Fingerprint based authentication systems are one of
the cost-effective biometric authentication techniques employed
for personal identification. As the database population increases,
fast identification/recognition algorithms with high accuracy are
required. Accuracy can be increased using multimodal evidence
collected from multiple biometric traits. In this work, consecutive
fingerprint images are taken, global singularities are located
using directional field strength and their local orientation vector
is formulated with respect to the base line of the finger. Feature
level fusion is carried out and a 32 element feature template is
obtained. A matching score is formulated for the identification
and 100% accuracy was obtained for a database of 300 persons.
The polygonal feature vector helps to reduce the size of the
feature database from the present 70-100 minutiae features to
just 32 features, and also a lower matching threshold can be fixed
compared to single finger based identification.
Keywords- fingerprint; multimodal biometrics; gradient;
orientation field; singularity; matching score.
I. INTRODUCTION
Personal authentication based on biometric traits is the
most common in the current security access technologies. As
the criminal/fraudulent activities are increasing enormously,
designing high security identification has always been the
main goal in the security business. Biometrics deals with
identification of people by their physical and/or behavioral
characteristics and, so, inherently requires that the person to be
identified is physically present at the point of identification.
Fingerprints offer an infallible means of personal
identification [1].The large numbers of fingerprint images,
which are collected for criminal identification or in business
for security purpose, continuously increase the importance of
automatic fingerprint identification systems. Most of the
automatic fingerprint identification systems can reach around
97% accuracy with a small database and the accuracy of
identification is drops down as the size of database is growing
up [2, 3]. Also, the processing speed of automatic
identification systems decreases if it involves a large number
of detection features. Hence feature code template size has to
be minimized so that identification may be much easier. A
fingerprint is characterized by singularities- which are small
regions where ridge lines forms the distinctive shapes: loop,
delta or whorl (Fig.1). Singularities play a key role in
classification of fingerprints [1, 5] which sets fingerprints into
a specific set.
Fig.1. Fingerprint Singularities
Classification eases the searching and in many of the AFIS,
classification is the primary procedure adopted [6].
According to Galton-Henry classification scheme [1, 4]
there are four common classes of fingerprints: Arch and
Tented Arch, Left loop, Right Loop and Whorl (Fig. 2).
Cappelli and Maltoni, 2009 [7] studied the spatial distribution
of fingerprint singularities and proposed a statistical model for
the four most common classes: Arch, Left loop, Right loop
and Whorl. The model they proposed gives a clear indication
of the fingerprint identity and is used here for identification
with a sharp reference position.
Biometric systems that use a single modality are usually
affected by problems like noisy sensor data, non-universality
and/or lack of distinctiveness of the biometric trait,
unacceptable error rates, and spoof attacks [8]. Multibiometric
system deals with two or more evidences that are taken from
different sources like multiple fingers of the same person,
Fig.2. Fingerprint Types
Fig.3. Information fusion
multiple samples of the same instances, multiple sensors for
the same biometric, multiple algorithms for representation and
matching of multiple traits [9]. Information fusion refers to the
consolidating of information or evidences presented by
multiple biometric sources [10, 11, 12].
II. FUSION IN BIOMETRICS
Hall and Llinas[13], Ross and Jain[8] have divided
information fusion into several categories: sensor level
fusion, feature level fusion, score level fusion and decision
level fusion. Based on this Sanderson and Paliwal[14] have
classified information fusion into pre-mapping fusion, midst-
mapping fusion and post-mapping fusion [Fig. 3]. Pre-
mapping fusion refers to combining information before any
use of classifiers or experts. In midst-mapping fusion,
information is combined during the mapping from
sensor data/feature space into opinion/decision space, and in
post-mapping fusion, information is combined after the
decisions of the classifiers have been obtained. Match score
fusion based multibiometric algorithm have been developed
by Ross and Jain, 2003[8], Frischholz and Dieckmann,
2000[15], Hong and Jain, 1998[16], Biguin et al., 1997[17],
Wang et al., 2003[18], Kumar and Zhang, 2003[19]. Fusion at
the match score, rank and decision level have been developed
and studied extensively. Feature level fusion, however, is a
relatively understudied problem [20].
In the present work, feature level fusion of the feature vectors
is done by concatenating the individual feature vectors of
consecutive fingers. The fingerprint baseline, which is defined
as the line between the Distal and Intermediate Phalangeal joint
lines, is taken as the reference line [Fig. 4]. This line is
detected using a correlation technique, singularities are
detected using the directional field strength, and a polygon is
formed with the
Fig.4. Fingerprint Baseline
singularities and the baseline. For each finger, the feature
vector is computed as the distance and angle parameters and the
ridge counts, which are concatenated to form the multifinger
feature vector. A matching score is formulated to identify the
fingerprint. FAR and FRR curves are plotted.
III. DEFINITION OF THE NOVEL FINGERPRINT STRUCTURE
AND FEATURE VECTOR FORMATION
In this work, fingerprint singularities are identified and a
polygon is formed with the baseline [21] [Fig.5]. The polygon
thus formed is invariant to rotation. The feature vector
describing the polygon is defined as F = (d, θ, A, T, r)^T,
where d is the distance metric, θ is the angle metric, A is
the area of the polygon, T is the type of the
fingerprint/polygon and r is the ridge counts. The feature
vector thus formed is a 16 element vector,

F = (d_cc, d_cb, ..., d_dl, θ_c, θ_dr, ..., θ_cc, A, T, r_cd, r_cb, r_db)^T

where d_cc, d_cb, ..., d_dl are the distance measures, θ_c,
θ_dr, ..., θ_cc are the convex angle metrics, A is the area of
the polygon formed, and r_cd, r_cb and r_db are the ridge counts
between core-delta, core-base and delta-base respectively. These
are shown in Fig.5.
The following steps are carried out for constructing the
fingerprint polygon:
A. Directional Field Estimation and Strength
Directional field shows the coarse structure or basic shape
of a fingerprint [22] which gives the global information about
a fingerprint image. It is defined as the local orientation of the
ridge-valley structures.
By computing directional field, singularities can be
efficiently located. Several methods have been adopted to
estimate directional field. [23], [24], [25]. M. Kass and
Witkin, [26] introduced the gradient based method and was
adopted by fingerprint researchers [27, 28, 29, 30]. This
method is used in this
work.
The gradient vector
| |
T
y x
y x G y x G ) , ( ) , (
is defined as:
(1)
Where I(x,y) represents the gray-scale image. The
directional field is perpendicular to the gradients. Gradients
are orientations at pixel-scale whereas directional field
describes orientation of ridge-valley structure.
An averaging operation is done on the gradients to obtain
the directional field. Gradients cannot be averaged in a local
neighborhood as opposite gradients will cancel each other. To
solve this problem, Kass and Witkin doubled the angle of the
gradient vectors before averaging. Doubling makes opposite
vectors point in the same
Fig.5. Fingerprint Polygon
direction and will reinforce each other while perpendicular
gradients will cancel each other. After averaging, the gradient
vectors have to be converted back to their single-angle
representation.
The gradient vectors are estimated first in the Cartesian
co-ordinate system and are given by [G_x, G_y]. For the purpose
of doubling the angle and squaring the length, the gradient
vector is converted to the polar system [ρ, φ]^T, where

[ρ, φ]^T = [ sqrt(G_x^2 + G_y^2), tan^-1(G_y/G_x) ]^T, -π/2 < φ ≤ π/2 (2)

The gradient vector is converted back to its Cartesian form as

[G_x, G_y]^T = [ ρ cos φ, ρ sin φ ]^T (3)
The average squared gradient [G_s,x, G_s,y]^T is given by

[G_s,x, G_s,y]^T = [ Σ_W (G_x^2 - G_y^2), Σ_W 2 G_x G_y ]^T = [ G_xx - G_yy, 2 G_xy ]^T (4)

where

G_xx = Σ_W G_x^2, G_yy = Σ_W G_y^2, G_xy = Σ_W G_x G_y (5)

are estimates for the variances and cross covariance of G_x and
G_y, averaged over the window W. The average gradient direction
φ is given by

φ = (1/2) Z(G_xx - G_yy, 2 G_xy) (6)
where $\angle(x,y)$ is defined as:

$$\angle(x,y) = \begin{cases} \tan^{-1}(y/x) & \text{for } x \ge 0 \\ \tan^{-1}(y/x) + \pi & \text{for } x < 0 \wedge y \ge 0 \\ \tan^{-1}(y/x) - \pi & \text{for } x < 0 \wedge y < 0 \end{cases}$$

Fig.6. Fingerprint and Directional Field
The directional field image obtained is shown in Fig. 6.
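As an illustration of Eqs. (1)-(6), the following Python sketch (our own, assuming NumPy and SciPy are available; it is not the implementation used in this work) estimates the directional field by averaging squared gradients over a window W:

import numpy as np
from scipy.ndimage import sobel, uniform_filter

def directional_field(image, window=15):
    # Eq. (1): pixel-wise gradients of the gray-scale image I(x, y)
    img = image.astype(np.float64)
    gx = sobel(img, axis=1)
    gy = sobel(img, axis=0)
    # Squared gradients: doubling the angle is implicit, since
    # (Gx + i*Gy)^2 = (Gx^2 - Gy^2) + i*(2*Gx*Gy)  -- Eqs. (4)-(5)
    gxx = uniform_filter(gx * gx, size=window)
    gyy = uniform_filter(gy * gy, size=window)
    gxy = uniform_filter(gx * gy, size=window)
    # Eq. (6): arctan2 plays the role of the angle operator; /2 undoes the doubling
    phi = 0.5 * np.arctan2(2.0 * gxy, gxx - gyy)
    # The directional field is perpendicular to the average gradient
    return phi + np.pi / 2.0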
B. Singularity Detection and Fingerprint Classification
Singularities are the vertices of the fingerprint polygon.
The most common method used for singularity detection is by means of the Poincaré index proposed by Kawagoe and Tojo, 1984 [31].
The Poincaré index is given by

$$P_{G,C}(i,j) = \frac{1}{2\pi}\sum_{k=0}^{7} \Delta_k \qquad (7)$$

where $G$ is the field associated with the fingerprint orientation image $D$, $C$ is the closed path defined as an ordered sequence, $d$ is the directional field of the individual blocks around the region of interest (Table 1), and $\Delta_k$ is the wrapped orientation difference between consecutive blocks $d_k$ and $d_{(k+1) \bmod 8}$.
TABLE 1. POINCARÉ INDEX COMPUTATION SCHEME
d2 d3 d4
d1 [i,j] d5
d0 d7 d6
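A minimal Python sketch of this computation scheme (our own rendering of the standard method; d holds the eight block orientations d0..d7 of Table 1, in radians):

import numpy as np

def poincare_index(d):
    # Sum the wrapped orientation differences along the closed path C
    total = 0.0
    for k in range(8):
        diff = d[(k + 1) % 8] - d[k]
        # Orientations live in [0, pi): wrap differences into (-pi/2, pi/2]
        if diff > np.pi / 2:
            diff -= np.pi
        elif diff <= -np.pi / 2:
            diff += np.pi
        total += diff
    # ~ +1/2 at a core, -1/2 at a delta, 0 in regular regions
    return total / (2 * np.pi)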
The Poincaré index method cannot accurately detect the
singular points for noisy or low quality fingerprints and for
singular points in arch fingerprints and some of the tented arch
fingerprints [32]. Coherence, which gives the strength of the
orientation, measures how well all squared gradient vectors
share the same orientation [26]. In this work, singularities are
located using the coherence computed from the squared gradients.
Fig.7. Fingerprints and Coherence Images
The coherence of the squared gradients is given by:

$$Coh = \frac{\left\| \sum_W (G_{s,x}, G_{s,y}) \right\|}{\sum_W \left\| (G_{s,x}, G_{s,y}) \right\|} = \frac{\sqrt{(G_{xx} - G_{yy})^2 + 4\,G_{xy}^2}}{G_{xx} + G_{yy}} \qquad (8)$$
Fig.7 shows the coherence image formed with singularities
clearly shown as dark spots. Depending on the relative
position of the singularities, fingerprints are classified into seven types, namely: Left Loop, Right Loop, Whorl without any Delta, Whorl with one Delta, Whorl with two Deltas, Arch and Tented Arch.
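A short Python sketch of Eq. (8) (our own, reusing the averaged terms gxx, gyy, gxy from the directional-field sketch above); it uses the closed form of the coherence in terms of those sums:

import numpy as np

def coherence(gxx, gyy, gxy, eps=1e-12):
    # Coh is close to 1 where all squared gradients share one orientation
    # and drops toward 0 near singular points (the dark spots of Fig. 7)
    num = np.sqrt((gxx - gyy) ** 2 + 4.0 * gxy ** 2)
    return num / (gxx + gyy + eps)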
C. Baseline detection and feature vector formation
The majority of fingerprint identification algorithms are based on minutiae and ridge features. In this work the baseline is considered as the reference line for the fingerprint singularities. This line has to be detected accurately to form the fingerprint polygon. The Hough Transform and its variants are the most popular line identification techniques used by image processing researchers [33], [34], [35]. Guru et al. [36] have proposed a PCA-based method for line detection. In all these cases the computational complexity is high. Moreover, fingerprints are themselves line patterns, so identification of the baseline using Hough transform methods requires additional intelligence. In this work the baseline is detected using a correlation method as per the following steps [21]:
1. Since baseline falls in the lower portion of the
fingerprint image, computation for line identification
needs to be done only in the lower portion of
fingerprint image and hence identification can be
done below the centroid of the segmented fingerprint
image.
2. Binary masks of sizes from 200 × 50 to 200 × 3 are defined. The 200 × 50 mask is to detect the most slanted baseline (about 23°
2. Compute the matching score M as a weighted comparison of the input and template feature vectors, where Th is the assigned threshold value, $w_n$ is the weight, $I_n$ and $F_n$ are the input-image and template metrics, and N is the feature-vector length.
3. If M ≥ Th, the fingerprint matches, where 0 ≤ Th ≤ 1.
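Since the score formula itself is garbled in the source, the sketch below shows only one plausible weighted, distance-based realization of steps 2-3 using the symbols defined above (Python; our own assumptions throughout):

import numpy as np

def matching_score(I, F, w):
    # I, F: input and template feature vectors of length N
    # w: per-feature weights summing to 1 (assumed form of the score)
    I, F, w = np.asarray(I), np.asarray(F), np.asarray(w)
    rel = np.abs(I - F) / (np.maximum(np.abs(I), np.abs(F)) + 1e-12)
    return float(np.sum(w * (1.0 - np.minimum(rel, 1.0))))

def matches(I, F, w, Th=0.715):
    # Step 3: accept when M >= Th, with 0 <= Th <= 1
    return matching_score(I, F, w) >= Th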
VI. IMPLEMENTATION OF THE ALGORITHM
A. Database used
In our work consecutive fingerprint images (forefinger and
middle finger) have been acquired using a fingerprint scanner
with 500 dpi resolution and image size of 600 X 600 pixels.
Fingerprint samples of about 300 persons were collected and
features were extracted and stored.
B. Implementation
Individual fingerprints are segmented to detect the baseline
and to generate the feature vectors. Baseline is captured
clearly for individual fingerprints and the polygon is drawn for
whorl, left loop, right loop and arch classes of the input
images and the features from the polygon are evaluated.
Classification of fingerprints is done for each fingerprint as
per the classification scheme and the templates are formed and
stored [Table 2].
C. Results and Discussion
About 300 fingerprint pairs were taken and the features
were extracted and stored as template data. Another sample set
of 300 fingerprint pairs of same persons were taken as test set.
Match score has been calculated for each fingerprint in the test
set with the template. The box plot, a statistical plot of the score distribution, is shown in Fig. 10. A genuine fingerprint is one which is supposed to match with the same fingerprint template in the template data. The genuine distribution shows a median of about 0.92, with the whiskers ranging between 0.72 and 1. An imposter is one whose fingerprint does not match with the template data. The imposter distribution shows a median of 0.56, with the whiskers ranging between 0 and 0.71. Hence fixing a matching score threshold between 0.71 and 0.72 can identify all the fingerprints with 100% accuracy. The
Receiver Operating Characteristic (ROC) graph is shown in Fig. 11. The False Acceptance Rate (FAR), a measure of how many imposter users are falsely accepted into the system as genuine users, is plotted by varying the threshold from 0 to 1. The False Rejection Rate (FRR), a measure of how many genuine users are falsely rejected by the system as imposters, is also calculated for various thresholds and plotted. The Equal Error Rate (EER), defined as the condition at which FAR = FRR, is approximately equal to zero at a threshold of Th = 0.715, as in Fig. 11. For this threshold, fingerprints are identified with 100% accuracy. Fig. 12 shows the ROC corresponding to single-finger-based identification, which shows that only a high threshold can identify the fingerprint with 100% accuracy.
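The FAR/FRR curves of Figs. 11 and 12 can be reproduced from the genuine and imposter score distributions; a small Python sketch (our own helper, not the paper's code):

import numpy as np

def far_frr(genuine, imposter, n_steps=1001):
    genuine, imposter = np.asarray(genuine), np.asarray(imposter)
    thresholds = np.linspace(0.0, 1.0, n_steps)
    far = np.array([(imposter >= t).mean() for t in thresholds])  # imposters accepted
    frr = np.array([(genuine < t).mean() for t in thresholds])    # genuines rejected
    eer_at = thresholds[np.argmin(np.abs(far - frr))]             # FAR = FRR point
    return thresholds, far, frr, eer_at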
TABLE 2. FINGERPRINT TEMPLATE
d_cc   d_cb   d_cdr   d_dbr   d_bb   d_dbl   d_cdl   θ_C   θ_DR   θ_DL   θ_CC   A   T   r_cd   r_cb   r_db
0.00 234.47 0.00 0.00 110.74 181.77 122.70 60.13 0.00 119.88 0.00 23049.00 2 10.67 18.67 15.33
0.00 219.56 0.00 0.00 86.41 151.14 110.49 44.77 0.00 135.18 0.00 16055.00 2 11.00 21.00 13.33
0.00 258.32 0.00 0.00 141.61 144.20 179.13 50.25 0.00 129.55 0.00 27670.75 2 17.50 16.75 10.50
0.00 256.01 181.95 120.13 121.06 0.00 0.00 41.82 138.10 0.00 0.00 22805.50 1 13.67 15.17 8.33
0.00 252.18 0.00 0.00 109.93 142.25 155.49 44.95 0.00 135.15 0.00 21605.67 2 13.33 21.50 14.50
0.00 237.91 0.00 0.00 12.02 187.32 52.30 13.33 0.00 166.66 0.00 2567.25 2 3.00 15.50 14.75
0.00 253.24 0.00 0.00 44.01 218.95 56.31 52.07 0.00 127.82 0.00 10451.00 2 7.00 20.00 16.50
0.00 317.00 0.00 0.00 152.00 124.00 245.67 17.12 0.00 162.88 0.00 33516.00 2 17.00 18.00 6.00
0.00 250.12 0.00 0.00 101.24 174.45 126.45 47.78 0.00 132.20 0.00 21503.25 2 11.00 20.75 14.00
0.00 231.18 0.00 0.00 43.44 172.38 73.14 36.61 0.00 143.34 0.00 8782.83 2 5.00 21.50 16.00
0.00 281.00 0.00 0.00 125.00 105.00 193.72 21.79 0.00 158.21 0.00 25875.00 2 16.00 21.50 9.00
0.00 271.20 0.00 0.00 124.81 103.71 197.70 29.88 0.00 150.28 0.00 24235.75 2 16.00 17.50 9.25
0.00 297.74 182.66 129.60 71.56 0.00 0.00 23.05 156.96 0.00 0.00 15289.25 1 15.00 20.50 8.00
0.00 230.95 158.13 121.03 113.58 0.00 0.00 45.92 134.14 0.00 0.00 19952.33 1 13.67 17.00 9.33
0.00 249.02 102.70 171.14 66.85 0.00 0.00 37.23 142.76 0.00 0.00 14043.33 1 8.33 18.17 12.17
0.00 233.79 91.84 184.95 77.24 0.00 0.00 55.65 124.32 0.00 0.00 16181.00 1 10.33 23.33 14.00
163.77 129.25 221.05 114.21 101.18 0.00 0.00 168.02 142.36 0.00 154.31 20160.00 5 12.00 14.00 16.00
VII. CONCLUSION
Multifinger fusion based fingerprint identification is presented here. Singularities of consecutive individual fingerprints were found using coherence computed via directional field strength. A fingerprint polygon was constructed for each fingerprint with the detected baseline, and the feature vectors were concatenated to form a 32-element vector. A distance-based matching score was formulated and tested, with 100% detection accuracy for a database of 300 candidates.
Fig.10. Score Distribution Multifinger
Fig.11. FAR-FRR Graph- Multifinger
Fig.12 FAR-FRR Graph- Single finger based
REFERENCES
[1] David Maltoni, Dario Maio, Anil K Jain, Salil Prabhakar, Handbook of
fingerprint recognition, 2
nd
Ed. Springer, 2005.
[2] A.K. Jain, L. Hong and R. Bolle, On-line fingerprint verification, IEEE Trans. Pattern Anal. Mach. Intell. 19(4) (1997), pp. 302-314.
[3] A.K. Jain, S. Prabhakar and S. Pankanti, Filterbank-based fingerprint
matching, IEEE Trans. Image Process. 9 5 (2000), pp. 846-859.
[4] Nalini Ratha, Ruud Bolle Editors, Automatic Fingerprint Recognition
Systems, Springer, 2003.
[5] R. Cappelli and D. Maio, The state of the Art in Fingerprint
Classification, Automatic Fingerprint Recognition Systems, N. Ratha
and R. Bolle, eds., Springer, 2004.
[6] R. Cappelli, D. Maio and D. Maltoni, Indexing Fingerprint Databases
for Efficient 1: N Matching, Proc. Sixth International Conf. Control,
Automation, Robotics and Vision, Dec. 2000.
[7] Raffaele Cappelli and Davide Maltoni, On the Spatial Distribution of
Fingerprint Singularities, IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol 31, No. 4, April 2009 pp- 742-748.
[8] A Ross, A Jain, Information fusion in biometrics, Pattern Recogn.
Lett. 24 (2003)2115-2125
[9] Jain A K, Nandakumar K, and Ross A, Score Normalization in
Multimodal Biometric Systems, Pattern Recognition, 38(12), 2270-
2285
[10] Arun A Ross, Karthik Nandakumar, Anil K Jain, Handbook of
Multibiometrics, Springer, 2006.
[11] Biometrics in the Age of Heightened Security and Privacy. Available at
http : // www.itl.nist.gov/div895/isis/bc/bc2001/
EDIT FINAL DR.ATTICK.pdf
[12] Most M, Battle of Biometrics, Digital ID World Magazine, 2003,
Pages 16-18.
[13] D L Hall, J Llinas, Multisensor data fusion, D L Hall, J Llinas (Eds.),
Handbook of Multisensor Data Fusion, CRC Press, 2001, pp 1-10.
[14] C. Sanderson, K.K. Paliwal, Identity Verification Using Speech and
Face Information, Digital Signal Processing, Vol. 14, No. 5, 2004, pp.
449-480.
[15] R Frischholz and U Dieckmann, Biod: A multimodal biometric
identification system, IEEE Computer, 33(2), pp 64-68, 2000.
[16] L Hong and A K Jain, Integrating faces and fingerprints for personal
identification, IEEE Trans. on Pattern Analysis and Machine
Intelligence 20(12), p 1295-1307, 1998.
[17] E Bigun, J Bigun, B Due and S Fischer, Expert conciliation for multi
modal person authentication by Bayesian statistics, Proc. of Intl Conf.
on Audio and Video based Person Authentication, pp 311-318, 1997.
[18] Y.Wang, T Tan and A. K. Jain, Combining face and iris biometrics for
identitiy verification, Proc. of Intl Conf. on Audio and Video based
Person Authentication, pp 805-813, 2003.
[19] A Kumar and D Zhang, Integrating palmprint with face for user
authentication, Workshop on Multi Modal User Authentication
(MMUA), pp 107-112, 2003.
[20] Arun Ross and Rohin Govindarajan, Feature Level Fusion Using Hand
and Face Biometrics, Proc. of SPIE Conference on Biometric
Technology for Human Identification II, Vol 5779, pp 196-204, 2005.
[21] N Praveen and Tessamma Thomas, Singularity Based Fingerprint
identification, International Journal of Research and Reviews in
Computer Science (IJRRCS) Vol. 2, No. 5, pp 1055-1059, October
2011.
[22] Asker M Bazen and Sabih H. Gerez, Systematic Methods for the
Computation of the Directional Fields and Singular Points of
Fingerprints, IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 24, No. 7, July, 2002 pp 905-919.
[23] G. A. Drets and G. G. Liljenström, Fingerprint Sub-classification: A
Neural Network Fingerprint Classification, Intelligent Biometric
Techniques in Fingerprint and Face Recognition, L. C. Jain, U. Halici, I. Hayashi, S. B. Lee, and S. Tsutsui, eds., pp. 109-134, Boca Raton, Fla.:
CRC Press, 1999.
[24] C. L. Wilson, G. T. Candela, and C. I. Watson, Neural Network
Fingerprint Classification, J. Artificial Neural Networks, Vol. 1, No. 2,
pp. 203-228, 1994.
[25] L. O'Gorman and J. V. Nickerson, An Approach to Fingerprint Filter Design, Pattern Recognition, Vol. 22, No. 1, pp. 29-38, 1989.
[26] M. Kass and A. Witkin, Analyzing Oriented Patterns, Computer
Vision, Graphics, and Image Processing, Vol. 37, No. 3, pp. 362-385,
March 1987.
[27] A. K Jain, L. Hong, S. Pankanti, and R. Bolle, An Identity
Authentication System Using Fingerprints, Proc. IEEE, Vol. 85, no. 9,
pp. 1365-1388, Sept. 1997.
[28] A. R. Rao and R. C. Jain, Computerized Flow Field Analysis: Oriented Texture Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 7, pp. 693-709, July 1992.
[29] N. Ratha, S. Chen, and A. Jain, Adaptive Flow Orientation Based
Feature Extraction in Fingerprint Images, Pattern Recognition, vol. 28,
pp. 1657-1672, Nov. 1995.
[30] P. Perona, Orientation Diffusions, IEEE Trans. Image Processing,
vol.7, no.3, pp.457-467, Mar.1998.
[31] Masahiro Kawagoe and Akio Tojo, Fingerprint pattern classification,
Pattern Recognition, Vol. 17, No. 3, pp. 295-303, 1984.
[32] CHENG Xin-Ming, XU Dong-Cheng, XU Cheng, A New Algorithm
for Detecting Singular Points in Fingerprint Images, Third International
Conference on Genetic and Evolutionary Computing, 2009.
[33] Atiquzzaman, M., 1992. Multiresolution Hough transform: an efficient method of detecting patterns in images, IEEE Trans. Pattern
Anal.Machine Intell. 14 (11), 1090-1095.
[34] Duda, R.O., Hart, P.E., 1972. Use of Hough transformation to detect
lines and curves in pictures, Commun. ACM 15 (1), 11-15.
[35] Hough P.V.C., 1962. Method and means for recognizing complex
patterns, U.S. Patent No. 3069654.
[36] D. S. Guru, B. H. Shekar, P. Nagabhushan, A simple and robust line detection algorithm based on small eigenvalue analysis, Pattern Recognition Letters, Vol. 25, pp. 1-13, 2004.
AUTHORS PROFILE
Praveen N received his MSc (Electronics) from Department of Electronics,
Cochin University of Science and Technology, Kochi, 682022. He is working
as Associate Professor of Electronics in the Department of Electronics, N S S
College, Rajakumari, Idukki Dt., Kerala. He has 15 years of teaching
experience and five years of research experience. At present he is doing PhD
in Department of Electronics, Cochin University of Science and Technology.
His research areas of interest include digital signal / image processing,
biometrics, computer vision etc.
Dr.Tessamma Thomas received her M.Tech and Ph.D from Cochin
University of Science and Technology, Cochin-22, India. At present she is
working as Professor in the Department of Electronics, Cochin University of
Science and Technology. She has to her credit more than 90 research papers,
in various research fields, published in International and National journals and
conferences. Her areas of interest include digital signal / image processing, bio
medical image processing, super resolution, content based image retrieval,
genomic signal processing etc.
Gender Effect Canonicalization for Bangla ASR
B.K.M. Mizanur
Rahman
Lecturer
United International
University,
Dhaka, Bangladesh
Bulbul Ahamed
Senior Lecturer
Northern University
Bangladesh, Dhaka,
Bangladesh
Md. Asfak-Ur-
Rahman
Technical Project
Manager
Grype Solutions
Bangladesh
Khaled Mahmud
Lecturer
Institute of Business
Administration
University of Dhaka
Mohammad Nurul
Huda
Associate Professor
United International
University,
Dhaka, Bangladesh
Abstract: This paper presents a Bangla (widely used as Bengali)
automatic speech recognition system (ASR) by suppressing
gender effects. Gender characteristic plays an important role on
the performance of ASR. If there is a suppression process that
represses the decrease of differences in acoustic-likelihood among
categories resulted from gender factors, a robust ASR system can
be realized. In the proposed method, we have designed a new
ASR incorporating the Local Features (LFs) instead of standard
mel frequency cepstral coefficients (MFCCs) as an acoustic
feature for Bangla by suppressing the gender effects, which
embeds three HMM-based classifiers for corresponding male,
female and gender-independent (GI) characteristics. In the
experiments on Bangla speech database prepared by us, the
proposed system has achieved a significant improvement of word
correct rates (WCRs), word accuracies (WAs) and sentence
correct rates (SCRs) in comparison with the method that
incorporates Standard MFCCs.
Keywords- Acoustic Model; Automatic Speech Recognition;
Gender Effects Suppression; Hidden Markov Model.
I. INTRODUCTION
Automatic speech recognition (ASR) systems have been investigated with the aim of achieving adequate performance at any time and everywhere; however, they have not been able to provide the highest level of accuracy to date. One of the
reasons is that the acoustic models (AMs) of a hidden Markov
model (HMM)-based classifier include many hidden factors
such as speaker-specific characteristics that include gender
types and speaking styles. It is difficult to recognize speech
affected by these factors, especially when an ASR system
contains only a single acoustic model. One solution is to
employ multiple acoustic models, one model for each type of
gender. By handling these gender effects appropriately the
robustness of each acoustic model in an ASR can be extended
to some limit.
A method of decoding in parallel with multiple HMMs
corresponding to hidden factors has recently been proposed in
[1], [2] for resolving these difficulties. Multi-path acoustic
modeling, that represents hidden factors with several paths in
the same AM instead of applying multiple HMMs, was also
presented [3]. Unfortunately, only a very few works have been
done for ASR in Bangla (can also be termed as Bengali),
which is one of the largely spoken languages in the world.
More than 220 million people speak in Bangla as their native
language. It is ranked sixth based on the number of speakers
[4]. A major difficulty to research in Bangla ASR is the lack
of proper speech corpus. Some efforts are made to develop
Bangla speech corpus to build a Bangla text to speech system
[5].
Although some Bangla speech databases were developed for the eastern area of India (West Bengal, with Kolkata as its capital), most of the native speakers of Bangla (more than two thirds) reside in Bangladesh, where it is the official language. Besides, while the written characters of Standard Bangla are the same in both countries, there are some sounds that are produced variably in different pronunciations of Standard Bangla, in addition to the myriad of phonological variations in non-standard dialects [6]. Therefore, there is a need to do research on ASR for the main stream of Bangla, which is spoken in Bangladesh.
Some developments on Bangla speech processing or
Bangla ASR can be found in [7]-[14]. For example, Bangla
vowel characterization is done in [7]; isolated and continuous
Bangla speech recognition on a small dataset using HMMs is
described in [8].
Again, Bangla digit recognition was found in [15]. Since
no work in Bangla was found for suppressing the gender
factor, previously, we proposed a method [16] for that purpose
by embedding multiple HMM-based classifiers. But this
method of gender effect suppression did not incorporate any
gender-independent (GI) classifier for resolving those male
and female speakers whose voices have effects of the opposite
gender to some extent. To resolve this problem, we proposed
another method for suppressing the gender factor more
accurately by incorporating the GI classifier [17]; but this method could not provide sufficient performance
because of embedding standard mel frequency cepstral
coefficients (MFCCs) as an input acoustic feature that did not
include frequency domain information in its extraction
process. Consequently, an exploitation of new feature is
needed to obtain time and frequency domain information in its
feature vector.
In this paper, we have designed a new ASR incorporating
the Local Features (LFs) instead of standard mel frequency
cepstral coefficients (MFCCs) as an acoustic feature for
Bangla by suppressing the gender effects, which embeds three
HMM-based classifiers for corresponding male, female and
gender-independent (GI) characteristics. In the experiments
on Bangla speech database prepared by us, the proposed
system has achieved a significant improvement of word
correct rates (WCRs), word accuracies (WAs) and sentence
correct rates (SCRs) in
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 3, No. 11, 2012
90 | P a g e
www.ijacsa.thesai.org
comparison with the method that incorporates Standard
MFCCs.
This paper is organized as follows. Section II discusses Bangla phoneme schemes, the Bangla speech corpus and the triphone model. Section III outlines the MFCC and LF feature extraction procedures, and Section IV explains the gender effect suppression methods [16] and [17], including the proposed suppression technique incorporating LFs. Section V describes the experimental setup and Section VI shows experimental results and provides a discussion. Finally, Sections VII and VIII conclude the paper with some future remarks and references, respectively.
II. BANGLA PHONEME SCHEME, SPEECH CORPUS AND
TRIPHONE MODEL
Bangla phonetic scheme and triphone models design were
presented in our papers [16] and [17]. These papers showed
how the left and right contexts are used to design the triphone
models.
At present, a real problem in doing experiments on Bangla phoneme ASR is the lack of a proper Bangla speech corpus. In fact, such a corpus is not available, or at least not referenced, in any of the existing literature. Therefore, we developed a medium-size Bangla speech corpus, which is described below.
One hundred sentences from the Bengali newspaper Prothom Alo [18] were uttered by 30 male speakers of different regions of Bangladesh. These sentences (30 × 100 = 3000) are used as the male training corpus (D1). Similarly, the same 3000 sentences uttered by 30 female speakers are used as the female training corpus (D2).
A different set of 100 sentences from the same newspaper, uttered by 10 different male speakers and by 10 different female speakers, is used as the male test corpus (D3) and the female test corpus (D4), respectively. All of the speakers
are Bangladeshi nationals and native speakers of Bangla. The
age of the speakers ranges from 20 to 40 years. We have
chosen the speakers from a wide area of Bangladesh: Dhaka
(central region), Comilla Noakhali (East region), Rajshahi
(West region), Dinajpur Rangpur (North-West region),
Khulna (South-West region), Mymensingh and Sylhet (North-
East region). Though all of them speak in standard Bangla,
they are not free from their regional accent.
Recording was done in a quiet room located at United
International University (UIU), Dhaka, Bangladesh. A desktop
was used to record the voices using a head mounted close-
talking microphone. We recorded the voices in a place where a ceiling fan and an air conditioner were switched on and some low
ceiling fan and air conditioner were switched on and some low
level street or corridor noise could be heard.
Jet Audio 7.1.1.3101 software was used to record the
voices. The speech was sampled at 16 kHz and quantized to 16
bit stereo coding without any compression, and no filter was used on the recorded voice.
III. ACOUSTIC FEATURE EXTRACTOR
3.1 MFCC Feature Extractor
The conventional approach of ASR systems uses MFCCs of 39 dimensions (12 MFCC, 12 ΔMFCC, 12 ΔΔMFCC, P, ΔP and ΔΔP, where P stands for the raw energy of the input speech signal), and the procedure of MFCC feature extraction is shown in Fig. 1. Here, a Hamming window of 25 ms is used for extracting the features. The value of the pre-emphasis factor is 0.97.
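As an illustration only (not the exact extractor used here), such a 39-dimensional MFCC feature with the stated 25 ms Hamming window, 10 ms shift and 0.97 pre-emphasis could be assembled in Python with the librosa library:

import numpy as np
import librosa

def mfcc39(path):
    y, sr = librosa.load(path, sr=16000)
    y = np.append(y[0], y[1:] - 0.97 * y[:-1])        # pre-emphasis factor 0.97
    n_fft, hop = int(0.025 * sr), int(0.010 * sr)     # 25 ms window, 10 ms shift
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=n_fft,
                                hop_length=hop, window='hamming', center=False)
    frames = librosa.util.frame(y, frame_length=n_fft, hop_length=hop)
    logE = np.log(np.sum(frames ** 2, axis=0) + 1e-10)  # P: frame energy
    n = min(mfcc.shape[1], logE.shape[0])
    static = np.vstack([mfcc[1:, :n], logE[:n]])      # 12 MFCC + P
    d1 = librosa.feature.delta(static)                # delta
    d2 = librosa.feature.delta(static, order=2)       # delta-delta
    return np.vstack([static, d1, d2])                # (39, n_frames)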
Fig. 1. MFCC feature extraction.
Fig. 2. Example of local features.
Fig. 3. Local feature extraction.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 3, No. 11, 2012
91 | P a g e
www.ijacsa.thesai.org
3.2 Local Feature Extractor
At the acoustic feature extraction stage, the input speech is
first converted into LFs that represent a variation in spectrum
along the time and frequency axes. Two LFs are then extracted
by applying three-point linear regression (LR) along the time
(t) and frequency (f) axes on a time spectrum pattern (TS),
respectively. Fig. 2 exhibits an example of LFs for an input
utterance, /gaikoku/. After compressing these two LFs with 24
dimensions into LFs with 12 dimensions using discrete cosine
transform (DCT), a 25-dimensional (12 Δt, 12 Δf, and ΔP,
where P stands for the log power of a raw speech signal)
feature vector called LF is extracted. Fig.3 shows the local
feature extraction procedure.
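A sketch of this LF pipeline under our own assumptions (a 24-channel log mel spectrum as the time spectrum pattern TS; the three-point linear regression realized as the slope over the {-1, 0, +1} neighbours; scipy's DCT for compression):

import numpy as np
import librosa
from scipy.fftpack import dct

def local_features(y, sr=16000):
    n_fft, hop = int(0.025 * sr), int(0.010 * sr)
    ts = np.log(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=24,
                                               n_fft=n_fft, hop_length=hop) + 1e-10)
    dt = 0.5 * (np.roll(ts, -1, axis=1) - np.roll(ts, 1, axis=1))  # LR along time t
    df = 0.5 * (np.roll(ts, -1, axis=0) - np.roll(ts, 1, axis=0))  # LR along frequency f
    dt12 = dct(dt, axis=0, norm='ortho')[:12]          # 24 -> 12 dims via DCT
    df12 = dct(df, axis=0, norm='ortho')[:12]
    logP = np.log(np.sum(librosa.util.frame(y, frame_length=n_fft,
                                            hop_length=hop) ** 2, axis=0) + 1e-10)
    dP = 0.5 * (np.roll(logP, -1) - np.roll(logP, 1))  # delta of log power P
    n = min(dt12.shape[1], dP.shape[0])
    return np.vstack([dt12[:, :n], df12[:, :n], dP[:n]])  # (25, n) LF vectors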
IV. GENDER EFFECT SUPPRESSION METHODS
4.1 MFCC-Based Suppression method without GI classifier
[16]
Fig. 4 shows the system diagram of the existing method
[16], where MFCC features are extracted from the speech
signal using the MFCC extractor described in Section 3.1 and
then male and female HMM classifiers are trained using the
D1 and D2 data sets, respectively. Here, triphone acoustic
HMMs are designed and trained using D1 and D2 data sets.
The output hypothesis is selected based on the maximum output probability after comparing the male and female hypotheses, and the best-matching hypothesis is passed to the output.
Fig. 4. MFCC-based gender effect suppression method without GI classifier
[16].
4.2 MFCC-Based Suppression method incorporating GI
classifier [17]
The diagram of the method [17] is depicted in Fig. 5. Here,
the extracted MFCC features from the input speech signal are
inserted into the male, female and GI HMM-based classifiers.
The male, female and GI HMM-based classifiers are trained
using the D1, D2 and (D1+D2) data sets, respectively. Here, the output hypothesis is selected based on the maximum output probability after comparing the male, female and gender-independent hypotheses, and the best-matching hypothesis is passed to the output.
Fig. 5. MFCC-based gender effect suppression method incorporating GI
classifier [17].
4.3 LF-based Proposed Suppression Method
incorporating GI classifier
The diagram of the LF-based method is depicted in Fig. 6.
Here, the extracted LFs [19] from the input speech signal are
inserted into the male, female and GI HMM-based classifiers.
The male, female and GI HMM-based classifiers are trained
using the D1, D2 and (D1+D2) data sets, respectively. Here, the output hypothesis is selected based on the maximum output probability after comparing the male, female and gender-independent hypotheses, and the best-matching hypothesis is passed to the output.
Fig. 6. Proposed LF-based suppression method incorporating GI classifier.
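In all three variants the final step is the same: pick the hypothesis with the maximum output probability. A schematic Python sketch with a hypothetical decode() interface (the classifiers themselves are HMM-based and trained as described above):

def select_hypothesis(features, classifiers):
    # classifiers: dict mapping 'male' / 'female' / 'gi' to objects with a
    # hypothetical decode(features) -> (hypothesis, log_probability) method
    best_hyp, best_lp = None, float('-inf')
    for name, clf in classifiers.items():
        hyp, lp = clf.decode(features)
        if lp > best_lp:                 # keep the best-matching hypothesis
            best_hyp, best_lp = hyp, lp
    return best_hyp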
V. EXPERIMENTAL SETUP
The frame length and frame rate are set to 25 ms and 10 ms
(frame shift between two consecutive frames), respectively, to
obtain acoustic features (MFCCs and LFs) from an input
speech. The MFCC and LF features comprise 39 (12 MFCC, 12 ΔMFCC, 12 ΔΔMFCC, P, ΔP and ΔΔP, where P stands for the raw energy of the input speech signal) and 25 (12 delta coefficients along
time axis, 12 delta coefficients along frequency axis, and delta
coefficient of log power of a raw speech signal) dimensions,
respectively.
For designing an accurate continuous word recognizer,
word correct rate (WCR), word accuracy (WA) and sentence
correct rate (SCR) for (D3+D4) data set are evaluated using an
HMM-based classifier. The D1 and D2 data sets are used to
design Bangla triphone HMMs with five states, three loops,
and left-to-right models. Input features for the classifier are 39
dimensional MFCCs and 25 dimensional LFs.
In the HMMs, the output probabilities are represented in
the form of Gaussian mixtures, and diagonal matrices are used.
The mixture components are set to 1, 2, 4 and 8.
For evaluating the performance of the methods, [16] and
[17], we have designed the following experiments:
(a) MFCC (Train: 3000 male, Test: 1000 male+1000
female).
(b) MFCC (Train: 3000 female, Test: 1000 male+1000
female).
(c) MFCC (Train: 3000 male + 3000 female, Test: 1000
male +1000 female).
(d) MFCC (Train: 3000 male, Train: 3000 female, Test:
1000 male + 1000 female) [16].
(e) MFCC (Train: 3000 male, Train: 3000 female, Train:
3000 male + 3000 female, Test: 1000 male + 1000
female).
New experiments given below are designed for mixture
component one using the LFs for evaluating the performance:
(a) LF (Train: 3000 male, Test: 1000 male+1000 female).
(b) LF(Train: 3000 female, Test: 1000 male+1000 female).
(c) LF(Train: 3000 male + 3000 female, Test: 1000 male
+1000 female).
(d) Proposed, LF(Train: 3000 male, Train: 3000 female,
Train: 3000 male + 3000 female, Test: 1000 male +
1000 female).
VI. EXPERIMENTAL RESULTS AND ANALYSIS
Fig. 7 shows the comparison of word correct rates among
all the MFCC-based investigated methods, (a), (b), (c), (d) and
(e). Among all the mixture components investigated, method (e) shows higher performance in comparison with the other methods evaluated. It is noted that method (e)
exhibits its best performance (92.17%) at mixture component
two.
Fig. 7. Word correct rate comparison among MFCC-based investigated
methods.
Word accuracies for different investigated methods based
on MFCCs are depicted in Fig. 8. From the figure, it is
observed that the highest level performance (91.64%) at
mixture component two is found by the method, (e) compared
to the other methods investigated. Here, the performance of
the methods, (a), (b), (c), (d) and (e) at mixture component two
are 77.58%, 81.47%, 87.39%, 90.78% and 91.64%,
respectively.
It is shown in Fig. 9 that sentence correct rates for
the MFCC-based investigated methods, (a), (b), (c), (d) and (e)
are 77.20%, 81.45%, 86.60%, 90.45% and 91.30%,
respectively, where the method, (e) provides its best
performance. The methods, (a), (b) and (c) give less
performance in comparison with the method (d) because (d)
incorporates both HMM-based classifiers for male and female.
Again, the method, (e) incorporates GI HMM-based classifier
over the method (d), which increases sentence correct rate
significantly. Since the maximum output probability is
generated by the MFCC based method, (e) after comparing the
probabilities among male, female and GI classifiers, the
suppression method, (e) shows its superiority.
Fig. 8. Word accuracy comparison among MFCC-based investigated methods.
Fig. 9. Sentence correct rate comparison among MFCC-based investigated
methods.
Fig. 10. Effect of Gender-Independent classifier in the MFCC-based method
(e) over the method (d).
The improvement from the GI classifier in the MFCC-based method (e) over method (d), which does not incorporate a GI classifier, is shown in Fig. 10. From the figure, it is observed that method (e) shows its highest improvement at mixture component one, where the improvements of sentence correct rate, word accuracy and word correct rate are 2.4%, 2.36% and 2.32%, respectively.
Word Correct Rates (WCRs), Word Accuracies (WAs) and
Sentence Correct Rates (SCRs) for LF and MFCC based
methods, (a), (b), (c) and (d)/(e) using (D3+D4) data set are
shown in Table I. Here, methods (d) and (e) represent LF-
based and MFCC-based suppression methods with gender
independent (GI) classifiers, respectively. From the
experiment #1, where the HMM-based classifier is trained
with D1 data set and evaluated with (D3+D4) data set, a
tremendous improvement of WCRs, WAs and SCRs are
exhibited by the LF-based ASR that incorporates gender effect
canonicalization module in the ASR process. Similarly, the
same pattern of performance is also achieved in the
experiment #2, where the HMM-based classifier is trained
with D2 data set and evaluated with (D3+D4) data set. This
trend explicates the LFs, which embeds time and frequency
domain information in its extraction procedure, as excellent
feature for the Bangla automatic speech recognition system.
Again, the experiment #3, which shows the GI ASR for
Bangla language based on LF and MFCC features, is trained
with (D1+D2) and evaluated with (D3+D4) data sets and
provides 3.83%, 2.60% and 3.70% improvements by the LF-
based method in comparison with the MFCC-based
counterpart. Finally, the methods in experiment #4 imply two ASR systems for GI Bangla ASR by integrating the gender factor canonicalization process in their architecture, and show the highest performance for WCRs, WAs and SCRs compared to the corresponding methods in experiments #1, #2 and #3. Since these methods of experiment #4 always maximize the output probabilities obtained from the three classifiers (male, female and GI), they show the maximum level of performance. Besides, the LF-based method improves the WCRs, WAs and SCRs by 3.94%, 3.43% and 3.90%, respectively, over the method that inputs MFCC features into the HMM-based classifier in the gender effect suppression process.
Table I. Word Correct Rates (WCRs), Word Accuracies
(WA) and Sentence Correct Rates (SCRs) for LF and MFCC
based methods, (a), (b), (c) and (d)/(e) using (D3+D4) data set
for mixture component one. Here, methods (d) and (e) are LF-
based and MFCC-based suppression methods with gender
independent (GI) classifiers, respectively.
On the other hand, Table II exhibits sentence recognition
performance for LF and MFCC based methods, (a), (b), (c)
and (d)/(e) using (D3+D4) data set, where the methods (d) and
(e) represent LF-based and MFCC-based suppression methods
with gender independent (GI) classifiers, respectively. Here,
experiments #1, #2, #3 and #4 use same corpora for training
and evaluation that we explained earlier. From all the
experiments, it is evident that the LF-based method reduces
the number of incorrectly recognized sentences with respect to
its counterpart. The experiment #4 shows the highest number
of correctly recognized sentences than the corresponding
methods in the other experiments, #1, #2 and #3 investigated.
For example, in the LF-based and MFCC-based methods of experiment #4, the numbers of correctly recognized sentences are 1879 and 1801, respectively, which are the highest figures among the corresponding methods of all the experiments. It is noted that two significant phenomena contributed most to obtaining the best experimental results by the LF-based method of experiment #4: i) use of both time and frequency domain information, and ii) selection of the maximum probability among the three output probabilities calculated by the male, female and GI HMM-based classifiers.
Table II. Sentence recognition performance for LF and MFCC based methods, (a), (b), (c) and (d)/(e) using the (D3+D4) data set for mixture component one. Here, methods (d) and (e) are LF-based and MFCC-based suppression methods with gender-independent (GI) classifiers, respectively.
Moreover, word recognition performance for LF and
MFCC based methods, (a), (b), (c) and (d)/(e) using (D3+D4)
data set are summarized in the Table III. In the table, methods
(d) and (e) represent LF-based and MFCC-based suppression
methods with gender independent (GI) classifiers,
respectively. The same speech corpora for training and evaluation are used for experiments #1, #2, #3 and #4, as described earlier. It is observed from all the experiments that the LF-based method increases the number of correctly recognized words in comparison with its counterpart. The highest numbers of correctly recognized words are shown by experiment #4 with respect to the corresponding methods in the other experiments, #1, #2 and #3, as is evident from the table. For example, the LF-based and MFCC-based methods of experiment #4 provide the highest numbers of correctly recognized words, 6247 and 5988 respectively, exceeding the figures obtained for the corresponding methods of all the other experiments. The reason for obtaining the best experimental results by the LF-based method of experiment #4 was illustrated earlier.
Table III. Word recognition performance for LF and
MFCC based methods, (a), (b), (c) and (d)/(e) using (D3+D4)
data set for mixture component one. Here, methods (d) and (e)
represent LF-based and MFCC-based suppression methods
with gender independent (GI) classifiers, respectively.
VII. CONCLUSION
This paper has proposed an automatic speech recognition
technique based on LFs for Bangla language by suppressing
the gender effect incorporating HMM-based classifiers for
male, female and Gender-Independent characteristics. The
following information concludes the paper.
i) The MFCC-based method incorporating the GI classifier provides higher performance than the method that does not incorporate a GI classifier, and it exhibits its superiority at all the mixture components investigated.
ii) The incorporation of GI HMM classifier improves
the word correct rates, word accuracies and
sentence correct rates significantly.
iii) The proposed LF-based method shows a significant
improvement of word correct rates, word
accuracies and sentence correct rates for mixture
component one.
In the future, the authors would like to evaluate performance
by incorporating neural network based systems.
REFERENCES
[1] S. Matsuda, T. Jitsuhiro, K. Markov and S. Nakamura, Speech
Recognition system Robust to Noise and Speaking Styles, Proc.
ICSLP04, Vol.IV, pp.2817-2820, Oct. 2004.
[2] T. Shinozaki and S. Furui, Spontaneous Speech Recognition Using a
Massively Parallel Decoder, Proc. ICSLP04, Vol.III, pp.2817-2820,
Oct. 2004.
[3] A. Lee, Y. Mera, K. Shikano and H. Saruwatari, Selective multi-path
acoustic model based on database likelihoods, Proc. ICSLP02, Vol.IV,
pp.2661-2664, Sep. 2002.
[4] http://en.wikipedia.org/wiki/List_of_languages_by_total_speakers, Last
accessed July 04, 2012.
[5] S. P. Kishore, A. W. Black, R. Kumar, and Rajeev Sangal, "Experiments
with unit selection speech databases for Indian languages," Carnegie
Mellon University.
[6] S. A. Hossain, M. L. Rahman, and F. Ahmed, Bangla vowel
characterization based on analysis by synthesis, Proc. WASET, vol. 20,
pp. 327-330, April 2007.
[7] A. K. M. M. Houque, "Bengali segmented speech recognition system,"
Undergraduate thesis, BRAC University, Bangladesh, May 2006.
[8] K. Roy, D. Das, and M. G. Ali, "Development of the speech recognition
system using artificial neural network," in Proc. 5
th
International
Conference on Computer and Information Technology (ICCIT02),
Dhaka, Bangladesh, 2003.
[9] M. R. Hassan, B. Nath, and M. A. Bhuiyan, "Bengali phoneme
recognition: a new approach," in Proc. 6
th
InternationalConference on
Computer and Information Technology (ICCIT03), Dhaka, Bangladesh,
2003.
[10] K. J. Rahman, M. A. Hossain, D. Das, T. Islam, and M. G. Ali,
"Continuous bangla speech recognition system," in Proc. 6
th
International Conference on Computer and Information Technology
(ICCIT03), Dhaka, Bangladesh, 2003.
[11] S. A. Hossain, M. L. Rahman, F. Ahmed, and M. Dewan, "Bangla
speech synthesis, analysis, and recognition: an overview," in Proc.
NCCPB, Dhaka, 2004.
[12] S. Young, et al, The HTK Book (for HTK Version. 3.3), Cambridge
University Engineering Department, 2005.
http://htk.eng.cam.ac.uk/prot-docs/htkbook.pdf.
[13] http://en.wikipedia.org/wiki/Bengali_script, Last accessed September 28,
2011.
[14] C. Masica, The Indo-Aryan Languages, Cambridge University Press,
1991.
[15] Ghulam Muhammad, Yousef A. Alotaibi, and Mohammad Nurul Huda,
Automatic Speech Recognition for Bangla Digits, ICCIT09, Dhaka,
Bangladesh, December 2009.
[16] Mohammed Rokibul Alam Kotwal, Foyzul Hasan and Mohammad Nurul
Huda, Gender effects suppression in Bangla ASR by designing multiple
HMM based classifiers, CICN 2011, Gwalior, India.
[17] Foyzul Hasan, Mohammed Rokibul Alam Kotwal and Mohammad Nurul
Huda, Bangla ASR design by suppressing gender factor with gender-
independent and gender-based HMM classifiers, WICT 2011,
December 2011, Mumbai, India.
[18] Prothom Alo. Online: www.prothom-alo.com.
[19] T. Nitta, "Feature extraction for speech recognition based on orthogonal
acoustic-feature planes and LDA," Proc. ICASSP99, pp.421-424, 1999.
AUTHORS PROFILE
B.K.M. Mizanur Rahman was born in Jhenaidah, Bangladesh in 1972.
He completed his B.Sc. in Electrical and Electronic Engineering Degree from
BUET, Dhaka, Bangladesh. He is a student of Masters in Computer Science
and Engineering at United International University, Dhaka, Bangladesh. He is
working as a Lecturer in the Department of Electrical and Electronic
Engineering of the same university. His research interests include Speech
Recognition, Digital Signal Processing and Renewable Energy.
Bulbul Ahamed was born in Munshiganj, Bangladesh in 1982. He
obtained his B. Sc. in Computer Science and Engineering and MBA
(major in MIS & Marketing) from Northern University Bangladesh. Now he
is pursuing his M.Sc. in Computer Science and Engineering at United
International University, Bangladesh. He is now working as Senior
Lecturer in Northern University Bangladesh. His research interests include
Speech Recognition, Artificial Intelligence, Neural Network and Business. He
has published his articles in different journals and conferences of USA,
Pakistan, United Arab Emirates and Bangladesh.
Md.Asfak-Ur-Rahman was born in Kishoreganj, Bangladesh in 1985.
He obtained his B. Sc. in Computer Science and Engineering BRAC
University, Bangladesh. Now he is pursuing his M.Sc. in Computer
Science and Engineering at United International University, Bangladesh. He is
now working as Technical Project Manager in Grype Solutions,
Bangladesh. His research interests include Speech Recognition, Artificial
Intelligence and web programming.
Khaled Mahmud was born in 1984 at Pabna, Bangladesh. He was
graduated from Bangladesh University of Engineering and Technology
(BUET) in Computer Science and Engineering. He had his MBA (Marketing)
from Institute of Business Administration, University of Dhaka. He was
awarded gold medals both in his secondary and higher secondary school level
for excellent academic performance. He is a Fulbright Scholar, now
pursuing his MBA at Bentley University, Massachusetts, USA. He
previously worked as Assistant Manager in Standard Chartered Bank. He has research interests in business, technology, e-learning, e-governance, human resource management and social issues. He has published his articles in
journals and conferences of USA, Canada, Australia, United Arab Emirates,
Malaysia, Thailand, South Korea, India and Bangladesh.
Mohammad Nurul Huda was born in Lakshmipur, Bangladesh in 1973.
He received his B. Sc. and M. Sc. in Computer Science and Engineering
degrees from Bangladesh University of Engineering & Technology (BUET),
Dhaka in 1997 and 2004, respectively. He also completed his Ph. D from the
Department of Electronics and Information Engineering, Toyohashi University
of Technology, Aichi, Japan. Now, he is working as an Associate Professor in
United International University, Dhaka, Bangladesh. His research fields
include Phonetics, Automatic Speech Recognition, Neural Networks, Artificial
Intelligence and Algorithms. He is a member of International Speech
Communication Association (ISCA).
Three Layer Hierarchical Model for Chord
Waqas A. Imtiaz, Shimul Shil, A.K.M Mahfuzur Rahman
Abstract: The increasing popularity of decentralized Peer-to-Peer (P2P) architecture emphasizes the need for an overlay structure that can provide an efficient content discovery mechanism, accommodate a high churn rate and adapt to failures in the presence of heterogeneity among the peers. Traditional p2p systems incorporate distributed client-server communication, which efficiently finds the peer that stores a desired data item, with minimum delay and reduced overhead. However, traditional models are not able to solve the problems relating to scalability and high churn rates. Hierarchical models were introduced to provide better fault isolation, effective bandwidth utilization, a superior adaptation to the underlying physical network and a reduction of the lookup path length as additional advantages. They are more efficient and easier to manage than traditional p2p networks. This paper discusses a further step in p2p hierarchy via a 3-layer hierarchical model with a distributed database architecture in each layer, connected through its root. The peers are divided into three categories according to their physical stability and strength: Ultra Super-peer, Super-peer and Ordinary Peer, assigned to the first, second and third level of the hierarchy respectively. Peers in a group in the lower layer have their own local database, which is held by the associated super-peer in the middle layer, and the database is accessed among the peers through user queries. In our 3-layer hierarchical model for DHT algorithms, we use an advanced Chord algorithm with an optimized finger table that can remove redundant entries from the finger table in the upper layer, which helps the system reduce lookup latency. Our research finally showed that the model provides faster search, since network lookup latency is decreased by reducing the number of hops. Peers in such a network can then contribute improved functionality and perform well in P2P networks.
Keywords- Hierarchy; DHT; CHORD; P2P.
I. INTRODUCTION
Peer-to-Peer (P2P) network is a logical overlay network
which is built on top of one or more existing physical
networks. P2P networks, over the last two decades has been
recognized as a more efficient and flexible approach for
sharing resources, compared to the traditional Client-Server
model. Internet-scale decentralized architecture based on p2p created an environment for millions of users, allowing them to simultaneously connect and share content with ease and reliability [1][8]. Efficient data location, lookups, redundant
storage and distributed content placement of p2p overlay
networks have raised a big deal of attention, not only for
researchers/academicians but also in practical usage of the
technology [7]. The distributed approach of p2p system is less
vulnerable to attacks, robust and highly available as compared
to its client-server counterpart.
P2P algorithms are a class of decentralized distributed
systems collectively, called as Distributed Hash Tables
(DHTs). DHT is a distributed data structure whose main
function is to hold the key-value pair in a completely
distributed manner and any participating peer in the overlay
network can retrieve the value associated with the given key
[8]. DHT uses consistent hashing to map the responsible node
for a key-value pair. Along with efficient mapping of a key-
value pair to nodes, DHT also has the ability to isolate the
network changes to a small part of it thus limiting the overhead
and bandwidth consumption during networks and resources
updates [8]. DHT is an abstract idea that helps us to achieve
complete independence from a central lookup entity and
tolerance to changes in the network [9]. Different algorithms
have been designed to implement and refine DHT idea. CAN,
PASTRY, TAPSTERY, CHORD are the most popular
implementations of DHT. We choose chord because of well-
organized data structures and efficient routing schemes.
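To make the consistent-hashing idea concrete, a minimal Python sketch (our own, not the paper's implementation) that maps node names and keys onto an m-bit identifier circle and locates the responsible successor:

import hashlib
from bisect import bisect_left

M = 16                                   # identifier bits; circle is 0 .. 2^M - 1

def chord_id(name):
    # Consistent hash of a node name or data key onto the identifier circle
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, 'big') % (1 << M)

def successor(sorted_node_ids, key_id):
    # First node clockwise from key_id, wrapping around the circle
    i = bisect_left(sorted_node_ids, key_id)
    return sorted_node_ids[i % len(sorted_node_ids)]

nodes = sorted(chord_id('peer-%d' % n) for n in range(8))
print(successor(nodes, chord_id('some-data-item')))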
Chord is an efficient distributed lookup service based on
consistent hashing, which provides support for only one operation: given a key, it efficiently maps the key onto a node [7]. At its heart, chord provides fast and efficient computation of the hash function by mapping keys onto nodes [7]. Chord basically creates a one-dimensional identifier circle which ranges from 0 to (N−1) [1], where N is the number of peers in the overlay network. For
our proposed three layer hierarchical model, we assume single
connection structures, and take the following parameters for
comparison [7][11][14][15]:
Total number of peers in the overlay = N
Number of ultra-superpeers = U
Number of superpeers = S
Number of Ordinary peers =
Number of peers in a lower layer group =
Probability of finding data item at super node = Q
Number of entries in Finger table =
Time complexity is the least amount of time required to execute a process or program. When the number of peers in the overlay network is N, the typical time complexity of searching is O(N) for an unstructured p2p network and O(log N) for a structured p2p network [1]. We have used unstructured topology in the lower two layers and structured topology in the upper layer. In the lower and middle layers we apply the single-connection intra-group structure, therefore the typical time complexity of searching is O(1), as a single peer is traversed in order to complete the query process. In the upper layer, the typical time complexity of searching is O(log U), because we apply the structure-based chord protocol. Finally, the total time complexity of the model is O(log U).
In our proposed hierarchical model, three cases occur during the lookup process: the average number of hops required is 1 if the query process is done between ordinary peers and superpeers. The average number of hops required is (1+1) = 2 if the query process is done among ordinary peers, super peers and the associated ultra-superpeer (without using the chord ring), because of the single connection structure. The average number of hops required is O(log U) if the query process is done in the upper layer chord ring. Thus the total number of hops required to perform a query process is (log U + 2).
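Under these assumptions the expected lookup costs can be compared directly; a small Python sketch (our own, using the usual (1/2)·log2 N average for flat Chord and log2 U + 2 for the three-layer model):

import math

def hops_flat_chord(n_peers):
    # Average lookup path in traditional Chord ~ (1/2) * log2(N)
    return 0.5 * math.log2(n_peers)

def hops_three_layer(n_ultra):
    # log2(U) hops on the upper-layer ring plus the two single-connection hops
    return math.log2(n_ultra) + 2

for N, U in [(1024, 16), (8192, 32), (65536, 64)]:
    print(N, round(hops_flat_chord(N), 1), round(hops_three_layer(U), 1))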
The lookup latency considered in our model is half of the time required for a peer in traditional chord, because in the hierarchical model the ultra-superpeers and superpeers perform the lookup process and have more physical capability than the ordinary peers. That is why we assume the average link latency of a peer in our hierarchical system is 50% less than that of a traditional chord peer, which is unstable and has minimal computation capability.
To analyze the performance of both traditional chord and our proposed three layer hierarchical model, we randomly increase the number of nodes in the overlay network and observe the number of hops and the time taken to perform the lookup process. The numbers of superpeers and ultra-superpeers are also randomly increased while increasing the number of nodes.
Figure 5 represents the number of hops required to perform the lookup process against an increasing number of nodes. The blue line shows the number of hops required by traditional chord, whereas the green line represents the number of hops used by the three layered hierarchical chord to perform the lookup process. The figure shows that as we increase the number of nodes, the number of hops required to perform the query process also increases. However, the rate of increase is abrupt in traditional chord as compared to our proposed model. This is because of
the fact that traditional chord uses
$$D(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \qquad (1)$$
Where (x1, y1) and (x2,y2) are the coordinates of any two
FCPs P1(x1,y1) and P2(x2,y2) respectively. Also two angles
in the mouth area are computed to represent, together with the geometric lengths, the geometric features; typical results are shown in Figure 4 (a, b).
Figure 4. Geometric features: (a) geometric lengths, (b) mouth angles
2) Appearance Features Extraction: The second approach used to extract features is the appearance approach. Various feature extraction methods can be used to extract facial expression features, such as active appearance models (AAM), eigenfaces, and the Gabor wavelets proposed in this paper. The Gabor wavelets reveal much about facial expressions, as both transient and intransient facial features often give rise to a change in contrast.
a) Pre-processing and Gabor filters: After the face detection process, the 300×300-pixel face image is cropped to 200×300 pixels and then resized to 200×200 pixels. After this pre-processing of the face image, a Gabor filter is used to extract features. The Gabor wavelet represents the properties of spatial localization, orientation selectivity, and spatial frequency selectivity. The Gabor wavelet can be defined as

$$\psi_{u,r}(z) = \frac{\|p_{u,r}\|^2}{\sigma^2}\, e^{-\|p_{u,r}\|^2 \|z\|^2 / 2\sigma^2} \left[ e^{i\, p_{u,r} \cdot z} - e^{-\sigma^2/2} \right] \qquad (2)$$
where $z = (x, y)$, and $u$ and $r$ define the orientations and scales of the Gabor wavelet, respectively [19]. $p_{u,r}$ is defined as

$$p_{u,r} = p_r\, e^{i\phi_u} \qquad (3)$$

where $p_r = p_{max}/f^r$ and $\phi_u = \pi u/6$. $p_{max}$ is the maximum frequency, and $f$ is the spacing factor between kernels in the frequency domain. In this experiment the Gabor wavelet with three scales (r = 0, 1, 2) and six orientations (u = 0, 1, 2, 3, 4, 5), and parameters σ = 2π, $p_{max}$ = π/2, and f = √2, are applied.
This produces 18 different Gabor kernels (3 scales × 6 orientations). The Gabor representation of an image, called a Gabor image, is produced from the convolution of the image with the Gabor kernels as defined by Equation 2. For each image, this produces 18 Gabor images, and each one consists of two parts: the real part and the imaginary part. Thereafter, these two parts are transformed into two kinds of Gabor feature: the magnitude and the phase. In this study only the magnitude features are used to represent the facial Gabor features. Changing the facial positions will cause only slight variations in the magnitude of the Gabor wavelet, while the phase of the Gabor wavelet output is very sensitive to the position of the face [20]. The 200×200 pixels in the Gabor image would produce a large number of features (40,000 feature vectors). To produce a smaller number of features, the image is re-sized to 25×25 pixels, resulting in only 625 features.
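A Python sketch of this filter bank under the reconstructed kernel of Eq. (2) (3 scales × 6 orientations with σ = 2π, p_max = π/2, f = √2; our own NumPy/SciPy code, not the paper's):

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(u, r, size=31, sigma=2 * np.pi):
    p = (np.pi / 2) / np.sqrt(2) ** r * np.exp(1j * np.pi * u / 6)  # p_{u,r}, Eq. (3)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    z = x + 1j * y
    k2 = abs(p) ** 2
    # Eq. (2): Gaussian envelope times a DC-free complex carrier
    return (k2 / sigma ** 2) * np.exp(-k2 * np.abs(z) ** 2 / (2 * sigma ** 2)) * \
           (np.exp(1j * (p * z.conjugate()).real) - np.exp(-sigma ** 2 / 2))

def gabor_magnitudes(image):
    # 18 magnitude responses (3 scales x 6 orientations) per face image
    return [np.abs(fftconvolve(image, gabor_kernel(u, r), mode='same'))
            for r in range(3) for u in range(6)]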
b) Dimension reduction with Principal Component Analysis (PCA): After re-sizing the Gabor image to 25×25 pixels, the image is further compressed, resulting in a 625×1 feature vector for each image. PCA is a way of identifying patterns in the data and expressing the data in such a way as to highlight their similarities and differences [21]. Since the patterns in the data can be difficult to identify in data of high dimensions, where a graphical representation is not available, PCA is a powerful tool for analyzing such data. The other main advantage of PCA is its ability to reduce the number of dimensions without much loss of information. PCA is used in this experiment to reduce the features of each image (its 18 Gabor images). To perform PCA, we reshape each Gabor image into one column, producing a matrix with 625 rows and 18 columns; then the mean value of these Gabor-image columns is subtracted from each Gabor-image column. The subtracted means are the averages across each dimension, and then the covariance matrix is calculated (an 18×18 matrix). Then, based on the covariance matrix, eigenvectors and eigenvalues are calculated. Next, the 16 most significant eigenvalues are selected to obtain the eigenvectors (giving a matrix of 16 columns and 18 rows). These eigenvectors are input into the radial basis function neural network classifier after reshaping them into one row and adding the geometry features (16×18 = 288 appearance features + 19 geometry features).
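A Python sketch of this PCA step as described (625×18 matrix of reshaped Gabor images, 18×18 covariance, 16 leading eigenvectors; our own NumPy code):

import numpy as np

def pca_reduce(gabor_images, n_keep=16):
    # gabor_images: list of 18 arrays of shape (25, 25)
    X = np.stack([g.reshape(-1) for g in gabor_images], axis=1)   # 625 x 18
    X = X - X.mean(axis=0, keepdims=True)     # subtract each column's mean
    C = np.cov(X, rowvar=False)               # 18 x 18 covariance matrix
    vals, vecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:n_keep]]   # 16 most significant
    return top.T.reshape(-1)                  # one row of 16*18 = 288 features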
C. Facial Expression Classification
After extracting features from the face image, the next and last phase in this system is the classification of the facial expressions. The facial feature vector of 307 feature elements (19 geometric features plus 288 appearance features) forms the input of the radial basis function (RBF) neural network, as shown in Figure 5. The RBF neural network has one hidden layer that uses neurons with RBF activation functions $(\varphi_1, \ldots, \varphi_{m_1})$ describing the receptors. The following equation is used to combine linearly the outputs of the hidden neurons:

$$y = \sum_{j=1}^{m_1} w_j\, \varphi_j(\|x - t_j\|) \qquad (4)$$

where $\|x - t\|$ is the distance of the input vector $x = (x_1, \ldots, x_{m_1})$ from the vector $t = (t_1, \ldots, t_{m_1})$, which is called the center, for $m_1$ inputs (receptors). The proposed system uses the Gaussian function described by equation (5):

$$\varphi(r) = \exp\!\left(-\frac{r^2}{2\sigma^2}\right) \qquad (5)$$

where $\sigma$ is the spread parameter of the Gaussian function.
Number of neurons is 250 and RBF spread value is 250 used
in this system with person-dependent datasets and Number of
neurons is 80 and RBF spread value is 300 used in this system
with person-independent datasets.
Figure 5. Radial basis function neural network architecture
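As a sketch of equations (4) and (5), the following Python snippet evaluates a Gaussian RBF hidden layer followed by a linear output; the centers, weights and input below are random placeholders, not the trained values of the proposed system:

    import numpy as np

    def rbf_output(x, centers, weights, spread):
        # y = sum_i w_i * phi(||x - t_i||) with Gaussian phi (equations 4 and 5)
        d = np.linalg.norm(centers - x, axis=1)        # distances ||x - t_i||
        phi = np.exp(-d ** 2 / (2 * spread ** 2))      # Gaussian receptor responses
        return weights @ phi                           # linear combination

    rng = np.random.default_rng(0)
    x = rng.normal(size=307)                           # 307-element facial feature vector
    centers = rng.normal(size=(250, 307))              # 250 neurons (person-dependent setting)
    weights = rng.normal(size=250)
    print(rbf_output(x, centers, weights, spread=250.0))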
IV. EXPERIMENTS AND RESULTS
The performance of the proposed system is analysed on the Cohn-Kanade facial expression database [22], which is commonly used in research on facial expression recognition. This database includes approximately 2000 image sequences of 640×480-pixel arrays with 8-bit grayscale precision, from over 200 subjects. Subjects in the available portion of the database are 97 university students, ranging in age from 18 to 30 years; 65% are female, 15% are African-American, and 3% are Asian or Latino (Figure 6 shows samples from the original database). In order to compare the results of this system with those of Youssif et al. [18], the database used in [18] has been adopted. They prepared two datasets from the available portion to examine the proposed system in two different manners. In the first manner the proposed system is trained on each subject's (person's) different facial expressions, so all subjects with their facial expressions exist in both the training and testing datasets; this is called the person-dependent dataset. In this dataset the last five images of each subject's facial expression were taken, using the odd-numbered ones (60%) for training and the even-numbered ones (40%) for testing. In the second manner the system is trained on facial expressions independently of the subjects. The person-independent dataset was prepared by selecting, from each subject's facial expression images, one image that conveys the facial expression information, to form the seven facial expression classes; then 60% of the images were selected for the training phase and the rest (40%) for testing. In both datasets the normal class was prepared by picking the first image of each expression for each subject.
Figure 6. Samples from the original Cohn-Kanade Facial Expression Database.
The results in Tables 1 and 2 show that the proposed system classifies the facial expressions correctly, with a recognition rate of 97.08% on the person-dependent dataset and 93.98% on the person-independent dataset. From these results we can conclude that the proposed system improves on the recognition rates of the system of Youssif et al. [18] on both datasets, where the percentages were 96% on the person-dependent dataset and 93% on the person-independent dataset. The system is implemented in MATLAB 7.7.0 on a 1.73 GHz Microsoft Windows XP Professional workstation.
V. CONCLUSIONS
This paper proposed a method for automatic facial expression recognition. The system is capable of detecting a human face in static images, extracting features by a hybrid approach, and then classifying the expressions presented in those faces. The hybrid approach to facial feature extraction combines geometric and appearance facial features. A radial basis function (RBF) neural network is used for facial expression recognition.
The proposed system classifies the facial expressions correctly with classification rates between 92% and 100% (recognition rate 97.08%) on the person-dependent dataset and between 84.6% and 100% (recognition rate 93.98%) on the person-independent dataset. On the person-dependent dataset the proposed system classified anger, fear, normal, sad and surprise at maximum rates, while disgust and happy were classified at the minimum rates. On the person-independent dataset the anger, happy, normal and surprise classes were recognized at maximum rates, while the fear and sad classes registered the minimum recognition rates.
REFERENCES
[1] A. Mehrabian, Communication without Words, Psychology Today, Vol. 2, No. 4, pp. 53-56, 1968.
[2] C. Darwin, The expression of the emotions in man and animal,
J.Murray, London, 1872.
[3] P. Ekman and W.V. Friesen, Facial Action Coding System (FACS),
Manual. Palo Alto: Consulting Psychologists Press, 1978.
[4] H. Kobayashi and F. Hara, Recognition of Six Basic Facial Expressions and Their Strength by Neural Network, Proc. Int'l Workshop Robot and Human Comm., pp. 381-386, 1992.
[5] H. Kobayashi and F. Hara, Recognition of Mixed Facial Expressions by
Neural Network, Proc. Int'l Workshop Robot and Human Comm., pp.
387-391, 1992.
[6] H. Kobayashi and F. Hara, Facial Interaction between Animated 3D Face Robot and Human Beings, Proc. Int'l Conf. Systems, Man, and Cybernetics, pp. 3732-3737, 1997.
[7] M. Pantic and L. J. M. Rothkrantz, Facial action recognition for facial expression analysis from static face images, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 34(3):1449-1461, 2004.
[8] X. Feng, Facial expression recognition based on local binary patterns and coarse-to-fine classification, In Proceedings of the Fourth International Conference on Computer and Information Technology, pages 178-183, 2004.
[9] H.B. Deng, L.W. Jin, L.X. Zhen and J.C. Huang, A new facial expression recognition method based on local Gabor filter bank and PCA plus LDA, Int. J. Inf. Technol., 11(11):86-96, 2005.
[10] S. Lajevardi and M. Lech, Facial expression recognition from image sequences using optimized feature selection, Image and Vision Computing New Zealand, IVCNZ 2008, 23rd International Conference, pp. 1-6, 2008.
[11] G. R. S. Murthy, R. S. Jadon, Effectiveness of Eigenspaces for facial
expression recognition, International Journal of Computer Theory and
Engineering, Vol. 1, No. 5, 1793-8201, pp. 638-642. December 2009.
[12] Z. Zhang, M. Lyons, M. Schuster and S. Akamatsu, Comparison between geometry-based and Gabor wavelets-based facial expression recognition using multi-layer perceptron, Proc. Int'l Conf. Automatic Face and Gesture Recognition, pp. 454-459, 1998.
[13] Y. Zhang and Q. Ji, Active and dynamic information fusion for facial expression understanding from image sequences, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5):699-714, 2005.
[14] Jun Ou, Xiao-Bo Bai, Yun Pei, Liang Ma, Wei Liu, Automatic facial expression recognition using Gabor filter and expression analysis, Second International Conference on Computer Modeling and Simulation, IEEE, pp. 215-218, 2010.
[15] M. Monwar, and S. Rezaei, Pain recognition using artificial neural
network, Signal Processing and Information Technology, 2006 IEEE
International Symposium on. 2006.
[16] P. Viola, and M. Jones. Robust real-time object detection, Second
international workshop on statistical and computational theories of
vision-modeling,learning, computing, and sampling. 2001.
[17] Sreekar Krishna. Open CV Viola-Jones Face Detection in Matlab.
[Online] available :http://www.mathworks.com/ matlabcentral
/fileexchange/19912 .June 6, 2010.
[18] A. Youssif, and W. Asker, Automatic Facial Expression Recognition
System Based on Geometric and Appearance Feature, Computer and
Information Science,Vol.4, No.2, March 2011.
[19] T.S. Lee, Image representation using 2D Gabor wavelets, IEEE Trans. Pattern Anal. Mach. Intell., 18:959-971, 1996.
[20] I. Kotsia, I. Buciu and I. Pitas, An analysis of facial expression recognition under partial facial image occlusion, J. Image Vision Computing, 26:1052-1067, 2008.
[21] LI. Smith, A tutorial on principal component analysis, Cornell
University, pp 222, 2002.
[22] M. Anitha, K. Venkatesha and B. Adiga, A survey on facial expression
databases,. International Journal of Engineering Science and
Technology, Vol. 2(10), 5158-5174, 2010.
Design of a High-Performance Low-Power-Consumption Discrete-Time Second-Order Sigma-Delta Modulator Used for an Analog-to-Digital Converter
Radwene LAAJIMI
University of Sfax
Electronics, Micro-technology and Communication
(EMC) research group
Sfax (ENIS), BP W, 3038 Sfax, Tunisia
Mohamed MASMOUDI
University of Sfax
Electronics, Micro-technology and Communication
(EMC) research group
Sfax (ENIS), BP W, 3038 Sfax, Tunisia
Abstract: This paper presents the design and simulation results of a switched-capacitor discrete-time second-order Sigma-Delta modulator used in a 14-bit Sigma-Delta analog-to-digital converter. The operational amplifier is essential for low power consumption; it is designed to provide large bandwidth and moderate DC gain. In 0.35 µm CMOS technology, the modulator achieves 86 dB dynamic range and 85 dB signal-to-noise ratio (SNR) over an 80 kHz signal bandwidth with an oversampling ratio (OSR) of 88, while dissipating 9.8 mW from a 1.5 V supply voltage.
Keywords: CMOS technology; Analog-to-Digital conversion; Low power electronics; Sigma-Delta modulation; switched-capacitor circuits; transconductance operational amplifier.
I. INTRODUCTION
Analog-to-digital converters (ADCs) with high resolution are widely used in the areas of instrumentation, measurement, telecommunications, digital signal processing, consumer electronics, etc. With the advancements in Very Large Scale Integration (VLSI) technologies, the focus has shifted to oversampling Sigma-Delta converters for applications requiring high-precision analog-to-digital conversion with narrow bandwidth [1, 2, 3, 4]. They are preferred because of their inherently relaxed sensitivity to analog circuit errors and reduced analog processing compared to other analog-to-digital conversion techniques. These advantages come at the expense of a relatively large amount of digital processing, and of the major part of the circuit working at a clock rate much higher than the analog-to-digital conversion rate. Because they use higher clock rates and a feedback loop, these converters tend to be robust in the face of analog circuit imperfections [3] and do not require the trimmed components considered necessary in conventional high-precision Nyquist-rate ADCs.
For these reasons, high-precision Sigma-Delta ADCs can be implemented using high-density VLSI processes.
A Sigma-Delta ADC is a system which consists of a Sigma-Delta modulator followed by a digital decimation filter. The modulator oversamples the input signal, i.e. performs sampling at a rate much higher than the Nyquist rate. The ratio of this rate to the Nyquist rate is called the oversampling ratio (OSR).
After oversampling, the modulator typically performs very coarse analog-to-digital conversion on the resulting narrow-band signal. By using coarse digital-to-analog conversion and feedback, the quantization error introduced by the coarse quantizer is spectrally shaped, i.e. the major portion of the noise power is shifted outside the signal band. This process is called quantization noise shaping. The digital decimation filter removes the out-of-band portion of the quantization error and brings the output rate back to the Nyquist rate.
The paper is organized as follows. Section II presents a review of the Sigma-Delta modulator with theoretical analysis. Section III presents the structure of the proposed second-order Sigma-Delta modulator with simulation results. In Section IV, all main parameters of the proposed modulator are indicated, with a full comparison against the most popular designs, whose performance is cited in Table IV. The conclusion is drawn in Section V.
II. REVIEW OF SIGMA-DELTA MODULATOR
Signal-to-noise ratio (SNR) and dynamic range (DR) are the two most important specifications commonly used to characterize the performance of oversampling sigma-delta ADCs [5]. The system design begins with the calculation of the dynamic range. According to the design theory, the DR of a sigma-delta modulator is given by the following formula:
DR(dB) = 10 log10 [ (3/2) · ((2L + 1) / π^(2L)) · OSR^(2L+1) · (2^B − 1)^2 ]    (1)
The theoretical DR is a function of the modulator order L, the oversampling ratio OSR, and the number of bits B in the quantizer. Expression (1) reveals that an additional bit in the internal quantizer yields roughly a 6-dB improvement in DR. This improvement is independent of the OSR, while the improvement obtained by increasing the order L depends on it. The DR of a theoretical L-th order Sigma-Delta converter increases with OSR at (L + 1/2) bits per octave. This is shown in Figure 1, where the DR is plotted as a function of the oversampling ratio and the modulator order, in the case of a single-bit internal quantizer.
Figure 1. DR vs. OSR of L-th order theoretical Sigma-Delta modulators (DR in dB against OSR from 2 to 32, for L = 1, 2, 3).
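As an illustration, a short Python sketch of equation (1) reproduces the trend of Figure 1 for a single-bit internal quantizer (B = 1):

    import numpy as np

    def dr_db(L, osr, B=1):
        # Dynamic range of an L-th order modulator, equation (1), in dB
        dr = 1.5 * (2 * L + 1) / np.pi ** (2 * L) * osr ** (2 * L + 1) * (2 ** B - 1) ** 2
        return 10 * np.log10(dr)

    for L in (1, 2, 3):
        print(L, [round(dr_db(L, osr), 1) for osr in (2, 4, 8, 16, 32)])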
The SNR of a converter is the ratio of the input signal power to the noise power measured at the output of the converter. The maximum SNR that a converter can achieve is called the peak signal-to-noise ratio (SNR_P). The noise here includes both quantization and circuit noise. The SNR_P of the L-th order Sigma-Delta modulator can be calculated as:
SNR_P = (3/2) · ((2L + 1) / π^(2L)) · (2^B − 1)^2 · OSR^(2L+1)    (2)
The SNR of the ADC can be increased by (2L + 1)·3 dB, or L + 0.5 bits, by doubling the oversampling ratio, where L denotes the order of the loop filter. It is tempting to raise the oversampling ratio to increase the SNR of the Sigma-Delta modulator; however, this is restricted by the speed limit of the circuit and by the power consumption.
In practice, for the same performance, a lower oversampling ratio is preferred. Another driving force is the ever-increasing bandwidth requirement, which also calls for a lower oversampling ratio. For high-bandwidth converters, the oversampling ratio should be kept as low as possible, and many efforts have been made at the system level to lower the oversampling ratio while maintaining the same performance.
Starting from the desired 14-bit resolution, the Sigma-Delta ADC was designed. The DR is computed as follows:

DR = 3 · 2^(2N − 1)    (3)

DR = 3 · 2^(2·14 − 1)    (4)

where N represents the number of bits; hence DR(dB) is equal to 86.04 dB. To compute the OSR for first-, second- and third-order modulators, a small MATLAB script was developed using the following equation:
OSR = [ (2/3) · (π^(2L) / (2L + 1)) · DR^2 ]^(1/(2L+1))    (5)
OSR means the oversampling ratio, which is based on the following formula:

OSR = F_s / (2 · f_b)    (6)

We choose an OSR equal to 88; F_s is the sampling frequency, calculated to be 14.08 MHz, and f_b is the base-band frequency (80 kHz). Table I shows the results of equation (5) for the first, second and third order.
TABLE I. OSR FOR FIRST- TO THIRD-ORDER MODULATORS

Modulator order    OSR
1st                960
2nd                88
3rd                24
For a first-order modulator an OSR of 960 is needed, giving a relatively high sampling frequency for the 0.35 µm CMOS ultra-low-voltage system. For the second- and third-order modulators, lower sampling frequencies suffice. Although the third-order modulator needs the lowest sampling frequency, it implies higher power consumption and a larger area. For these reasons, a second-order modulator was chosen for this application.
N is the effective number of bits of the converter, which is given by [6]:

N = (1/2) · log2[ (2L + 1) · (2^B − 1)^2 · OSR^(2L+1) / π^(2L) ]    (7)

where B represents the number of bits of the quantization circuitry (B = 1). In this case, the OSR is equal to 88 and the sampling frequency F_s is calculated to be 14.08 MHz. We obtain an effective number of bits equal to 14 bits, which verifies the two previous results in equations (1) and (3).
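The MATLAB script itself is not listed in the paper; a minimal Python equivalent of equations (5), (6) and (7) under the stated values (DR = 86.04 dB, f_b = 80 kHz, B = 1) might look as follows (rounding of the third-order entry may differ slightly from Table I):

    import numpy as np

    DR = 10 ** (86.04 / 20)          # linear dynamic range for 14-bit resolution

    def osr_for_order(L):
        # Required OSR for an L-th order modulator, equation (5)
        return ((2.0 / 3.0) * np.pi ** (2 * L) / (2 * L + 1) * DR ** 2) ** (1.0 / (2 * L + 1))

    def enob(L, osr, B=1):
        # Effective number of bits, equation (7)
        return 0.5 * np.log2((2 * L + 1) * (2 ** B - 1) ** 2 * osr ** (2 * L + 1) / np.pi ** (2 * L))

    for L in (1, 2, 3):
        print('L =', L, 'OSR ~', round(osr_for_order(L)))
    print('ENOB at L = 2, OSR = 88:', round(enob(2, 88), 1))   # ~14 bits
    print('Fs = 2 * OSR * fb =', 2 * 88 * 80e3 / 1e6, 'MHz')   # equation (6): 14.08 MHz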
III. PROPOSED SECOND-ORDER SIGMA-DELTA MODULATOR
Figure 2. Linear model of the conventional first-order Sigma-Delta modulator (input X, quantization noise E, loop filter H(z), feedback DAC, output Y).
As shown in figure 2, the first-order Sigma-Delta modulator has the advantages of being simple, robust and stable. Despite these good points, its overall performance in terms of resolution and idle-tone generation is inadequate for most applications. The second-order modulator overcomes these disadvantages at the expense of increased circuit complexity and reduced signal range. According to figure 2, the modulator can be considered as a two-input [E(z) and X(z)], one-output [Y(z)] linear system. The output signal is expressed as:
Y(z) = STF(z) · X(z) + NTF(z) · E(z)    (8)

where STF(z) is the signal transfer function and NTF(z) is the noise transfer function, which are given by:

STF(z) = Y(z) / X(z) = H(z) / (1 + H(z))    (9)

NTF(z) = Y(z) / E(z) = 1 / (1 + H(z))    (10)
By the superposition principle, the output signal is obtained as the combination of the input signal and the noise signal, each filtered by the corresponding transfer function:

Y(z) = STF(z) · X(z) + NTF(z) · E(z)    (11)
If we choose, for each stage, STF(z) equal to z^(−1) and NTF(z) equal to 1 − z^(−1), we obtain for the second-order modulator:

STF(z) = z^(−2)    (12)

NTF(z) = (1 − z^(−1))^2    (13)

In this case, the output signal for the ideal linear model can be written as:

Y(z) = z^(−2) · X(z) + (1 − z^(−1))^2 · E(z)    (14)
To achieve the objective cited above, all main parameters of the described modulator are summed up in Table II.

TABLE II. DESIGNED MODULATOR PARAMETERS

Parameters                    Value
Order of modulator            2
Sampling frequency (clock)    14.08 MHz
Signal bandwidth              80 kHz
Oversampling ratio (OSR)      88
Resolution                    14 bits
The functional diagram of the second-order Sigma-Delta modulator simulated using Simulink in MATLAB is shown in Figure 3. The single-bit DAC is replaced by a simple wire. The input is a sinusoidal signal with 0.4 V amplitude and 20 kHz frequency. This signal is fed through two integrators and is connected to the comparator at the output.
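The Simulink model itself is not listed; a behavioural Python sketch of the same loop (two discrete integrators, a sign comparator, and the 1-bit DAC reduced to a wire) under the stated 0.4 V, 20 kHz input and 14.08 MHz clock is given below:

    import numpy as np

    fs, fin, amp, n = 14.08e6, 20e3, 0.4, 1 << 14
    x = amp * np.sin(2 * np.pi * fin * np.arange(n) / fs)   # sinusoidal input

    y = np.zeros(n)                  # 1-bit output stream (+1/-1)
    i1 = i2 = 0.0                    # states of the two integrators 1/(1 - z^-1)
    for k in range(n):
        fb = y[k - 1] if k else 0.0  # DAC feedback is simply the previous output bit
        i1 += x[k] - fb              # first discrete integrator
        i2 += i1 - fb                # second discrete integrator
        y[k] = 1.0 if i2 >= 0 else -1.0   # sign comparator

    # A windowed FFT of y, e.g. np.fft.rfft(y * np.hanning(n)), shows the
    # second-order noise shaping of figures 5 and 6.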
Figure 3. Block diagram of the second-order Sigma-Delta modulator (sinusoidal input, two discrete integrators 1/(1 − z^(−1)), sign comparator, and a unit-delay DAC feedback path).
The modulated output as seen through the scope is shown
in Figure 4 with the input signal overlaid on it.
Figure 4. Pulse-density output from a sigma-delta modulator for a sine-wave input (voltage against time).
A discrete Fourier Transform (DFT) of the sampled output
signal is performed to calculate the SNR of the system. The
logarithm of the amplitude of the signal is plotted versus the
signal frequency and the SNR is found to be close to 85.17 dB
as shown in Figure 5.
Figure 5. Frequency-spectrum zoom of the modulated signal (output spectrum in dB; the marker at 70.55 kHz reads −85.17 dB).
It can be seen that second order noise shaping is taking
place wherein most of the noise is pushed to the higher
frequency bands as shown in figure 6. The original signal can
be retrieved using a digital low pass filter.
Figure 6. Frequency spectrum of the modulated signal (output spectrum in dB over the full 0-14 MHz range).
Figure 7 shows a block diagram of the complete second-order modulator. It is made up of two integrators, a comparator, and a digital-to-analog converter (DAC). These include switches Q and Q̄ for applying one of two reference node voltages, +V_ref and −V_ref, depending on the comparator output polarity. The two integrators are amplifier-based and use two-phase clocking, with the respective phases denoted by Φ1 and Φ2.
As shown in figure 7, signal sampling is performed by the first integrator, which connects to the input. For this reason, the amplifier determines the whole performance of the sigma-delta modulator and needs to be carefully designed. Each integrator is built around one operational amplifier, which provides adequate bandwidth and high voltage gain. It is composed of two input transistors formed by P-channel MOSFETs M1 and M2 in order to reduce 1/f noise [7]. This stage of the op-amp is also formed by N-channel MOSFETs M3 and M4. Transistor M7, used as a P-channel common-source amplifier, forms the second stage of the op-amp. The polarisation block is formed by ten transistors (M5, M6, M8, M9, M10, M11, M12, M13, M14 and M15) with variable input voltage V_in and input resistance R_in.
The overall gain of the amplifier is found to be:

A_v = g_m1 · (r_ds2 // r_ds4) · g_m7 · (r_ds6 // r_ds7)    (15)

where g_mi is the transconductance and r_dsi the drain-to-source resistance of the i-th transistor, with i = 1, 2, 4, 6, 7.

Figure 7. A complete second-order Sigma-Delta modulator.
Figure 8. Miller operational transconductance amplifier (input pair M1-M2, mirror M3-M4, second-stage transistor M7, bias transistors M5, M6 and M8-M15, compensation capacitor Cc, load Cl, input V_in and R_in, supplies Vdd = 1.5 V and Vss = −1.5 V).
Considering the requirements of moderate speed and noise together with the low power consumption of the operational amplifier, the Miller structure is chosen; the circuit is shown in figure 8. The system can have stability problems because the signal passes through a two-stage circuit that may introduce an extra pole and zero in the operational amplifier, so frequency compensation is required. Simulation results using T-Spice show that the frequency response of our operational amplifier achieves a gain of 60 dB with a large gain bandwidth of 82 MHz and a phase margin of 62° to ensure stability (figure 9).
Figure 9. Frequency response (magnitude in dB and phase in degrees, 100 Hz to 100 MHz) of the operational transconductance amplifier.
A. Why high performance and low power?
With the rapid development of computers and communications, a very large number of chips are required to have higher performance, low power and small size. Hence analogue and mixed-signal circuits with low power, and especially Sigma-Delta modulators, become more and more important. The operational amplifier structure used in almost all modulator circuits determines the performance of the analogue part, which largely depends on its characteristics. One popular op-amp is the two-stage Miller op-amp. It introduces the important concept of compensation, whose object is to maintain the stability of the operational amplifier. The Miller op-amp is made up of three stages, even though it is often referred to as a two-stage op-amp, ignoring the buffer stage. The first stage is composed of the input devices of the differential pair, formed by P-channel or N-channel MOSFETs. It plays a very important role due to its differential-input to single-ended-output conversion and its high gain. In addition, this stage also contains the current mirror circuit formed by N-channel MOSFETs. The second stage is formed by only one transistor, which serves as a P-channel common-source amplifier. The current I_bias of the op-amp circuit goes through current mirrors formed by the remaining P-channel transistors in order to produce a low current of 10 µA [17].
According to figure 10, the DC characteristic of the V-I converter is presented for different values of resistance (R_out = 100 Ω, R_out = 2.5 kΩ, R_out = 5 kΩ). On the one hand, the full input-voltage swing capability is evident with true linearity. On the other hand, in order to ensure low voltage and low consumption, an input voltage V_in varied from −1.1 V to 0 V provides a current from −10 µA to 150 µA. In this case the current is equal to 10 µA at an input voltage V_in equal to −1 V.
Figure 10. DC characteristics of the V-I converter for different values of resistance R_out (100 Ω, 2.5 kΩ and 5 kΩ); for V_in = −1 V, I_out = 10 µA.
Figure 8 presents the Miller operational amplifier. In its polarisation circuit, an amplifier between the mirror's input and output transistors is necessary to achieve high current-copy accuracy. We use a simple amplifier composed of only two transistors, proposed by K. Tanno [19] to provide low voltage, low consumption and high performance, as shown in figure 11.
Figure 11. Amplifier configuration: (a) simple differential amplifier structure [18]; (b) two-transistor amplifier structure [19].
TABLE IV. COMPARISON OF THE MOST POPULAR DESIGNS WITH THE CURRENT WORK (*)

Resolution   OSR   SNR(dB)   Speed(MHz)   Power(mW)   Process(CMOS)   Signal bandwidth   Order   Ref.
14-bit       166   -         5.312        0.5         0.35 µm         16 kHz             2       [8]
15-bit       16    96        40           44          0.25 µm         1.25 MHz           2       [9]
10-bit       128   68        1            0.4         0.18 µm         1 MHz              2       [10]
16-bit       64    93        3.2          5           0.35 µm         25 kHz             4       [11]
14-bit       24    85        2.2          200         0.35 µm         100 kHz            6       [12]
11-bit       10    62.5      300          70          0.13 µm         15 MHz             4       [13]
14-bit       96    85        53           15          0.18 µm         300 kHz            2       [14]
16-bit       128   -         5.12         2.6         0.18 µm         20 kHz             3       [15]
8-bit        64    49.7      1.024        6.6         0.6 µm          8 kHz              1       [16]
14-bit*      88    85        14.08        9.8         0.35 µm         80 kHz             2       This work
IV. RESULTS AND COMPARISON
The second-order Sigma-Delta modulator is designed using the AMS 0.35 µm CMOS process; the oversampling ratio is 88 with a signal bandwidth of about 80 kHz.
All main parameters of the described modulator are summed up in Table III.
TABLE III. DESIGNED MODULATOR PARAMETERS

Parameters                    Value
Technology                    AMS 0.35 µm
Order of modulator            2
Sampling frequency (clock)    14.08 MHz
Signal bandwidth              80 kHz
Oversampling ratio (OSR)      88
Reference voltage             1 V
Maximum input                 1 Vpp
Supply voltage                1.5 V
Resolution                    14 bits
Signal-to-noise ratio (SNR)   85 dB
Dynamic range (DR)            86 dB
Quantizer resolution          1 bit
Power consumption             9.8 mW
The current state of the art in Sigma-Delta modulator design is limited by the technology and the sampling speeds it is able to achieve. Table IV compares the most popular published designs with the current work. It can be seen that the current work consumes less power than most published work and achieves a resolution of 14 bits using the AMS 0.35 µm CMOS process.
V. CONCLUSION
The low-power-consumption Sigma-Delta modulator is designed with switched-capacitor techniques, and the resolution reaches 14 bits in the AMS 0.35 µm CMOS process. Compared to other modulators, the second-order single-loop modulator has many advantages in performance, stability, area and system specification requirements, especially power consumption: with a 1.5 V power supply, the power consumption is only 9.8 mW.
In future work, we will study and design the different blocks constituting the complete analogue-to-digital converter (ADC) at transistor level, composed of an analog Sigma-Delta modulator with a digital filter; we will also investigate a multi-bit discrete-time Sigma-Delta ADC.
REFERENCES
[1] Delta Sigma Data Converters: Theory, Design and Simulation. IEEE
Press, IEEE Circuits and Systems Society, 1997.
[2] B. E. Boser and B. A. Wooley. The design of sigma-delta modulation
and analog-to-digital converters. IEEE Journal of Solid-State Circuits,
23:12981308, December 1988.
[3] Oversampling Delta-Sigma Converters. IEEE Press, 1992.
[4] Top-Down Design of High-Performance Sigma-Delta Modulators.
Kluwer Academic Publishers, 1999
[5] Vineeta Upadhyay and Aditi Patwa, Design Of First Order And Second
Order Sigma Delta Analog To Digital Converter,International Journal
of Advances in Engineering & Technology, July 2012.
[6] Medeiro F., del Rio R., de la Rosa J.M., Pérez-Verdú B., A Sigma-Delta modulator design example: from specs to measurements, Barcelona, May 6-10, 2002.
[7] David Johns and Kenneth W. Martin : "Analog Integrated Circuit
Design" John Wiley & Sons, 1997.
[8] F. Munoz, A. P. Vega-Leal, R. G. Carvajal, A. Torralba, J. Tombs, J. Ramirez-Angulo, "A 1.1V Low-Power Sigma-Delta Modulator For 14-b 16 kHz A/D Conversion", The 2001 IEEE International Symposium on Circuits and Systems, Vol. 1, 6-9 May 2001.
[9] KiYoung Nam, Sang-Min Lee, David K. Su, and Bruce A. Wooley: "A Low-Voltage Low-Power Sigma-Delta Modulator for Broadband Analog-to-Digital Conversion", IEEE Journal of Solid-State Circuits, Vol. 40, No. 9, September 2005.
[10] H. Lee, C. Hsu, S. Huang, Y. Shih, C. Luo : " Designing low power of
Sigma Delta Modulator for Biomedical Application", Biomedical
Engineering-Applications, Basis & Communications, Vol. 17 No. 4,
August 2005.
[11] Hsin-Liang Chen, Yi-Sheng Lee, and Jen-Shiun Chiang: "Low Power
Sigma Delta Modulator with Dynamic Biasing for Audio applications",
Circuits and Systems, 5-8 Aug 2007. MWSCAS 2007. 50th Midwest
Symposium on
[12] Morizio J, Hoke I M, Kocak T, Geddie C, Hughes C, Perry J, Madhavapeddi S, Hood M, Lynch G, Kondoh H, Kumamoto T, Okuda T, Noda H, Ishiwaki M, Miki T, Nakaya M, "14-bit 2.2-MS/s sigma-delta ADC's", IEEE Journal of Solid-State Circuits, Volume 35, Issue 7, July 2000, pp. 968-976.
[13] Di Giandomenico A, Paton S, Wiesbauer A, Hernandez L, Potscher T, Dorrer L, "A 15 MHz bandwidth sigma-delta ADC with 11 bits of resolution in 0.13 µm CMOS", IEEE Journal of Solid-State Circuits, Volume 39, Issue 7, July 2004, pp. 1056-1063.
[14] Gaggl R, Wiesbauer A, Fritz G, Schranz Ch., Pessl P, "A 85-dB Dynamic Range Multibit Delta-Sigma ADC for ADSL-CO Applications in 0.18 µm CMOS", IEEE Journal of Solid-State Circuits, Volume 38, Issue 7, July 2003, pp. 1105-1115.
[15] Chen Yueyang, Zhong Shunan, Dang Hua, Design of A low-power-
consumption and high-performance sigma-delta modulator, 2009 World
Congress on Computer Science and Information Engineering
[16] Boujelben S, Rebai Ch., Dallet D, Marchegay Ph., "Design and implementation of an audio analog to digital converter using oversampling techniques", 2001 IEEE.
[17] Laajimi Radwene; Gueddah Nawfil; Masmoudi Mohamed : "Low
Power Variable Gain Amplifier with Bandwidth Of 80-300 MHz Using
For Sigma-Delta Analogue to Digital Converter in Wireless Sensor
Receiver" International Journal of Computer Applications;Apr2012,
Vol. 43
[18] Ahmed Nader Mohieldin, Edgar Sánchez-Sinencio, and José Silva-Martínez: "Nonlinear effects in pseudo differential OTAs with CMFB", IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, Vol. 50, No. 10, October 2003.
[19] K. Tanno, O. Ishizuka and Z. Tang: "Low voltage and low frequency current mirror using a two-MOS subthreshold op-amp", Electronics Letters, 28th March 1996, Vol. 32, No. 7.
Short Answer Grading Using String Similarity And
Corpus-Based Similarity
Wael H. Gomaa
Computer Science Department
Modern Academy for Computer Science & Management
Technology, Cairo, Egypt
Aly A. Fahmy
Computer Science Department
Faculty of Computers and Information, Cairo University
Cairo, Egypt
Abstract: Most automatic scoring systems use pattern-based approaches that require a lot of hard and tedious work. These systems work in a supervised manner where predefined patterns and scoring rules are generated. This paper presents a different, unsupervised approach which deals with students' answers holistically, using text-to-text similarity. Different String-based and Corpus-based similarity measures were tested separately and then combined to achieve a maximum correlation value of 0.504. The achieved correlation is the best value reported for an unsupervised Bag of Words (BOW) approach when compared to previous work.
Keywords: Automatic Scoring; Short Answer Grading; Semantic Similarity; String Similarity; Corpus-Based Similarity.
I. INTRODUCTION
The educational community is growing endlessly, with increasing numbers of students, curriculums and exams. Such a growing community has raised the need for scoring systems that ease the burden of scoring large numbers of exams and at the same time guarantee the fairness of the scoring process. Automatic Scoring (AS) systems evaluate a student's answer by comparing it to model answer(s); the higher the correlation between student and model answers, the more efficient the scoring system. The variety in curriculums forced AS technology to handle different kinds of student responses, such as writing, speaking and mathematics. Writing assessment comes in two forms, Automatic Essay Scoring (AES) and Short Answer Grading; speaking assessment includes low- and high-entropy spoken responses, while mathematical assessment includes textual, numeric or graphical responses. Designing and implementing an Automatic Scoring system for question types such as Multiple Choice, True-False, Matching and Fill-in-the-blank is an easy task. Designing AS systems for scoring essay questions is a more difficult and complicated task, as students' answers require text understanding and analysis. This paper is concerned with the automatic scoring of answers to essay questions. This research presents an unsupervised approach that deals with students' answers holistically and uses text-to-text similarity measures [1, 2]. The proposed model calculates the automatic score by measuring the text similarity between each word in the model answer and all words in the student's answer, which saves the time spent by experts to generate predefined patterns and scoring rules.
Two types of text similarity measures are presented in this research: String-based similarity and Corpus-based similarity. String-based similarity measures operate on string sequences and character composition. Corpus-based similarity identifies the degree of semantic similarity between words; it depends on information derived from large corpora [3].
This paper is organized as follows:
Section II presents related work on the main automatic short answer grading systems.
Section III introduces the two main categories of the used similarity algorithms.
Section IV presents the used data set.
Section V describes the proposed answer grading system.
Section VI shows the experiments' results.
Finally, Section VII presents the conclusion.
II. RELATED WORK
This section describes the most famous short answer grading systems implemented for the English language: C-rater [4, 5, 6], Oxford-UCLES [7, 8], Automark [9], IndusMarker [10], and the Text-to-Text system [1, 2].
C-rater is the system developed by ETS, and it has a strong reputation for high scoring accuracy on short answer responses. The reason behind the high accuracy is the use of deep natural language processing to determine the relatedness of the student response to the concepts listed in the rubric for an item. The C-rater engine applies a sequence of natural language processing steps including correcting students' spelling, determining the grammatical structure of each sentence, resolving pronoun reference, and analyzing paraphrases in the student responses [5, 6]. It has been validated on responses from multiple testing programs with different content areas including science, reading comprehension and history.
Oxford-UCLES is an information-extraction short-answer scoring system that was developed at Oxford University. It uses pattern matching to evaluate the students' answers, where patterns are discovered by human experts. First, it applies simple IE techniques such as nearest-neighbour classification [7]; then machine learning methods like decision tree learning, Bayesian learning and inductive logic programming [8] are used.
Automark is another system that uses IE techniques to explore the meaning or concept of text. The marking process depends mainly on content analysis in addition to specific
style features. Marking goes through five stages: discovering mark scheme templates, syntactic preprocessing, sentence analysis, pattern matching, and a feedback module [9].
IndusMarker is a system that works on the structure of students' answers. It simply uses a question-answer markup language (QAML) to represent the required answer structures. The evaluation process starts with spell checking and some basic linguistic analysis; then the system matches the student's answer text structure against the required saved structure to compute the final mark [10].
The Text-to-Text system, as its name shows, depends mainly on text comparison between the student's answer and the model answer. It does not use any predefined concepts or scoring rules in the evaluation process. In this approach, the evaluation process does not pay much attention to the subject materials, the student's answer methodology, the question type, the length of the answer and similar factors. Different semantic similarity measures were compared in [1, 2], including Knowledge-based and Corpus-based algorithms. This research builds on this approach by combining several String-based and Corpus-based similarity methods.
III. TEXT SIMILARITY MEASURES
Two categories of similarity algorithms are introduced: String-based and Corpus-based similarity. This section handles the two categories in brief.
A. String-Based Similarity
String similarity measures operate on string sequences and character composition. A string metric measures similarity or dissimilarity (distance) between two text strings for approximate string matching or comparison. Applying the concept of a string metric, 13 text similarity algorithms are implemented using SimMetrics [11]. Six of them are character-based while the other seven are term-based distance measures.
1) Character-based distance measures
Damerau-Levenshtein distance is a distance between two
strings, given by counting the minimum number of operations
needed to transform one string into the other, where an
operation is defined as an insertion, deletion, or substitution of
a single character, or a transposition of two adjacent characters
[12,13].
Jaro algorithm is based on the number and order of the common characters between two strings; it takes into account typical spelling deviations and is mainly used in the area of record linkage [14, 15].
Jaro-Winkler distance is an extension of the Jaro distance; it uses a prefix scale which gives more favorable ratings to strings that match from the beginning, for a set prefix length [16].
Needleman-Wunsch algorithm is an example of dynamic programming, and was the first application of dynamic programming to biological sequence comparison. It performs a global alignment to find the best alignment over the entire length of two sequences. It is suitable when the two sequences are of similar length, with a significant degree of similarity throughout [17].
Smith-Waterman algorithm is an example of dynamic
programming; it performs a local alignment to find the best
alignment over the conserved domain of two sequences. It is
useful for dissimilar sequences that are suspected to contain
regions of similarity or similar sequence motifs within their
larger sequence context [18].
N-gram is a sub-sequence of n items from a given sequence of text; N-gram similarity algorithms compare the n-grams of characters or words in two strings. Distance is computed by dividing the number of similar n-grams by the maximal number of n-grams [19].
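As an illustration of this measure (a plain-Python sketch, not the SimMetrics implementation used in the experiments), bigram and trigram similarity can be mixed as follows:

    def ngrams(s, n):
        # Character n-grams of a string
        return [s[i:i + n] for i in range(len(s) - n + 1)]

    def ngram_sim(a, b, n):
        # Shared n-grams divided by the maximal number of n-grams
        ga, gb = ngrams(a, n), ngrams(b, n)
        if not ga or not gb:
            return 0.0
        shared = sum(min(ga.count(g), gb.count(g)) for g in set(ga))
        return shared / max(len(ga), len(gb))

    def mixed_ngram_sim(a, b):
        # Average of bigram and trigram similarity, as combined in the experiments
        return (ngram_sim(a, b, 2) + ngram_sim(a, b, 3)) / 2

    print(mixed_ngram_sim('a location in memory', 'a place in memory'))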
2) Term-based distance measures
Block Distance, also known as Manhattan distance, boxcar distance, absolute-value distance, L1 distance or city-block distance, computes the distance that would be traveled to get from one data point to the other if a grid-like path is followed. The Block distance between two items is the sum of the differences of their corresponding components [20].
Cosine similarity is a measure of similarity between two
vectors of an inner product space that measures the cosine of
the angle between them.
Dice's coefficient is defined as twice the number of common terms in the compared strings divided by the total number of terms in both strings [21].
Euclidean distance or L2 distance is the square root of the
sum of squared differences between corresponding elements
of the two vectors.
Jaccard similarity is computed as the number of shared
terms over the number of all unique terms in both strings [22].
Matching Coefficient is a very simple vector-based approach which simply counts the number of terms (dimensions) on which both vectors are non-zero.
Overlap coefficient is similar to the Dice's coefficient, but
considers two strings a full match if one is a subset of the
other.
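A compact sketch of three of these term-based measures on whitespace tokens (illustrative only, not the SimMetrics code):

    def dice(a, b):
        # Twice the shared terms over the total number of terms
        A, B = set(a.split()), set(b.split())
        return 2 * len(A & B) / (len(A) + len(B))

    def jaccard(a, b):
        # Shared terms over all unique terms in both strings
        A, B = set(a.split()), set(b.split())
        return len(A & B) / len(A | B)

    def overlap(a, b):
        # Reaches 1.0 whenever one term set is a subset of the other
        A, B = set(a.split()), set(b.split())
        return len(A & B) / min(len(A), len(B))

    s1 = 'a location in memory that can store a value'
    s2 = 'a place in memory'
    print(dice(s1, s2), jaccard(s1, s2), overlap(s1, s2))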
B. Corpus-Based Similarity
Corpus-Based similarity is a semantic similarity measure
that determines the similarity between words according to
information gained from large corpora.
Latent Semantic Analysis (LSA) [23] is the most popular Corpus-based similarity technique. LSA assumes that words that are close in meaning will occur in similar pieces of text. A matrix containing word counts per paragraph (rows represent unique words and columns represent paragraphs) is constructed from a large piece of text, and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of columns while preserving the similarity structure among rows. Words are then compared by taking the cosine of the angle between the two vectors formed by any two rows.
Explicit Semantic Analysis (ESA) [24] is a measure used to compute the semantic relatedness between two arbitrary texts. This Wikipedia-based technique represents terms (or texts) as high-dimensional vectors, with each vector entry representing the TF-IDF weight between the term and one Wikipedia article. The semantic relatedness between two terms (or texts) is expressed by the cosine measure between the corresponding vectors.
Pointwise Mutual Information - Information Retrieval (PMI-IR) [25] is a method for computing the similarity between pairs of words; it uses AltaVista's Advanced Search query syntax to calculate probabilities. The more often two words co-occur near each other on a web page, the higher their PMI-IR similarity score.
Extracting DIStributionally similar words using CO-occurrences (DISCO¹) [26, 27] rests on the assumption that words with similar meanings occur in similar contexts. Large text collections are statistically analyzed to obtain the distributional similarity. DISCO computes distributional similarity between words using a simple context window of size 3 words for counting co-occurrences. When two words are compared for exact similarity, DISCO simply retrieves their word vectors from the indexed data and computes the similarity according to the Lin measure [28]. If the most distributionally similar word is required, DISCO returns the second-order word vector for the given word.
DISCO has two main similarity measures DISCO1 and
DISCO2:
DISCO1: Computes the first order similarity between
two input words based on their collocation sets.
DISCO2: Computes the second order similarity
between two input words based on their sets of
distributionally similar words.
This research handled the corpus-based approach via DISCO using its two main similarity measures, DISCO1 and DISCO2.
IV. THE DATA SET
The Texas² short answer grading data set is used [2]. It consists of ten assignments of between four and seven questions each, and two exams with ten questions each. These assignments/exams were given to an introductory computer science class at the University of North Texas; the assignments were administered as part of a Data Structures course, and for each assignment the student answers were collected via an online learning environment. The data set as a whole contains 80 questions and 2273 student answers. The answers were scored by two human judges, using marks between 0 (completely incorrect) and 5 (perfect answer). The data set creators treated the average grade of the two evaluators as the gold standard for examining the automatic scoring task.
1
http://www.linguatools.de/disco/disco_en.html
2
http://lit.csci.unt.edu/index.php?P=research/downloads
Figure 1. Students' Marks Distribution
Table I. Sample Questions, Model Answers and Student Answers (average grades in parentheses)

Question: What is a variable?
Model answer: A location in memory that can store a value.
Student answer 1: A variable is a location in memory where a value can be stored. (5)
Student answer 2: A named object that can hold a numerical or letter value. (4)
Student answer 3: Variable can be a integer or a string in a program (2)

Question: What is the role of a header-file?
Model answer: To store a class interface, including data members and member function prototypes.
Student answer 1: a header file is a file used to store a list of prototype functions and data members. (5)
Student answer 2: to declare the functions being used in the classes. (3)
Student answer 3: Header files have reusable source code in a file that a programmer can use. (2.5)
Figure 1 shows the students' marks distribution, and Table I presents a sample question, model answer, student answers and average grades.
V. ANSWER GRADING SYSTEM
Similar to all automatic short answer grading systems, this system is based on measuring the similarity between the student's answer and the model answer to produce the final score. Pearson's correlation coefficient is then used to specify the correlation between the automatic scores and the average human grades.
The system goes through three stages:
The first stage is measuring the similarity between the model answer and the student answer using the 13 String-based algorithms previously described in Section III. In this stage four methods are used to deal with the strings in the model and student answers: Raw, Stop, Stem and StopStem. The similarity in the Raw method is computed without applying any Natural Language Processing (NLP) task. Stop-word removal is applied in the Stop method using a stop list that contains 429 words. In the Stem method the Porter Stemmer [29] is used to replace each non-stop word with its stem, without removing the stop words. Both stop-word removal and stemming are applied in the StopStem method. Table II presents a sample student answer in the four forms.
Table II. Sample Student Answer with 4 Forms

Method     Student Answer
Raw        removing logical errors testing for valid data random data and actual data
Stop       removing logical errors testing valid data random data actual data
Stem       remov logic error test for valid data random data and actual data
StopStem   remov logic error test valid data random data actual data
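A sketch of the four preprocessing methods using NLTK's Porter stemmer and English stop list (the paper's own 429-word stop list is not reproduced, so outputs may differ slightly from Table II):

    from nltk.corpus import stopwords          # requires nltk.download('stopwords')
    from nltk.stem import PorterStemmer

    STOP = set(stopwords.words('english'))
    stem = PorterStemmer().stem

    def preprocess(text, method):
        # Return the Raw / Stop / Stem / StopStem variant of an answer
        words = text.lower().split()
        if method in ('Stop', 'StopStem'):
            words = [w for w in words if w not in STOP]
        if method in ('Stem', 'StopStem'):
            words = [stem(w) for w in words]
        return ' '.join(words)

    ans = 'removing logical errors testing for valid data random data and actual data'
    for m in ('Raw', 'Stop', 'Stem', 'StopStem'):
        print(m, '->', preprocess(ans, m))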
The second stage is measuring the similarity using the DISCO1 and DISCO2 corpus-based measures. In this stage three tasks are performed: removing the stop words, getting the distinct words, and constructing the similarity matrix. The similarity matrix represents the similarity between each distinct word in the model answer and each distinct word in the student's answer. Each row represents one word in the model answer, and each column represents one word in the student's answer. The last two columns hold the maximum and the average similarity of each word in the model answer.
After constructing the similarity matrix, the final overall similarity is computed with two methods, Max Overall Similarity and Average Overall Similarity, by averaging the last two columns (Max, Average). The final overall similarity refers to the student's mark. For clarification, consider the following walkthrough example measuring the similarity between the following model and student answers:
Model Answer: "To store a class interface including
data members and member function prototypes."
Student Answer: "Header files have reusable source
code in a file that a programmer can use."
The first step is removing the stop words from the two strings ("To", "a", "and" in the model answer and "have", "in", "a", "that", "can", "use" in the student answer).
The second step is getting the distinct words from the two strings; two words are considered equal if they have the same stem ("members, member" in the model answer and "files, file" in the student answer).
The third step is constructing the similarity matrix. Table III presents the similarity matrix using DISCO2 with the Wikipedia data packet.
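A sketch of this stage, where word_sim is a pluggable word-similarity function standing in for a DISCO1/DISCO2 lookup (DISCO's own API is not shown in the paper, so this interface is an assumption):

    import numpy as np

    def overall_similarity(model_words, student_words, word_sim):
        # Build the similarity matrix: one row per model-answer word,
        # one column per student-answer word.
        M = np.array([[word_sim(m, s) for s in student_words] for m in model_words])
        row_max = M.max(axis=1)      # best match for each model-answer word
        row_avg = M.mean(axis=1)     # average match for each model-answer word
        return row_max.mean(), row_avg.mean()   # (Max, Average) overall similarity

    # Toy usage with exact-match word similarity:
    mx, avg = overall_similarity(['store', 'class', 'interface'],
                                 ['header', 'file', 'interface'],
                                 lambda a, b: 1.0 if a == b else 0.0)
    print(mx, avg)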
The third stage is combining the similarity values obtained from both the String-based and Corpus-based measures. Many studies have adopted the idea of mixing the results from different measures to enhance the overall similarity [30, 31, 32, 33]. The steps of the proposed combining task are illustrated in the next section.
Table III. Similarity matrix using DISCO2 Corpus-based similarity

             header   file    reusable   source   code    programmer   MAX     AVG
store        0.282    0.399   0.12       0.285    0.266   0.193        0.399   0.257
class        0.035    0.052   0.044      0.049    0.067   0.031        0.067   0.046
interface    0.439    0.697   0.234      0.383    0.522   0.389        0.697   0.444
including    0.003    0.009   0.004      0.009    0.005   0.008        0.009   0.006
data         0.468    0.61    0.163      0.017    0.45    0.253        0.61    0.326
member       0.026    0.037   0.006      0.095    0.046   0.132        0.132   0.057
function     0.2      0.261   0.057      0.3      0.285   0.147        0.3     0.208
prototypes   0.078    0.106   0.139      0.125    0.019   0.094        0.139   0.093
Final overall similarity                                               0.294   0.179
VI. EXPERIMENTS' RESULTS AND DISCUSSION
Pearson's correlation coefficient was used to specify the correlation between the automatic scores and the average human grades.
A. Experiments Results using String-Based Similarity
As mentioned above, the 13 string-based algorithms were tested with four different methods: Raw, Stop, Stem and StopStem. Table IV presents the correlation results between model and student answers using both Character-based and Term-based measures. Among the character-based distance measures, N-gram similarity achieved the best correlation value, 0.435, applied to the raw text by mixing the results obtained from the bi-gram and tri-gram similarity measures.
Among the term-based distance measures, Block Distance achieved the best correlation value, 0.382, applied to the text after removing the stop words from both model and student answers.
The stop-word removal task enhanced the correlation results, especially for term-based measures. The stemming process did not enhance the results in all cases.
B. Experiments Results using Corpus-Based Similarity
The DISCO measures are applied using two data packets, Wikipedia and the British National Corpus (BNC); features of both are presented in Table V.
Table IV. Correlation results of the String-Based similarity measures

                            Raw     Stop    Stem    StopStem
Character-based distance measures
Damerau-Levenshtein         0.338   0.324   0.317   0.315
Jaro                        0.144   0.229   0.146   0.205
Jaro-Winkler                0.151   0.245   0.169   0.223
Needleman-Wunsch            0.265   0.265   0.255   0.258
Smith-Waterman              0.361   0.341   0.351   0.331
N-gram (bi-gram+tri-gram)   0.435   0.416   0.413   0.398
Term-based distance measures
Block Distance              0.375   0.382   0.34    0.291
Cosine similarity           0.376   0.377   0.344   0.308
Dice's coefficient          0.368   0.379   0.337   0.307
Euclidean distance          0.326   0.338   0.312   0.281
Jaccard similarity          0.332   0.349   0.311   0.294
Matching Coefficient        0.305   0.339   0.294   0.264
Overlap coefficient         0.374   0.368   0.336   0.286
Table V. DISCO data packets

                            Wikipedia               BNC
Packet name                 en-wikipedia-20080101   en-BNC-20080721
Packet size                 5.9 Gigabyte            1.7 Gigabyte
Number of tokens            267 million             119 million
Number of queriable words   220,000                 122,000
Table VI presents the correlation results between all the model and student answers using the Disco1 and Disco2 measures. As mentioned in Section V, MAX and AVG refer to Max Overall Similarity and Average Overall Similarity respectively.

Table VI. Correlation results using Disco1 and Disco2

          Wikipedia         BNC
          MAX     AVG       MAX     AVG
Disco1    0.465   0.445     0.450   0.412
Disco2    0.428   0.410     0.415   0.409
Among the Corpus-based measures, Disco1 achieved the best correlation value, 0.465, using the Wikipedia data packet and the Max overall similarity method. Similarity measures using the Wikipedia packet achieved higher correlation than BNC due to the corpus size and the other features shown in Table V. Using the Max overall similarity method clearly enhanced the correlation results in all cases for the corpus-based measures.
C. Experiments Results via Combining String-Based and Corpus-Based Similarity
As previously mentioned, many studies adopt the idea of mixing results from different measures to enhance the overall similarity. The proposed system combined the best algorithm from each category. The three selected measures are:
N-gram, representing character-based string similarity, applied to the raw text.
Block Distance, representing term-based string similarity, applied to the text after removing stop words.
Disco1 using Max overall similarity, representing corpus-based similarity.
The similarity values resulting from the three measures are compared, the maximum and average similarity values for each student's answer are selected, and then the correlation between all student and model answers is recomputed.
The four possible combinations are presented in Table VII. These cases emphasize the idea of mixing String-based similarity measures with Corpus-based similarity measures to get the advantages of both. The correlation results are enhanced from 0.465, the best value achieved by applying all the measures separately, to 0.504, obtained by combining the N-gram and Disco1 measures.
Table VII. Correlation results based on the combining method
(combinations of N-gram, Block Distance and Disco1)

MAX     AVG
0.457   0.414
0.504   0.470
0.411   0.394
0.475   0.443
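The combining step itself reduces to taking the maximum (or the average) of the per-answer scores produced by the selected measures; a one-line sketch with hypothetical scores:

    def combine(scores):
        # scores: per-answer similarities from the chosen measures,
        # e.g. N-gram and Disco1
        return max(scores), sum(scores) / len(scores)

    print(combine([0.42, 0.55]))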
D. Discussion
As previously mentioned in Section II, the research most related to this work was introduced in [1, 2]. The discussion here is a comparison between the results of those previous studies and the results of the proposed research. Attention is given to studies that treat text as a bag of words (BOW) in an unsupervised way, where neither complex NLP tasks nor machine learning algorithms are applied. The dataset experimented on in [1] contained 21 questions and 610 answers; LSA (BNC), LSA (Wikipedia), ESA (Wikipedia) and tf*idf were tested, with results of 0.407, 0.428, 0.468 and 0.364 respectively. The dataset experimented on in [2] was the Texas dataset previously introduced in Section IV; LSA, ESA and tf*idf were tested, and the results were 0.328, 0.395 and 0.281 respectively.
Compared to these previous results, the proposed system achieved better results in most of the tested cases. String-based similarity measures enhanced the correlation results compared to the simple tf*idf method. The DISCO similarity measures also achieved better results than the best-known corpus-based methods, LSA and ESA.
Combining String-based and Corpus-based similarity in an unsupervised way raised the correlation to 0.504. This value is the best correlation result achieved compared to previous work and is very promising, as these measures need neither complex supervised learning nor NLP tasks such as part-of-speech tagging or syntactic parsing. This value is also very near the correlation values obtained from learning with different supervised machine learning algorithms and graph alignment in [2].
VII. CONCLUSION
In this research, the short answer grading task is handled with an unsupervised bag-of-words approach. This approach is easy to implement, as it requires neither complex NLP tasks nor supervised learning algorithms. The data set used contains 81 questions and 2273 student answers.
The proposed model goes through three stages. The first stage measures the similarity between the model answer and the student answer using 13 string-based algorithms, six of them character-based and the other seven term-based. The best correlation values achieved by the character-based and term-based measures were 0.435 and 0.382, using N-gram and Block Distance, respectively. The second stage measures the similarity using the DISCO1 and DISCO2 corpus-based measures; Disco1 achieved a correlation of 0.465 using the Max overall similarity method.
The third stage measures the similarity by combining the string-based and corpus-based measures. The best correlation value, 0.504, was obtained by mixing the N-gram and Disco1 similarity values. The proposed model achieved strong results compared with previous work. Future work will focus on applying short answer grading to other languages, such as Arabic. A very encouraging factor is the ability of the DISCO package to work with nine languages; the main obstacle is the unavailability of short answer grading data sets in languages other than English.
REFERENCES
[1] M. Mohler and R. Mihalcea, "Text-to-text semantic similarity for
automatic short answer grading", In Proceedings of the European
Association for Computational Linguistics (EACL 2009), Athens,
Greece, 2009.
[2] M. Mohler, R. Bunescu & R. Mihalcea, "Learning to Grade Short
Answer Questions using Semantic Similarity Measures and Dependency
Graph Alignments", In Proceedings of the 49th Annual Meeting of the
Association for Computational Linguistics: Human Language
Technologies. Portland, Oregon, USA: Association for Computational
Linguistics, pp. 752–762, 2011.
[3] R. Mihalcea, C. Corley, and C. Strapparava, "Corpus-based and
knowledge-based approaches to text semantic similarity", In Proceedings
of the American Association for Artificial Intelligence (AAAI 2006),
Boston, 2006.
[4] C. Leacock and M. Chodorow, "C-rater: Automated Scoring of Short-Answer Questions", Computers and the Humanities, vol. 37, no. 4, pp. 389-405, Nov. 2003.
[5] J. Sukkarieh and S. Stoyanchev, "Automating model building in
C-rater", In Proceedings of the Workshop on Applied Textual Inference,
pages 61–69, Suntec, Singapore, August 2009.
[6] J. Z. Sukkarieh & J. Blackmore, "c-rater: Automatic Content Scoring for
Short Constructed Responses", Proceedings of the 22nd International
FLAIRS Conference, Association for the Advancement of Artificial
Intelligence, 2009.
[7] J.Z. Sukkarieh, S.G. Pulman, and N. Raikes, "Auto-Marking 2: An
Update on the UCLES-Oxford University research into using
Computational Linguistics to Score Short, Free Text Responses",
International Association of Educational Assessment, Philadelphia, 2004.
[8] J. Z. Sukkarieh and S. G. Pulman, "Automatic Short Answer Marking".
Proceedings of the 2nd Workshop on Building Educational Applications
Using NLP, Association for Computational Linguistics, pp. 9-16, June
2005.
[9] T. Mitchell, T. Russel, P. Broomhead and N. Aldridge, "Towards robust
computerized marking of free-text responses". Proceedings of the Sixth
International Computer Assisted Assessment Conference,
Loughborough, UK: Loughborough University, 2002.
[10] Raheel Siddiqi, Christopher J. Harrison, and Rosheena Siddiqi, "Improving Teaching and Learning through Automated Short-Answer Marking", IEEE Transactions on Learning Technologies, Vol. 3, No. 3, July-September 2010.
[11] S. Chapman, "Simmetrics: a java & c# .net library of similarity
metrics", http://sourceforge.net/projects/simmetrics/, 2006.
[12] P. A. V. Hall and G. R. Dowling, "Approximate string matching", Computing Surveys, 12:381–402, 1980.
[13] J. L. Peterson, "Computer programs for detecting and correcting spelling
errors", Comm. Assoc. Comput. Mach., 23:676-687 ,1980.
[14] M. A. Jaro, "Advances in record linkage methodology as applied to the 1985 census of Tampa, Florida", Journal of the American Statistical Association 84 (406): 414–420, 1989.
[15] M. A. Jaro, "Probabilistic linkage of large public health data files", Statistics in Medicine 14 (5-7): 491–498, 1995.
[16] W. E. Winkler, "String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage", Proceedings of the Section on Survey Research Methods (American Statistical Association): 354–359, 1990.
[17] S. B. Needleman and C. D. Wunsch, "A general method applicable to the search for similarities in the amino acid sequence of two proteins", Journal of Molecular Biology 48(3): 443–453, 1970.
[18] T. F. Smith and M. S. Waterman, "Identification of Common Molecular Subsequences", Journal of Molecular Biology 147: 195–197, 1981.
[19] Alberto Barrón-Cedeño, Paolo Rosso, Eneko Agirre, and Gorka Labaka, "Plagiarism Detection across Distant Language Pairs", In Proceedings of the 23rd International Conference on Computational Linguistics, pages 37–45, 2010.
[20] Eugene F. Krause, "Taxicab Geometry", Dover, ISBN 0-486-25202-7, 1987.
[21] L. Dice, "Measures of the amount of ecologic association between
species", Ecology, 26(3), 1945.
[22] P. Jaccard, "Étude comparative de la distribution florale dans une portion des Alpes et du Jura", Bulletin de la Société Vaudoise des Sciences Naturelles 37, 547–579, 1901.
[23] T.K. Landauer and S.T. Dumais, "A solution to Plato's problem: The
latent semantic analysis theory of acquisition, induction, and
representation of knowledge", Psychological Review, 104, 1997.
[24] E. Gabrilovich and S. Markovitch,"Computing Semantic Relatedness
using Wikipedia-based Explicit Semantic Analysis", Proceedings of the
20th International Joint Conference on Artificial Intelligence, pages 6–12, 2007.
[25] P. Turney, "Mining the web for synonyms: PMI-IR versus LSA on
TOEFL", In Proceedings of the Twelfth European Conference on
Machine Learning (ECML-2001), 2001.
[26] Peter Kolb, "Experiments on the difference between semantic similarity
and relatedness", In Proceedings of the 17th Nordic Conference on
Computational Linguistics - NODALIDA '09, Odense, Denmark, May
2009.
[27] Peter Kolb, "DISCO: A Multilingual Database of Distributionally
Similar Words", In Proceedings of KONVENS-2008, Berlin, 2008.
[28] D. Lin, "Extracting Collocations from Text Corpora", First Workshop on Computational Terminology, pages 57–63, Montreal, Canada, 1998.
[29] M. F. Porter, "An algorithm for suffix stripping", Program, 14(3), 130–137, 1980.
[30] Y. Li, D. McLean, Z. Bandar, J. O'Shea, K. Crockett, "Sentence similarity based on semantic nets and corpus statistics", IEEE Transactions on Knowledge and Data Engineering, 18(8), 1138–1149, 2006.
[31] A. Islam, D. Inkpen, "Semantic text similarity using corpus-based word similarity and string similarity", ACM Transactions on Knowledge Discovery from Data, 2(2), 1–25, 2008.
[32] Nitish Aggarwal, Kartik Asooja, Paul Buitelaar, "DERI&UPM: Pushing Corpus Based Relatedness to Similarity: Shared Task System Description", First Joint Conference on Lexical and Computational Semantics (*SEM), pages 643–647, Montreal, Canada, Association for Computational Linguistics, June 7-8, 2012.
[33] Davide Buscaldi, Ronan Tournier, Nathalie Aussenac-Gilles and Josiane
Mothe, "IRIT: Textual Similarity Combining Conceptual Similarity with
an N-Gram Comparison Method", First Joint Conference on Lexical and
Computational Semantics (*SEM), pages 552–556, Montreal, Canada,
Association for Computational Linguistics, June 7-8 , 2012.
AUTHORS PROFILE
Wael Hasan Gomaa is currently working as a teaching assistant in the Computer Science Department, Modern Academy for Computer Science & Management Technology, Cairo, Egypt. He is a Ph.D. student at the Faculty of Computers and Information, Cairo University, Egypt, in the field of automatic assessment, under the supervision of Prof. Aly Aly Fahmy. He received his B.Sc. and Master's degrees from the Faculty of Computers and Information, Helwan University, Egypt. His Master's thesis was entitled "Text Mining Hybrid Approach for Clustered Semantic Analysis". His research interests include Natural Language Processing, Artificial Intelligence, Data Mining and Text Mining. E-mail: [email protected]
Prof. Aly Aly Fahmy is the former Dean of the Faculty of Computers and Information, Cairo University, and a Professor of Artificial Intelligence and Machine Learning in the Department of Computer Science. He graduated from the Department of Computer Engineering, Technical College, with an honors degree. He specialized in Mathematical Logic and did his research with Dr. Hervé Gallaire, the former vice president of Xerox Global. He received a master's degree from the National School of Space and Aeronautics ENSAE, Toulouse, France, in 1976 in the field of logical database systems, and then obtained his PhD from the Centre for Studies and Research CERT-DERI, Toulouse, France, in 1979 in the field of Artificial Intelligence.
He received practical training in the field of Operating Systems and
Knowledge Based Systems in Germany and the United States of America. He
participated in several national projects including the establishment of the
Egyptian Universities Network (currently hosted at the Egyptian Academy of
Scientific Research at the Higher Ministry of Education), building Expert
Systems in the field of iron and steel industry and building Decision Support
Systems for many national entities.
Prof. Fahmy's main research areas are: Data and Text Mining, Mathematical Logic, Computational Linguistics, Text Understanding, Automatic Essay Scoring and technologies of man-machine interfaces in Arabic. He has published many refereed papers and authored the book "Decision Support Systems and Intelligent Systems" in Arabic.
He was the Director of the first Center of Excellence in Egypt in the field
of Data Mining and Computer Modeling (DMCM) in the period of 2005-
2010. DMCM was a virtual research center with more than 40 researchers
from universities and industry. It was founded by an initiative of Dr. Tarek
Kamel, Minister of Communications and Information Technology.
Prof. Aly Fahmy is currently involved in the implementation of the exploration project of Masters and Doctorate theses of Cairo University, under the supervision of Prof. Dr. Hussein Khaled, Vice President of Cairo University for Postgraduate Studies and Research. The project aims to assess the practical implementation of Cairo University's last strategic research plan and to assist in the formulation of the new strategic research plan for the coming period, 2011-2015. E-mail: [email protected]
Hybrid Intelligent System for Sales Forecasting using
Delphi and Adaptive Fuzzy Back-Propagation Neural
Networks
Attariuas Hicham
LIST, Group Research in Computing and
Telecommunications (ERIT) FST
Tangier, BP : 416, Old Airport Road, Morocco
Bouhorma Mohammed, Sofi Anas
LIST, Group Research in Computing and
Telecommunications (ERIT) FST
Tangier, BP: 416, Old Airport Road, Morocco
Abstract—Sales forecasting is one of the most crucial issues addressed in business. The control and evaluation of future sales still concern researchers as well as policy makers and company managers. This research proposes an intelligent hybrid sales forecasting system, Delphi-FCBPN, based on the Delphi method, fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate. The proposed model integrates expert judgments, through the Delphi method, to enhance the FCBPN model. Winters exponential smoothing is used to take the trend effect into consideration. The data for this research come from an industrial company that manufactures packaging. Analysis of the results shows that the proposed model outperforms three other forecasting models on the MAPE and RMSE measures.
Keywords-component; hybrid intelligence approach; Delphi method; sales forecasting; fuzzy clustering; fuzzy system; back-propagation network.
I. INTRODUCTION
Sales forecasting plays a very important role in business strategy. To strengthen its competitive advantage in a constantly changing environment, the manager of a company must make the right decision at the right time based on the information at hand. Obtaining effective sales forecasts in advance can help decision makers to calculate production and material costs, determine sales prices, plan strategic operations management, etc.
A hybrid intelligent system is a system that uses a parallel combination of methods and techniques from artificial intelligence. Since hybrid intelligent systems can solve non-linear prediction problems, this article proposes the integration of the hybrid FCBPN system into an ERP architecture to improve and extend the sales management module, providing sales forecasts that meet the needs of the company's decision makers.
The remainder of the article is organized as follows: Section 2 is the literature review. Section 3 describes the construction and role of each component of the proposed Delphi-FCBPN sales forecasting system. Section 4 describes the sample selection and data analysis. Finally, Section 5 provides a summary and conclusions.
II. LITERATURE AND RELATED RESEARCH
Enterprise Resource Planning (ERP) is a standard for a complete enterprise management system. It emphasizes the integration of the flow of information relating to the major functions of the firm [29]. There are four typical ERP modules, and sales management is one of the most important. Sales management is highly relevant in today's business world; it directly impacts the operating rate and quality of the enterprise and the quality of business management. Therefore, integrating a sales forecasting system into the sales management module has become an urgent project for many companies that implement ERP systems.
Many researchers conclude that BPN is an effective forecasting method that can also be used to find the key factors that help enterprises improve their logistics management. Zhang, Wang and Chang (2009) [28] used back-propagation neural networks (BPN) to forecast safety stock. Zhang, Haifeng and Huang (2010) [29] used BPN for sales forecasting based on an ERP system and found that BPN can serve as an accurate sales forecasting system.
Reliable prediction of sales has become a vital task of business decision making, and companies that use an accurate sales forecasting system earn important benefits. Sales forecasting is both necessary and difficult. It is necessary because it is the starting point of many management tools: production schedules, finance, marketing plans, budgeting, and promotion and advertising plans. It is difficult because, regardless of the quality of the methods adopted, predicting the future with certainty is out of reach; the parameters are numerous, complex and often unquantifiable.
Recently, combined intelligence techniques using artificial neural networks (ANNs), fuzzy logic, Particle Swarm Optimization (PSO) and genetic algorithms (GAs) have been demonstrated to be innovative forecasting approaches. Since most sales data are non-linear and complex, many studies apply hybrid models to time-series forecasting. Kuo and Chen (2004) [20] used a combination of neural networks and fuzzy systems to deal effectively with marketing problems.
The rate of convergence of traditional back-propagation networks is very slow because it depends on the choice of the learning rate parameter. However, experimental results (2009 [25]) showed that using an adaptive learning rate during training can lead to much better results than the traditional neural network model (BPN).
Many papers indicate that systems hybridizing fuzzy logic and neural networks can perform more accurately than conventional statistical methods and single ANNs. Kuo and Xue (1999) [21] proposed a fuzzy neural network (FNN) as a model for sales forecasting, using fuzzy logic to extract the experts' fuzzy knowledge. Toly Chen (2003) [27] used a model for wafer fab prediction based on a fuzzy back-propagation network (FBPN). The proposed system incorporates production control expert judgments to enhance the performance of an existing crisp back-propagation network, and the results showed that the FBPN performed better than the BPN. Efendigil, Önüt and Kahraman (2009) [16] used a forecasting system based on artificial neural networks (ANNs) and adaptive network-based fuzzy inference systems (ANFIS) to predict fuzzy demand with incomplete information.
Attariuas, Bouhorma and El Fallahi [30] proposed a hybrid sales forecasting system based on fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate (FCBPN). Their experimental results show that the model outperforms previous and traditional approaches (BPN, FNN, WES, KGFS) and is therefore a very promising solution for industrial forecasting.
Chang and Wang (2006) [6] used a fuzzy back-propagation network (FBPN) for sales forecasting. The opinions of sales managers about the importance of each input were converted into prespecified fuzzy numbers and integrated into the proposed system. They concluded that the FBPN approach outperforms traditional methods such as grey forecasting, multiple regression analysis and back-propagation networks.
Chang, Liu and Wang (2006) [7] proposed a fusion of SOM, ANNs, GAs and FRBS for PCB sales forecasting. They found that the performance of the model was superior to previous methods proposed for PCB sales forecasting.
Chang, Wang and Liu (2007) [10] developed a weighted evolving fuzzy neural network (WEFuNN) model for PCB sales forecasting. The proposed model was based on a combination of sales key factors selected using GRA, and the experimental results showed that this hybrid system is better than previous hybrid models.
Chang and Liu (2008) [4] developed a hybrid model based on the fusion of case-based reasoning (CBR) and fuzzy multicriteria decision making. The experimental results showed that the performance of the fuzzy case-based reasoning (FCBR) model is superior to traditional statistical models and BPN.
A hybrid intelligent clustering forecasting system based on change point detection and artificial neural networks was proposed by Kyong and Han (2001) [22]. The basic concept of the model is to obtain significant intervals by change point detection. They found that the proposed models are more accurate and convergent than the traditional neural network model (BPN).
Chang, Liu and Fan (2009) [5] developed a K-means clustering and fuzzy neural network (FNN) approach to estimate the future sales of PCB. They used K-means to cluster the data into different clusters that are fed into independent FNN models. The experimental results show that the proposed approach outperforms traditional forecasting models such as BPN, ANFIS and FNN.
Recently, some researchers have shown that hybridizing fuzzy logic and GAs into genetic fuzzy systems (GFSs) (Cordón, Herrera, Hoffmann, & Magdalena (2001) [13]) gives more accurate and efficient results than traditional intelligent systems. Casillas & Martínez-López (2009) [24] and Martínez-López & Casillas (2009) [23] used GFSs in various management cases and obtained good results.
Chang, Wang and Tsai (2005) [3] used back-propagation neural networks trained by a genetic algorithm (ENN) to estimate the production demand of printed circuit boards (PCB). The experimental results show that the performance of ENN is better than that of BPN.
Hadavandi, Shavandi and Ghanbari (2011) [18] proposed a novel sales forecasting approach that integrates genetic fuzzy systems (GFS) and data clustering to construct a sales forecasting expert system. They used a GFS to extract the whole knowledge base of the fuzzy system for sales forecasting problems. Experimental results show that the proposed approach outperforms previous approaches.
This article proposes an intelligent hybrid sales forecasting system, Delphi-FCBPN, based on the Delphi method, fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate (FCBPN), for sales forecasting in the packaging industry.
III. DEVELOPMENT OF THE DELPHI-FCBPN MODEL
The proposed approach is composed of three stages, as shown in Figure 1: (1) data collection stage: the key factors that influence sales are collected using the Delphi method through expert judgments; (2) data preprocessing stage: rescaled range analysis (R/S) is used to evaluate the effects of trend, and Winters exponential smoothing is used to take the trend effect into consideration; (3) FCBPN learning stage: a hybrid sales forecasting system based on Delphi, fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate (FCBPN) is used.
The data for this study come from an industrial company that manufactures packaging in Tangier, covering 2001-2009. The monthly sales amount is the target of the forecasting model.
A. Stage of data collection
Data collection is based on production control expert judgments. Sales managers and production control experts are requested to express their opinions about the importance of each input parameter in predicting sales, and the Delphi method is then applied to select the key forecasting factors for the packaging industry.
1) The collection of factors that influence sales
The packaging industry operates in a highly variable environment. The variables of packaging industry sales can be subdivided into three domains: (1) the market demand domain; (2) the macroeconomics domain; (3) the industrial production domain. To collect all possible factors affecting packaging industry sales from these three domains, sales managers and production control experts are requested to list all possible attributes affecting packaging industry sales.
2) Delphi method to select the factors affecting sales
The Delphi method was first developed by Dalkey and Helmer (1963) at the RAND Corporation and has been widely applied in many management areas, e.g. forecasting, public policy analysis and project planning. The principle of the Delphi method is to submit several rounds of questionnaires to a group of experts. After each round, an anonymous synthesis of the responses, together with the experts' arguments, is given to them, and the experts are then asked to revise their earlier answers in light of these elements. It is usually found that, as a result of this process (which can be repeated several times if necessary), the differences fade and the responses converge towards the "best" answer.
Figure 1: Architecture of the Delphi-FCBPN model
The Delphi method was used to choose, from all the possible factors collected in this research, the main factors that influence the sales quantity. The procedure of the Delphi method is as follows:
1. Collect from the domain experts all possible factors in the ERP database that may affect monthly sales. This is the first questionnaire survey.
2. Conduct the second questionnaire, asking the domain experts to assign to each factor a fuzzy number ranging from 1 to 5 that represents its significance for sales.
3. Finalize the significance number of each factor in the questionnaire according to the index generated in step 2.
4. Repeat steps 2 and 3 until the results of the questionnaire converge.
B. Data preprocessing stage
Based on the Delphi method, the key factors that influence sales are K1, K2 and K3 (see Table 1). When seasonal and trend variation is present in the time-series data, forecasting accuracy is affected. R/S analysis is used to detect whether such effects are present in the data; if they are observed, Winters exponential smoothing is used to take the effects of seasonality and trend into consideration.
Input   Description
K1      Manufacturing consumer index
N1      Normalized manufacturing consumer index
K2      Offers competitive index
N2      Normalized offers competitive index
K3      Packaging total production value index
N3      Normalized packaging total production value index
K4      Preprocessed historical data (WES)
N4      Normalized preprocessed historical data (WES)
Y0      Actual historical monthly packaging sales
Y       Normalized actual historical monthly packaging sales

Table 1: Description of the inputs of the forecasting model.
1) R/S analysis (rescaled range analysis)
To eliminate possible trend influence, rescaled range analysis, invented by Hurst (Hurst, Black, & Simaika, 1965), is used to study records in time or series of observations at different times. Hurst spent his lifetime studying the Nile and the problems related to water storage; the problem is to determine the design of an ideal reservoir on the basis of a given record of observed discharges from the lake. The R/S analysis is introduced as follows:
Consider the series $X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ is the sales amount in period i, and compute the mean
$M_N = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The standard deviation S is defined as
$S = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - M_N)^2}$.
For each point i in the time series, we compute the cumulative deviation
$Z_i = \sum_{j=1}^{i} (x_j - M_N)$,
and the range $R = \max_i Z_i - \min_i Z_i$. The H coefficient is then computed from
$R/S = (n/a)^H$, i.e. $H = \log(R/S) / \log(n/a)$, here $a = 1$.
When 0 < H < 0.5, the self-similar correlations at all timescales are anti-persistent, i.e. increases at any time are more likely to be followed by decreases over all later time scales. When H = 0.5, the self-similar correlations are uncorrelated. When 0.5 < H < 1, the self-similar correlations at all timescales are persistent, i.e. increases at any time are more likely to be followed by increases over all later time scales.
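A minimal Python sketch of this R/S computation is given below, following the standard formulation above; the function name and the single-window treatment of the whole series are illustrative assumptions.

import math

def hurst_rs(x, a=1.0):
    # mean M_N and standard deviation S of the series
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)
    # cumulative deviations Z_i from the mean, and their range R
    z, acc = [], 0.0
    for xi in x:
        acc += xi - mean
        z.append(acc)
    r = max(z) - min(z)
    # (R/S) = (n/a)^H  =>  H = log(R/S) / log(n/a), with a = 1 here
    return math.log(r / s) / math.log(n / a)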
2) Winters Exponential Smoothing
To take the effects of seasonality and trend into consideration, Winters exponential smoothing is used to preliminarily forecast the quantity of sales. For the time-series data, Winters exponential smoothing is used to preprocess all the historical data and to predict the production demand, which is entered into the proposed hybrid model as input variable K4 (see Table 1). Similar to previous research, we assume $\alpha = 0.1$, $\beta = 0.1$ and $\gamma = 0.9$. The data generating process is assumed to be of the form
$\bar{S}_t = \alpha \frac{x_t}{C_{t-N}} + (1 - \alpha)(\bar{S}_{t-1} + T_{t-1})$,
where $C_t$ is the seasonal factor, $\bar{S}_t$ is the exponentially smoothed level of the process at the end of period t, $x_t$ is the actual monthly sales in period t, N is the number of periods in the season (N = 12 months), $T_{t-1}$ is the trend for period t-1, and $\alpha$ is the smoothing constant for $\bar{S}_t$.
The seasonal factor $C_t$ is updated as
$C_t = \gamma \frac{x_t}{\bar{S}_t} + (1 - \gamma) C_{t-N}$,
where $\gamma$ is the smoothing constant for $C_t$. The trend component is updated as
$T_t = \beta (\bar{S}_t - \bar{S}_{t-1}) + (1 - \beta) T_{t-1}$,
where $\beta$ is the smoothing constant for $T_t$. Winters' forecasting model is then constructed by
$\hat{x}_{t+1} = (\bar{S}_t + T_t)\, C_{t+1-N}$,
where $\hat{x}_{t+1}$ is the estimate for time period t+1.
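The following minimal Python sketch illustrates multiplicative Winters smoothing with the stated constants (alpha = beta = 0.1, gamma = 0.9, N = 12); the simple initialization of the level and seasonal factors is an assumption for illustration, not the authors' exact procedure.

def winters(x, alpha=0.1, beta=0.1, gamma=0.9, season=12):
    # initialize the level from the first season, a flat trend,
    # and ratio-to-level seasonal factors (a common simplification)
    level = sum(x[:season]) / season
    trend = 0.0
    c = [xi / level for xi in x[:season]]
    fitted = []
    for t in range(season, len(x)):
        ct = c[t - season]
        fitted.append((level + trend) * ct)           # one-step-ahead forecast
        new_level = alpha * x[t] / ct + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        c.append(gamma * x[t] / new_level + (1 - gamma) * ct)
        level = new_level
    return fitted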
C. FCBPN forecasting stage
FCBPN [30] (fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate) is used to forecast future packaging industry sales. As shown in Figure 2, FCBPN is composed of two steps: (1) using the fuzzy c-means clustering method (within a clusters-membership fuzzy system, CMFS), the cluster membership levels of each normalized data record are extracted; (2) each cluster is fed into a parallel BP network whose learning rate is adapted according to the cluster membership level of the training data records.
1) Extract membership levels to each cluster (CMFS)
Using the fuzzy c-means clustering method (within an adapted fuzzy system, CMFS), the cluster centers of the normalized data records are found, and consequently the cluster membership levels of each normalized data record can be extracted.
1.1) Data normalization
The input values (K1, K2, K3, K4) are scaled into the interval [0.1, 0.9] to suit the neural networks. The normalization equation is
$N_i = 0.1 + 0.8 \times \frac{K_i - \min(K_i)}{\max(K_i) - \min(K_i)}$,
where $K_i$ denotes a key variable, $N_i$ the normalized input (see Table 1), and $\max(K_i)$ and $\min(K_i)$ the maximum and minimum of the key variable, respectively.
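As a minimal sketch, the scaling can be written in Python as follows (function name illustrative):

def normalize(values, lo=0.1, hi=0.9):
    # map each K_i value linearly into [0.1, 0.9]
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]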
1.2) Fuzzy c-means clustering
In hard clustering, data are divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy c-means (FCM) (developed by Dunn 1973 [14] and improved by Bezdek 1981 [1]), data elements can belong to more than one cluster, and associated with each element is a set of membership levels. FCM is based on the minimization of the following objective function:
$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^m \lVert x_i - c_j \rVert^2$,
where $u_{ij}$ is the degree of membership of $x_i$ in cluster j, $x_i$ is the i-th measured data point and $c_j$ is the center of the j-th cluster. The algorithm is composed of the following steps:
Step 1: Randomly initialize the membership matrix $U^{(k)} = [u_{ij}]$.
Step 2: Calculate the centroid of each cluster, $C^{(k)} = [c_j]$, from $U^{(k)}$:
$c_j = \frac{\sum_{i=1}^{N} u_{ij}^m x_i}{\sum_{i=1}^{N} u_{ij}^m}$.
Step 3: For each point, update its membership coefficients ($U^{(k)} \to U^{(k+1)}$):
$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \lVert x_i - c_j \rVert / \lVert x_i - c_k \rVert \right)^{2/(m-1)}}$.
Step 4: If $\lVert U^{(k+1)} - U^{(k)} \rVert < \varepsilon$, then STOP; otherwise return to step 2.
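A minimal Python sketch of this loop is given below (illustrative, not the authors' code; data records are lists of normalized features, and the small constant added to the distances guards against division by zero):

import random

def fcm(data, n_clusters, m=2.0, eps=0.5, max_iter=100):
    n, dim = len(data), len(data[0])
    # Step 1: random membership matrix, rows normalized to sum to 1
    u = [[random.random() for _ in range(n_clusters)] for _ in range(n)]
    u = [[v / sum(row) for v in row] for row in u]
    for _ in range(max_iter):
        # Step 2: centroids c_j = sum_i(u_ij^m x_i) / sum_i(u_ij^m)
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            centers.append([sum(w[i] * data[i][d] for i in range(n)) / sum(w)
                            for d in range(dim)])
        dist = [[sum((data[i][d] - centers[j][d]) ** 2
                     for d in range(dim)) ** 0.5 + 1e-12
                 for j in range(n_clusters)] for i in range(n)]
        # Step 3: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        new_u = [[1.0 / sum((dist[i][j] / dist[i][k]) ** (2.0 / (m - 1.0))
                            for k in range(n_clusters))
                  for j in range(n_clusters)] for i in range(n)]
        # Step 4: stop when the membership matrix changes by less than eps
        if max(abs(new_u[i][j] - u[i][j])
               for i in range(n) for j in range(n_clusters)) < eps:
            return centers, new_u
        u = new_u
    return centers, u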
This procedure converges to a local minimum or a saddle point of $J_m$. According to Bezdek [1], the appropriate combination of the two parameters (m and $\varepsilon$) is m = 2 and $\varepsilon$ = 0.5.
Using fuzzy c-means, Table 2 shows that using four clusters is the best among the different clustering numbers tested.

Clustering groups       FCM total distance
Clustering 2 groups     4.5234
Clustering 3 groups     2.2122
Clustering 4 groups     1.8434

Table 2: Comparison of the total distance for different numbers of clusters.
1.3) The degree of membership levels (MLC_k)
In this stage, we use the sigmoid function (Figure 3) to improve the precision of the results and to accelerate the training of the neural networks. The advanced fuzzy distance between a data record $X_i$ and a cluster center $c_k$, denoted $AFD_k(X_i)$, is obtained by applying the sigmoid function to the distance between $X_i$ and $c_k$.
The membership level $MLC_k(X_i)$ of a record $X_i$ in the k-th cluster is inversely related to the distance $AFD_k(X_i)$ from the record $X_i$ to the cluster center $c_k$.
Figure 3: Sigmoid function, a = 50 and c = 0.5.
The clusters-membership fuzzy system (CMFS) returns the membership levels of a data record X in each cluster:
$CMFS(X) = (MLC_1(X), MLC_2(X), MLC_3(X), MLC_4(X))$.
Thus, we can construct a new training sample $(X_i, MLC_1(X_i), MLC_2(X_i), MLC_3(X_i), MLC_4(X_i))$ for the adaptive neural network evaluation stage (Figure 2).
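The following minimal Python sketch illustrates the CMFS computation; the exact way the sigmoid (a = 50, c = 0.5) is composed with the Euclidean distance, and the normalization of the membership levels, are assumptions made for illustration.

import math

def sigmoid(d, a=50.0, c=0.5):
    return 1.0 / (1.0 + math.exp(-a * (d - c)))

def memberships(x, centers):
    # advanced fuzzy distance AFD_k(x): sigmoid-shaped distance to center k
    dist = [sum((xi - ci) ** 2 for xi, ci in zip(x, ck)) ** 0.5
            for ck in centers]
    afd = [sigmoid(d) for d in dist]
    # MLC_k(x) is inversely related to AFD_k(x); normalized to sum to 1
    inv = [1.0 / (v + 1e-12) for v in afd]
    s = sum(inv)
    return [v / s for v in inv]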
2) Adaptive neural networks evaluating stage
The artificial neural network (ANN) concept originates from biological science (neurons in an organism). Its components are connected according to some pattern of connectivity and associated with different weights; the weight of a neural connection is updated by learning. ANNs possess the ability to identify nonlinear patterns by learning from the data set. The back-propagation (BP) training algorithms are probably the most popular ones. The structure of a BP neural network consists of an input layer, a hidden layer and an output layer, containing I, J and L nodes, respectively. $w_{ij}$ denotes the numerical weights between the input and hidden layers, and likewise $w_{jl}$ denotes those between the hidden and output layers, as shown in Figure 4.
In this stage, we propose an adaptive neural network evaluation system that consists of four neural networks, where each cluster k is associated with the k-th BP network. For each cluster, the training sample is fed into a parallel back-propagation network (BPN) with a learning rate adapted according to the cluster membership level ($MLC_k$) of each record of the training data set. The structure of the proposed system is shown in Figure 2.
FIGURE 4: The structure of the back-propagation neural network.
The adaptive neural network learning algorithm is composed of two procedures: (a) a feed-forward step and (b) a back-propagation weight training step. These two procedures are explained in detail as follows:
Step 1: All BP networks are initialized with the same random weights.
Step 2: Feed forward.
For each $BPN_k$ (associated with the k-th cluster), we assume that each input factor in the input layer is denoted by $x_i$, while $y_j^k$ and $o_l^k$ represent the outputs of the hidden layer and the output layer, respectively. $y_j^k$ and $o_l^k$ can be expressed as follows:
$X_j^k = \sum_{i=1}^{I} w_{ij}^k x_i - w_{oj}^k, \quad y_j^k = f(X_j^k)$;
$Y_l^k = \sum_{j=1}^{J} w_{jl}^k y_j^k - w_{ol}^k, \quad o_l^k = f(Y_l^k)$,
where $w_{oj}^k$ and $w_{ol}^k$ are the bias weights for setting threshold values, f is the activation function used in both the hidden and output layers, and $X_j^k$ and $Y_l^k$ are the intermediate results computed before applying the activation function f.
In this study, a sigmoid function (or logistic function) is selected as the activation function. Therefore, the actual outputs $y_j^k$ and $o_l^k$ in the hidden and output layers, respectively, can also be written as
$y_j^k = \frac{1}{1 + e^{-X_j^k}}, \quad o_l^k = \frac{1}{1 + e^{-Y_l^k}}$.
The activation function f introduces a nonlinear effect into the network and maps the result of the computation into the domain (0, 1). In our case, the sigmoid function is used as the activation function.
The global output of the adaptive neural networks is calculated as
$o_l = \frac{\sum_{k} MLC_k(X_i)\, o_l^k}{\sum_{k} MLC_k(X_i)}$.
As shown above, the effect of the output $o_l^k$ on the global output $o_l$ is strongly and positively related to the membership level $MLC_k$ of the data record $X_i$ in the k-th cluster.
Step 3: Back-propagation weight training. The error function is defined as
$E = \frac{1}{2} \sum_{k} e_k^2 = \frac{1}{2} \sum_{k} (t_k - o_k)^2$,
where $t_k$ is a predefined network output (the desired output or target value) and $e_k$ is the error at each output node. The goal is to minimize E so that the weight of each link is adjusted accordingly and the final output can match the desired output. The learning speed can be improved by introducing a momentum term $\alpha$, which usually falls in the range [0, 1]. For iteration n and for $BPN_k$ (associated with the k-th cluster), the adaptive learning rate of $BPN_k$ and the variation of the weights $\Delta w^k$ can be expressed as
$\eta_k = \eta \times MLC_k(X_j)$,
$\Delta w^k(n) = -\eta_k \frac{\partial E}{\partial w^k} + \alpha\, \Delta w^k(n-1)$.
As shown above, we can conclude that the variation of the $BPN_k$ network weights ($w_{oj}^k$ and $w_{ol}^k$) is larger when the membership level $MLC_k$ of the data record $X_j$ in the k-th cluster is high. If the membership level $MLC_k$ of the data record $X_j$ in the k-th cluster is close to zero, then the changes in the $BPN_k$ network weights are minimal.
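The sketch below illustrates the adaptive learning rate idea for one cluster's network: a 4-2-1 sigmoid BP network whose effective learning rate for each record is the base rate scaled by the record's membership level MLC_k (biases and the momentum term are omitted for brevity; this is an illustration, not the authors' implementation).

import math, random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

class ClusterBPN:
    def __init__(self, n_in=4, n_hid=2, base_lr=0.1):
        rnd = lambda: random.uniform(-0.5, 0.5)
        self.w1 = [[rnd() for _ in range(n_in)] for _ in range(n_hid)]
        self.w2 = [rnd() for _ in range(n_hid)]
        self.lr = base_lr

    def forward(self, x):
        self.h = [logistic(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        self.o = logistic(sum(w * h for w, h in zip(self.w2, self.h)))
        return self.o

    def train_step(self, x, target, mlc):
        o = self.forward(x)
        lr = self.lr * mlc                  # adaptive rate: scaled by MLC_k
        delta_o = (target - o) * o * (1 - o)
        for j, h in enumerate(self.h):
            delta_h = delta_o * self.w2[j] * h * (1 - h)
            self.w2[j] += lr * delta_o * h
            for i, xi in enumerate(x):
                self.w1[j][i] += lr * delta_h * xi

# The global output is the MLC-weighted combination of the four networks:
# o = sum(m * net.forward(x) for m, net in zip(mlc, nets)) / sum(mlc)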
The configuration of the proposed BPN is established as follows:
Number of neurons in the input layer: I = 4
Number of neurons in the output layer: L = 1
Single hidden layer
Number of neurons in the hidden layer: J = 2
Network learning rule: delta rule
Transformation function: sigmoid function
Learning rate: $\eta$ = 0.1
Momentum constant: $\alpha$ = 0.02
Learning iterations: 20,000
IV. EXPERIMENTAL RESULTS AND ANALYSIS
1) Constructing the DELPHI-FCBPN system
The test data used in this study were collected from a sales forecasting case study of a packaging manufacturer in Tangier. The training samples cover January 2001 to December 2008, while the testing samples cover January 2009 to December 2009. The proposed DELPHI-FCBPN system was applied to this case to forecast the sales. The results are presented in Table 3.
Month Actual values Forecasted values
2009/1 5322 5176
2009/2 3796 3766
2009/3 3696 3700
2009/4 5632 5592
2009/5 6126 6097
2009/6 5722 5766
2009/7 6040 6140
2009/8 6084 6084
2009/9 7188 7088
2009/10 7079 7079
2009/11 7287 7376
2009/12 7461 7532
Table3: The forecasted results by DELPHI-FCBPN method.
Figure 5: The MAPE of DELPHI-FCBPN.
2) Comparison of the FCBPN model with other previous models
Experimental comparison of the outputs of DELPHI-FCBPN with the other methods shows that the proposed model outperforms the previous approaches (Tables 4-6). We apply two performance measures, the mean absolute percentage error (MAPE) and the root mean square error (RMSE), to compare the FCBPN model with the previous methods: BPN, WES and FNN.
Month Actual values BPN forecasts
2009/1 5408 5437
2009/2 4089 3512
2009/3 3889 3880
2009/4 5782 5865
2009/5 6548 6064
2009/6 5660 6403
2009/7 6032 6181
2009/8 6312 6329
2009/9 6973 7159
2009/10 6941 7427
2009/11 7174 7342
2009/12 7601 8100
Table4: The forecasted results by BPN method
Figure 6: The MAPE of BPN.
Month Actual values Forecasted values
2009/1 5408 5399
2009/2 4089 3774
2009/3 3889 3722
2009/4 5782 5836
2009/5 6548 6252
2009/6 5660 6412
2009/7 6032 6658
2009/8 6312 7244
2009/9 6973 7233
2009/10 6941 7599
2009/11 7174 7543
2009/12 7601 8022
Table5: The forecasted results by Winters method.
Figure 7: The MAPE of WES
Month Actual values Forecasted values
2009/1 5408 5131
2009/2 4089 3562
2009/3 3889 3589
2009/4 5782 5651
2009/5 6548 6504
2009/6 5660 5905
2009/7 6032 6066
2009/8 6312 6123
2009/9 6973 6968
2009/10 6941 7489
2009/11 7174 7365
2009/12 7601 7768
Table6: The forecasted results by FNN method.
Figure 8: The MAPE of FNN.
Figure 9: Summary of MAPE and RMSE values of the prediction methods Delphi-FCBPN, WES, BPN and FNN.
Method          MAPE    RMSE
Delphi-FCBPN    3.49    221
WES             6.66    488
BPN             4.85    376
FNN             4.11    278

Table 7: Summary of MAPE and RMSE values of the prediction methods Delphi-FCBPN, WES, BPN and FNN.
The two measures are defined as
$MAPE = \frac{100}{N} \sum_{t=1}^{N} \frac{|Y_t - P_t|}{Y_t}, \quad RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (Y_t - P_t)^2}$,
where $P_t$ is the predicted value for period t, $Y_t$ is the actual value for period t, and N is the number of periods.
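As a minimal sketch, the two measures can be computed in Python as follows:

import math

def mape(actual, pred):
    return 100.0 / len(actual) * sum(abs(y - p) / y
                                     for y, p in zip(actual, pred))

def rmse(actual, pred):
    return math.sqrt(sum((y - p) ** 2
                         for y, p in zip(actual, pred)) / len(actual))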
As shown in Table 7, the forecasting accuracy of Delphi-FCBPN is superior to that of the other traditional approaches on both the MAPE and RMSE evaluations.
V. CONCLUSION
Recently, more and more researchers and industrial practitioners have become interested in applying fuzzy theory and neural networks to their routine problem solving. This research combines fuzzy theory and a back-propagation network into a hybrid system applied to sales forecasting in the packaging industry: a hybrid system based on the Delphi method, fuzzy clustering and back-propagation neural networks with an adaptive learning rate (FCBPN).
We applied DELPHI-FCBPN to a sales forecasting case study of a packaging manufacturer in Tangier. The experimental results in Section 4 demonstrated that the forecasting accuracy of DELPHI-FCBPN is superior to that of the previous and traditional approaches WES, BPN and FNN on both the MAPE and RMSE evaluations.
REFERENCES
[1] Bezdek, J. C. (1981): "Pattern Recognition with Fuzzy Objective Function Algorithms", Plenum Press, New York.
[2] Casillas, J., Cordón, O., Herrera, F., & Villar, P. (2004). A hybrid learning process for the knowledge base of a fuzzy rule-based system. In Proceedings of the 2004 International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Perugia, Italy (pp. 2189–2196).
[3] Chang, P.-C., & Lai, C.-Y. (2005). A hybrid system combining self-organizing maps with case-based reasoning in wholesaler's new-release book forecasting. Expert Systems with Applications, 29(1), 183–192.
[4] Chang, P.-C., & Liu, C.-H. (2008). A TSK type fuzzy rule based system for stock price prediction. Expert Systems with Applications, 34, 135–144.
[5] Chang, P.-C., Liu, C.-H., & Fan, C. Y. (2009). Data clustering and fuzzy neural network for sales forecasting: A case study in printed circuit board industry. Knowledge-Based Systems, 22(5), 344–355.
[6] Chang, P.-C., Liu, C.-H., & Wang, Y.-W. (2006). A hybrid model by clustering and evolving fuzzy rules for sales decision supports in printed circuit board industry. Decision Support Systems, 42(3), 1254–1269.
[7] Chang, P.-C., & Wang, Y.-W. (2006). Fuzzy Delphi and back-propagation model for sales forecasting in PCB industry. Expert Systems with Applications, 30(4).
[8] Chang, P.-C., Wang, Y.-W., & Liu, C.-H. (2007). The development of a weighted evolving fuzzy neural network for PCB sales forecasting. Expert Systems with Applications, 32(1), 86–96.
[9] Chang, P.-C., Wang, Y.-W., & Tsai, C.-Y. (2005). Evolving neural network for printed circuit board sales forecasting. Expert Systems with Applications, 29, 83–92.
[10] Chang, P.-C., Wang, Y.-W., & Liu, C.-H. (2007). The development of a weighted evolving fuzzy neural network for PCB sales forecasting. Expert Systems with Applications, 32(1), 86–96.
[11] Chang, Pei-Chann, Wang, Di-di, & Zhou, Changle (2011). A novel model by evolving partially connected neural network for stock price trend forecasting. Expert Systems with Applications, Volume 39, Issue 1, January 2012.
[12] Cordón, O., & Herrera, F. (1997). A three-stage evolutionary process for learning descriptive and approximate fuzzy logic controller knowledge bases from examples. International Journal of Approximate Reasoning, 17(4), 369–407.
[13] Cordón, O., Herrera, F., Hoffmann, F., & Magdalena, L. (2001). Genetic fuzzy systems: Evolutionary tuning and learning of fuzzy knowledge bases. Singapore: World Scientific.
[14] Dunn, J. C. (1973): "A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters", Journal of Cybernetics 3: 32–57.
[15] Eberhart, R. C., and Kennedy, J. (1995). A new optimizer using particle swarm theory. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 39–43. Piscataway, NJ: IEEE Service Center.
[16] Efendigil, T., Önüt, S., & Kahraman, C. (2009). A decision support system for demand forecasting with artificial neural networks and neuro-fuzzy models: A comparative analysis. Expert Systems with Applications, 36(3), 6697–6707.
[17] Esmin, A. (2007). Generating Fuzzy Rules from Examples Using the Particle Swarm Optimization Algorithm. Hybrid Intelligent Systems, 2007 (HIS 2007), 7th International Conference, IEEE.
[18] Hadavandi, Esmaeil, Shavandi, Hassan, & Ghanbari, Arash (2011). An improved sales forecasting approach by the integration of genetic fuzzy systems and data clustering: Case study of printed circuit board. Expert Systems with Applications 38 (2011), 9392–9399.
[19] Goldberg, D. A. (1989). Genetic algorithms in search, optimization, and
machine learning. Reading, MA: Addison-Wesley.
[20] Kuo, R. J., & Chen, J. A. (2004). A decision support system for order selection in electronic commerce based on fuzzy neural network supported by real-coded genetic algorithm. Expert Systems with Applications, 26(2), 141–154.
[21] Kuo, R. J., & Xue, K. C. (1999). Fuzzy neural networks with application to sales forecasting. Fuzzy Sets and Systems, 108, 123–143.
[22] KyongJoo Oh, Ingoo Han (2001). An intelligent clustering forecasting system based on change-point detection and artificial neural networks: application to financial economics. System Sciences, 2001. Proceedings of the 34th Annual Hawaii International Conference on System Sciences.
[23] Martínez-López, F., & Casillas, J. (2009). Marketing intelligent systems for consumer behavior modelling by a descriptive induction approach based on genetic fuzzy systems. Industrial Marketing Management.
[24] Orriols-Puig, A., Casillas, J., & Martínez-López, F. (2009). Unsupervised learning of fuzzy association rules for consumer behavior modeling. Mathware and Soft Computing, 16(1), 29–43.
[25] Saeid Iranmanesh, M. Amin Mahdavi (2009). A Differential Adaptive Learning Rate Method for Back-Propagation Neural Networks. World Academy of Science, Engineering and Technology 50, 2009.
[26] Sivanandam, S. N., & Visalakshi, P.: Dynamic task scheduling with load balancing using parallel orthogonal particle swarm optimisation. IJBIC 1(4): 276–286 (2009).
[27] Toly Chen (2003). A fuzzy back propagation network for output time prediction in a wafer fab. Applied Soft Computing 2/3F (2003), 211–222.
[28] Zhang, Wang & Chang (2009). A Model on Forecasting Safety Stock of ERP Based on BP Neural Network. Proceedings of the 2008 IEEE ICMIT.
[29] Zhang, Haifeng and Huang (2010). Sales Forecasting Based on ERP System through BP Neural Networks. ISICA'10: Proceedings of the 5th International Conference on Advances in Computation and Intelligence.
[29] Chang, I-C., Hwang, H-G., Hung, M-C., Chen, S-L., Yen, D., "A Neural Network Evaluation Model for ERP Performance from SCM Perspective to Enhance Enterprise Competitive Advantage", Expert Systems with Applications, 2007.
[30] Attariuas, Bouhorma and El Fallahi, "An improved approach based on fuzzy clustering and Back-Propagation Neural Networks with adaptive learning rate for sales forecasting: Case study of PCB industry", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 3, No. 1, May 2012. ISSN (Online): 1694-0814.
AUTHORS PROFILE
Attariuas Hicham received the computer engineer degree in 2009 from ENSAIS, the national school of computer science and systems analysis in Rabat, Morocco. Currently, he is a PhD student in Computer Science. Current research interests: fuzzy systems, intelligent systems, back-propagation networks, genetic intelligent systems.
Bouhorma Mohammed received the PhD degree in Telecommunications and Computer Engineering. He is a Professor of Telecommunications and Computer Engineering at Abdelmalek Essaadi University. He has been a member of the organizing and scientific committees of several symposia and conferences dealing with intelligent systems, mobile networks and telecommunications technologies.
Sofi Anas received the Master degree in Telecommunications from Abdelmalek Essaadi University. He is a PhD student in Telecommunications and intelligent systems. Current research interests: mobile networks, telecommunications technologies and intelligent systems.
Continuous Bangla Speech Segmentation using
Short-term Speech Features Extraction Approaches
Md. Mijanur Rahman
Dept. of Computer Science & Engineering,
Jatiya Kabi Kazi Nazrul Islam University
Trishal, Mymensingh, Bangladesh.
Md. Al-Amin Bhuiyan
Dept. of Computer Science & Engineering,
Jahangirnagar University
Savar, Dhaka, Bangladesh.
Abstract—This paper presents simple and novel feature extraction approaches for segmenting continuous Bangla speech sentences into words/sub-words. These methods are based on two kinds of simple speech features, namely time-domain and frequency-domain features. The time-domain features, such as short-time signal energy and short-time average zero-crossing rate, and the frequency-domain features, such as spectral centroid and spectral flux, are extracted in this research work. After the feature sequences are extracted, a simple dynamic thresholding criterion is applied to detect the word boundaries and label the entire speech sentence as a sequence of words/sub-words. All the algorithms used in this research are implemented in Matlab, and the implemented automatic speech segmentation system achieved a segmentation accuracy of 96%.
Keywords-Speech Segmentation; Features Extraction; Short-time
Energy; Spectral Centroid; Dynamic Thresholding.
I. INTRODUCTION
Automated segmentation of speech signals has been under research for over 30 years [1]. Speech recognition systems require segmentation of the speech waveform into fundamental acoustic units [2]. Segmentation is the process of decomposing the speech signal into smaller units, and it is the very first step in any voice-activated system, such as speech recognition and speech synthesis systems. Speech segmentation has been done using wavelets [3], fuzzy methods [4], artificial neural networks [5] and Hidden Markov Models [6], but the results were found still not to meet expectations. To obtain more accurate results, groups of several features have been used [7, 8, 9, 10]. This paper is a continuation of feature extraction research for speech segmentation. The method implemented here is a very simple example of how the detection of speech segments can be achieved.
This paper is organized as follows: Section 2 describes techniques for segmenting the speech signal. Section 3 describes different short-term speech features. Section 4 discusses the methodological steps of the proposed system. Sections 5 and 6 describe the experimental results and the conclusion, respectively.
II. SPEECH SEGMENTATION
Automatic speech segmentation is a necessary step used in speech recognition and synthesis systems. Speech segmentation is the breaking of continuous streams of sound into basic units, such as words, phonemes or syllables, that can be recognized. The general idea of segmentation can be described as dividing something continuous into discrete, non-overlapping entities [11]. Segmentation can also be used to distinguish different types of audio signals within large amounts of audio data, often referred to as audio classification [12].
Automatic speech segmentation methods can be classified in many ways, but one very common classification is the division into blind and aided segmentation algorithms. A central difference between aided and blind methods is how much the segmentation algorithm uses previously obtained data or external knowledge to process the expected speech. These two approaches are discussed in the following subsections.
A. Blind segmentation
The term blind segmentation refers to methods where there
is no pre-existing or external knowledge regarding linguistic
properties, such as orthography or the full phonetic annotation,
of the signal to be segmented. Blind segmentation is applied in
different applications, such as speaker verification systems,
speech recognition systems, language identification systems,
and speech corpus segmentation and labeling [13].
Due to the lack of external or top-down information, the first phase of blind segmentation relies entirely on the acoustical features present in the signal. The second phase, bottom-up processing, is usually built on a front-end parametrization of the speech signal, often using MFCCs, LP coefficients, or the pure FFT spectrum [14].
B. Aided segmentation
Aided segmentation algorithms use some sort of external linguistic knowledge of the speech stream to segment it into corresponding segments of the desired type. An orthographic or phonetic transcription is used as a parallel input with the speech, or the algorithm is trained with such data [15]. One of the most common methods in ASR for utilizing phonetic annotations is HMM-based systems [16]. HMM-based algorithms have dominated most speech recognition applications since the 1980s due to their so far superior recognition performance and relatively small computational complexity in the field of speech recognition [17].
III. FEATURE EXTRACTION FOR SPEECH SEGMENTATION
The segmentation method described here is a purely bottom-up blind speech segmentation algorithm. The general principle of the algorithm is to track the amplitude or spectral changes in the signal using short-time energy or spectral features, and to detect the segment boundaries at the locations where the amplitude or spectral changes exceed a minimum threshold level. Two types of features are used for segmenting the speech signal: time-domain signal features and frequency-domain signal features.
A. Time-Domain Signal Features
Time-domain features are widely used for speech segment extraction. These features are useful when an algorithm with a simple implementation and efficient calculation is needed. The most used features are short-time energy and short-time average zero-crossing rate.
1) Short-Time Signal Energy
Short-time energy is the principal and most natural feature that has been used. Physically, energy is a measure of how much signal there is at any one time. Energy is used to discover voiced sounds, which have higher energy than silence/unvoiced sounds, in continuous speech, as shown in Figure-1.
The energy of a signal is typically calculated on a short-time basis by windowing the signal at a particular time, squaring the samples and taking the average [18]. The square root of this result, the engineering quantity known as the root-mean-square (RMS) value, is also used. The short-time energy function of a speech frame of length N is defined as
$E_n = \frac{1}{N} \sum_{m=1}^{N} [x(m)\, w(n-m)]^2$  (1)
The short-term root-mean-square (RMS) energy of this frame is given by
$E_n^{(RMS)} = \sqrt{\frac{1}{N} \sum_{m=1}^{N} [x(m)\, w(n-m)]^2}$  (2)
where $x(m)$ is the discrete-time audio signal and $w(m)$ is the rectangular window function, given by the following equation:
$w(m) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$  (3)
2) Short-Time Average Zero-Crossing Rate
The average zero-crossing rate refers to the number of times the speech samples change algebraic sign in a given frame. The rate at which zero crossings occur is a simple measure of the frequency content of a signal: it measures the number of times in a given time interval/frame that the amplitude of the speech signal passes through a value of zero [19]. Unvoiced speech components normally have much higher ZCR values than voiced ones, as shown in Figure 2. The short-time average zero-crossing rate is defined as
$Z_n = \frac{1}{2} \sum_{m=1}^{N} \left| \mathrm{sgn}[x(m)] - \mathrm{sgn}[x(m-1)] \right| w(n-m)$  (4)
where
$\mathrm{sgn}[x(m)] = \begin{cases} 1, & x(m) \ge 0 \\ -1, & x(m) < 0 \end{cases}$  (5)
and $w(n)$ is a rectangular window of length N, as given in equation (3).
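A minimal Python sketch of the two time-domain feature sequences is given below (illustrative; non-overlapping rectangular frames are an assumption, and the per-sample scaling of the zero-crossing count is an added normalization, not part of equation (4)):

def sgn(v):
    return 1 if v >= 0 else -1

def short_time_features(x, frame_len):
    # frame-wise short-time energy (eq. 1) and zero-crossing rate (eq. 4)
    energies, zcrs = [], []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        frame = x[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
        crossings = sum(abs(sgn(frame[m]) - sgn(frame[m - 1]))
                        for m in range(1, frame_len))
        zcrs.append(crossings / (2.0 * frame_len))
    return energies, zcrs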
B. Frequency-Domain Signal Features
Most of the information in speech is concentrated in the 250 Hz - 6800 Hz frequency range [20]. To extract frequency-domain features, the discrete Fourier transform (which provides information about how much of each frequency is present in a signal) can be used; the Fourier representation of a signal shows its spectral composition. Widely used frequency-domain features are the spectral centroid and spectral flux feature sequences, both of which use the discrete Fourier transform.
Figure 1. Original signal and short-time energy curves of the speech sentence, .
Figure 2. Original signal and short-term average zero-crossing rate curves of speech sentence, .
1) Spectral Centroid
The spectral centroid is a measure used in digital signal processing to characterize a spectrum: it indicates where the "center of gravity" of the spectrum lies. This feature is a measure of the spectral position, with high values corresponding to brighter sounds [21], as shown in Figure-3. The spectral centroid $SC_i$ of the i-th frame is defined as the center of gravity of its spectrum and is given by the following equation:
$SC_i = \frac{\sum_{m=0}^{N-1} f(m)\, |X_i(m)|}{\sum_{m=0}^{N-1} |X_i(m)|}$  (6)
Here, $f(m)$ represents the center frequency of the m-th bin of an N-point spectrum, and $X_i(m)$ is the amplitude corresponding to that bin in the DFT spectrum. The DFT is given by the following equation and can be computed efficiently using a fast Fourier transform (FFT) algorithm [22]:
$X_k = \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi k n / N}, \quad k = 0, \ldots, N-1$  (7)
2) Spectral Flux
Spectral flux is a measure of how quickly the power spectrum of a signal changes (as shown in Figure 4). It is calculated by comparing the power spectrum of one frame against the power spectrum of the previous frame, i.e. the Euclidean distance between the two normalized spectra. The spectral flux can be used to determine the timbre of an audio signal, or in onset detection [23], among other things. The spectral flux $SF_i$ is given by
$SF_i = \sum_{k=1}^{N/2} \left( |X_i(k)| - |X_{i-1}(k)| \right)^2$  (8)
Here, $X_i(k)$ denotes the DFT coefficients of the i-th short-term frame of length N, as given in equation (7).
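The following minimal Python sketch computes the two frequency-domain features of equations (6)-(8); a direct DFT is used for clarity (an FFT would be used in practice), and mapping bin k to the center frequency k*fs/N is an assumption made for illustration.

import cmath

def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def spectral_centroid(frame, sample_rate):
    # eq. (6): magnitude-weighted mean of the bin center frequencies
    mag = [abs(c) for c in dft(frame)[:len(frame) // 2]]
    freqs = [k * sample_rate / len(frame) for k in range(len(mag))]
    return sum(f * m for f, m in zip(freqs, mag)) / (sum(mag) + 1e-12)

def spectral_flux(frame, prev_frame):
    # eq. (8): squared distance between the two normalized magnitude spectra
    a = [abs(c) for c in dft(frame)[:len(frame) // 2]]
    b = [abs(c) for c in dft(prev_frame)[:len(prev_frame) // 2]]
    sa, sb = sum(a) + 1e-12, sum(b) + 1e-12
    return sum((x / sa - y / sb) ** 2 for x, y in zip(a, b))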
C. Speech Segments Detection
After computing speech feature sequences, a simple
dynamic threshold-based algorithm is applied in order to
detect the speech word segments. The following steps are
included in this thresholding algorithm.
1. Get the feature sequences from the previous feature
extraction module.
2. Apply median filtering in the feature sequences.
3. Compute the mean (average) value of the smoothed feature sequences.
4. Compute the histogram of the smoothed feature
sequences.
5. Find the local maxima of histogram.
6. If at least two maxima, M_1 and M_2, have been found, then the threshold is
$$T = \frac{W \cdot M_1 + M_2}{W + 1} \qquad (9)$$
Otherwise, the threshold is
$$T = \frac{\text{Mean}}{2} \qquad (10)$$
where W is a user-defined weight parameter [24]. Large values of W obviously lead to threshold values closer to M_1; here, we used W = 100.
The above process is applied to both feature sequences, yielding two thresholds: T_1, based on the energy sequence, and T_2, based on the spectral centroid sequence. After computing
two thresholds, the speech word segments are formed by
successive frames for which the respective feature values are
larger than the computed thresholds (for both feature
sequences). Figure 5 shows both filtered feature sequence curves with their threshold values.
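A compact Python sketch of steps 2-6 is shown below; the median-filter kernel width, the histogram bin count, and the choice of the two tallest local maxima as M_1 and M_2 are assumptions for illustration, not values fixed by the paper.

```python
import numpy as np
from scipy.signal import medfilt, argrelextrema

def dynamic_threshold(feature_seq, W=100, nbins=50):
    """Steps 2-6: median-filter the feature sequence, build its
    histogram, find local maxima, and apply Eq. (9) or Eq. (10)."""
    smoothed = medfilt(feature_seq, kernel_size=5)      # step 2 (width assumed)
    hist, edges = np.histogram(smoothed, bins=nbins)    # step 4
    centers = 0.5 * (edges[:-1] + edges[1:])
    maxima = argrelextrema(hist, np.greater)[0]         # step 5
    if len(maxima) >= 2:
        top = maxima[np.argsort(hist[maxima])[::-1][:2]]  # two tallest maxima
        M1, M2 = centers[top[0]], centers[top[1]]
        return (W * M1 + M2) / (W + 1)                  # Eq. (9)
    return np.mean(smoothed) / 2                        # Eq. (10)
```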
Figure 3. Original signal and spectral centroid features of a speech sentence.
Figure 4. Original signal and spectral flux features of a speech sentence.
Figure 5. Original speech signal and median-filtered feature sequence curves, with threshold values (0.007684 and 0.003570), for a speech sentence.
Figure 6. Speech segments: (a) segment notation, where l_i is the length of segment i, d_{i,i+1} is the distance between two consecutive segments, and s_i and e_i are the start and end times of segment i; (b) detection of the start and end point of each segment in a speech sentence.
D. Post Processing of Detected Segments
As shown in Figure 6, detected speech segments are analyzed in a post-processing stage. Common segmentation errors are that short segments are usually noise/silence, and that two segments with a short space between them may belong to the same segment. Post-processing with a rule base can fix these and similar mistakes. Waheed [25] proposed two rules:
1. If l_i < min Length and d_{i,i+1} > min Space, then segment i is discarded; similarly, if l_{i+1} < min Length and d_{i,i+1} > min Space, then segment i+1 is discarded.
2. If l_i or l_{i+1} > min Length, d_{i,i+1} < min Space, and l_i + l_{i+1} > FL, then the two segments are merged, and anything between the two segments that was previously left out is made part of the speech.
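The two rules can be sketched as follows; this simplifies rule 1 by checking only the gap to the following segment, and min_len, min_space, and FL are user-chosen constants not specified in the paper.

```python
def post_process(segments, min_len, min_space, FL):
    """Sketch of the two rules on a sorted list of (start, end) segments."""
    # Rule 1: discard short, isolated segments (forward gap only).
    kept = []
    for i, (s, e) in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        space = (nxt[0] - e) if nxt else float("inf")
        if (e - s) < min_len and space > min_space:
            continue
        kept.append((s, e))
    # Rule 2: merge close neighbors whose combined length exceeds FL;
    # the gap between them becomes part of the speech.
    merged = kept[:1]
    for s, e in kept[1:]:
        ps, pe = merged[-1]
        if (s - pe) < min_space and max(pe - ps, e - s) > min_len \
                and (pe - ps) + (e - s) > FL:
            merged[-1] = (ps, e)
        else:
            merged.append((s, e))
    return merged
```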
IV. IMPLEMENTATION
The automatic speech segmentation system has been
implemented in a Windows environment using the MATLAB toolkit. The
proposed speech segmentation system has six major steps, as
shown in Figure 7.
A. Speech Acquisition
B. Signal Preprocessing
C. Speech Features Extraction
D. Histogram Computation
E. Dynamic thresholding and
F. Post-Processing
A. Speech Acquisition
Speech acquisition is the capture of continuous Bangla speech sentences through a microphone; speech recording is the first step of the implementation. Recordings were made by a native male Bangla speaker at a sampling frequency of 16 kHz, with a sample size of 8 bits and a single (mono) channel.
B. Signal Preprocessing
This step includes elimination of background noise, framing, and windowing. Background noise is removed from the data so that only speech samples are input to further processing. The continuous speech signal is separated into a number of segments called frames (framing). After pre-emphasis, the filtered samples are converted into frames with a frame size of 50 msec, each frame overlapping its neighbor by 10 msec. To reduce the edge effect of each frame segment, windowing is done: the window, w(n), determines the portion of the speech signal to be processed by zeroing out the signal outside the region of interest. A rectangular window has been used.
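For illustration, the pre-emphasis and framing described above can be sketched in Python; the pre-emphasis coefficient 0.97 is an assumed typical value, as the paper does not state one, and the signal is assumed to be at least one frame long.

```python
import numpy as np

def preprocess(x, fs=16000, frame_ms=50, overlap_ms=10, alpha=0.97):
    """Pre-emphasis followed by framing with a rectangular window."""
    y = np.append(x[0], x[1:] - alpha * x[:-1])    # y[n] = x[n] - a*x[n-1]
    frame_len = int(fs * frame_ms / 1000)          # 50 ms -> 800 samples
    hop = frame_len - int(fs * overlap_ms / 1000)  # 10 ms overlap
    n_frames = 1 + (len(y) - frame_len) // hop
    return np.stack([y[i * hop:i * hop + frame_len] for i in range(n_frames)])
```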
C. Short-term Feature Extraction
After windowing, we computed the short-term energy features and spectral centroid features of each frame of the speech signal. These features are discussed in detail in Section III. In this step, median filtering of these feature sequences is also performed.
D. Histogram Computation
Histograms of both smoothed feature sequences are
computed in order to find the local maxima of the histogram,
from which the threshold values are calculated.
E. Dynamic Thresholding
Dynamic thresholding is applied to both feature sequences, yielding two thresholds, T_1 and T_2, based on the energy sequences and the spectral centroid sequences, respectively.
After computing two thresholds, the speech word segments
are formed by successive frames for which the respective
feature values are larger than the computed thresholds.
F. Post-Processing
In order to segment words/sub-words, the detected speech segments are lengthened by 5 short-term windows (each window of 50 msec) on both sides in the post-processing step. Two segments with a short space between them are merged to obtain the final speech segments. These segmented speech words are saved in *.wav file format for further use.
Figure 7. Block diagram of the proposed automatic speech segmentation system: continuous Bangla sentences → signal preprocessing (pre-emphasis, framing and windowing) → speech feature extraction (short-time energy and spectral centroid features computation) → histogram computation (finding local maxima) → dynamic thresholding (speech segment detection) → post-processing (speech word segmentation) → segmented speech words.
Figure 8. The segmentation results for a speech sentence containing five speech words (Word1-Word5). The first subfigure shows the original signal. The second subfigure shows the sequence of the signal's energy. The third subfigure presents the spectral centroid sequence. In both cases, the respective thresholds (0.005141 and 0.007387) are also shown. The final subfigure presents the segmented words in dashed circles.
TABLE 1. SEGMENTATION RESULTS
Sentence ID | No. of word segments expected | No. of words properly segmented by system | Success rate (%) | Failure rate (%)
S1 | 6 | 6 | 100 | 0
S2 | 10 | 10 | 100 | 0
S3 | 9 | 9 | 100 | 0
S4 | 5 | 4 | 80 | 20
S5 | 10 | 10 | 100 | 0
S6 | 7 | 7 | 100 | 0
S7 | 8 | 8 | 100 | 0
S8 | 11 | 10 | 90.9 | 9.1
S9 | 9 | 8 | 88.89 | 11.11
S10 | 5 | 5 | 100 | 0
Total | 80 | 77 | 96.25 | 3.75
V. EXPERIMENTAL RESULTS
In order to evaluate the performance of the proposed system, different experiments were carried out. All the techniques and algorithms discussed in this paper have been implemented in MATLAB version 7.12.0. In this experiment,
various speech sentences in Bangla language have been
recorded, analyzed and segmented by using time-domain and
frequency-domain features with dynamic thresholding
technique. Figure 8 shows the filtered short-time energy and spectral centroid features of a Bangla speech sentence, where the boundaries of the words are marked automatically by the system. Table 1 shows the detailed segmentation results for ten speech sentences and reveals that the average segmentation accuracy rate is 96.25%, which is quite satisfactory.
VI. CONCLUSION AND FURTHER RESEARCH
We have presented a simple speech feature extraction approach for segmenting continuous speech into words/sub-words in a simple and efficient way. The short-term speech features were selected for several reasons. First, short-time energy provides a basis for distinguishing voiced speech components from unvoiced ones: if the level of background noise is not very high, the energy of the voiced segments is larger than the energy of the silent or unvoiced segments. Second, if unvoiced segments simply contain environmental sounds, the spectral centroid of the voiced segments is again larger. Third, the change pattern of these features over time may reveal the rhythm and periodicity of the underlying sound.
From the experiments, it was observed that some of the words were not segmented properly. This is due to different causes: (i) the utterance of words and sub-words differs depending on their position in the sentence, (ii) the pauses between words or sub-words are not identical in all cases because of the variability of the speech signals, and (iii) the non-uniform articulation of speech. Also, the speech signal is very sensitive to the speaker's properties, such as age, sex, and emotion. The proposed approach shows good results in speech segmentation, achieving about 96% segmentation accuracy. This reduces the memory requirement and computational time of any speech recognition system.
The major goal of future research is to search for possible
mechanisms that can be employed to enable top-down
feedback and ultimately pattern discovery by learning. To design a more reliable system, future systems should employ linguistic knowledge (syntactic or semantic) and more powerful recognition approaches such as Gaussian Mixture Models (GMMs), Time-Delay Neural Networks (TDNNs), Hidden Markov Models (HMMs), fuzzy logic, and so on.
REFERENCES
[1] Okko Rasanen, Speech Segmentation and Clustering Methods for a
New Speech Recognition Architecture, M.Sc Thesis, Department of
Electrical and Communications Engineering, Laboratory of Acoustics
and Audio Signal Processing, Helsinki University of Technology,
Espoo, November 2007.
[2] R. Thangarajan, A. M. Natarajan, M. Selvam , Syllable modeling in
continuous speech recognition for Tamil language, International
Journal of Speech Technology , vol. 12, no. 1, pp. 47-57, 2009.
[3] Hioka Y and Namada N, Voice activity detection with array signal
processing in the wavelet domain, IEICE TRANSACTIONS on
Fundamentals of Electronics, Communications and Computer Sciences,
86(11):2802-2811, 2003.
[4] Beritelli F and Casale S, Robust voiced/unvoiced classification using
fuzzy rules, In 1997 IEEE workshop on speech coding for
telecommunications proceeding, pages5-6, 1997.
[5] Qi Y and Hunt B, Voiced-unvoiced-silence classification of speech
using hybrid features and a network classifier, IEEE Transactions on
Speech and Audio Processing, I(2):250-255, 1993.
[6] Basu S, A linked-HMM model for robust voicing and speech
detection, In IEEE international conference on acoustics, speech and
signal processing (ICAASSP03), 2003.
[7] Atal B and Rabiner L, A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition, IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-24(3):201-212, June 1976.
[8] Siegel L and Bessey A, Voiced/unvoiced/mixed excitation
classification of speech, IEEE Transactions on Acoustics, Speech and
Signal Processing, 30(3):451-460, 1982.
[9] Shin W, Lee B, Lee Y and Lee J, Speech/Non-speech classification
using multiple features for robust endpoint detection, In 2000 IEEE
International Conference on Acoustics, Speech and Signal Processing,
ICASSP00 Proceedings, Vol.3, 2000.
[10] Kida Y and Kawahara T, Voice activity detection based on optimally
weighted combination of multiple features, In 9th European Conference
on Speech Communication and Technology, ISCA, 2005.
[11] Kvale K, Segmentation and Labeling of Speech, PhD Dissertation,
The Norwegian Institute of Technology, 1993.
[12] Antal M, Speaker Independent Phoneme Classification in Continuous
Speech,Studia Univ. Babes-Bolyal, Informatica, Vol. 49, No. 2, 2004.
[13] Sharma M and Mammone R, Blind speech segmentation: Automatic
segmentation of speech without linguistic knowledge, Spoken
Language, 1996. ICSLP 96. Proceedings. Vol. 2, pp. 1237-1240, 1996.
[14] Sai Jayram A K V, Ramasubramanian V and Sreenivas T V, Robust
parameters for automatic segmentation of speech, Proceedings IEEE
International Conference on Acoustics, Speech and Signal Processing
(ICASSP '02), Vol. 1, pp. 513-516, 2002.
[15] Schiel F, Automatic Phonetic Transcription of Non-Prompted
Speech,Proceedings of the ICPhS 1999. San Francisco, August 1999.
pp. 607-610, 1999.
[16] Knill K and Young S, Hidden Markov Models in Speech and
Language Processing, Kluwer Academic Publishers, pp. 27-68, 1997.
[17] Juang B H and Rabiner L R, Automatic Speech Recognition A Brief
History of The Technology Development, Elsevier Encyclopedia of
Language and Linguistics, Second Edition, 2005.
[18] Tong Zhang and C.-C. Jay Kuo, Hierarchical classification of audio data for archiving and retrieving, In International Conference on Acoustics, Speech and Signal Processing, volume VI, pages 3001-3004. IEEE,
1999.
[19] L R Rabiner and M R Sambur, An Algorithm for determining the
endpoints of Isolated Utterances, The Bell System Technical Journal,
February 1975, pp 298-315.
[20] Niederjohn R and Grotelueschen J, The enhancement of speech
intelligibility in high noise level by high-pass filtering followed by rapid
amplitude compression, IEEE Transactions on Acoustics, Speech and
Signal Processing, 24(4), pp. 277-282, 1976.
[21] T Giannakopoulos, Study and application of acoustic information for
the detection of harmful content and fusion with visual information
Ph.D. dissertation, Dept. of Informatics and Telecommunications,
University of Athens, Greece, 2009.
[22] Cooley, James W. and Tukey, John W. An algorithm for the machine
calculation of complex Fourier series, Mathematics of Computation:
Journal Review, 19: 297-301, 1965.
[23] Bello J P, Daudet L, Abdallah S, Duxbury C, Davies M, and Sandler
MB, A Tutorial on Onset Detection in Music Signals", IEEE
Transactions on Speech and Audio Processing, 13(5), pp. 1035-1047,
2005.
[24] T Giannakopoulos, A Pikrakis and S Theodoridis, A Novel Efficient
Approach for Audio Segmentation, Proceedings of the 19th
International Conference on Pattern Recognition (ICPR2008),
December 8-11 2008, Tampa, Florida, USA.
[25] Waheed K, Weaver K and Salam F, A robust algorithm for detecting
speech segments using an entropic contrast, In Proc. of the IEEE
Midwest Symposium on Circuits and Systems, Vol.45, Lida Ray
Technologies Inc., 2002.
AUTHORS PROFILE
Md. Mijanur Rahman
Mr. Md. Mijanur Rahman is working as
Assistant Professor of the Dept. of Computer
Science and Engineering at Jatiya Kabi Kazi
Nazrul Islam University, Trishal, Mymensingh,
Bangladesh. Mr. Rahman obtained his B. Sc.
(Hons) and M. Sc degrees, both with first class
first in Computer Science and Engineering from
Islamic University, Kushtia, Bangladesh. Now
he is a PhD researcher in the Department of Computer Science and Engineering at Jahangirnagar University, Savar, Dhaka, Bangladesh.
His teaching and research interests lie in areas such as Digital Speech Processing, Pattern Recognition, Database Management Systems, Artificial
Intelligence, etc. He has got many research articles published in both national
and international journals.
Dr. Md. Al-Amin Bhuiyan
Dr. Md. Al-Amin Bhuiyan is serving as a
professor of the Dept. of Computer Science and
Engineering. Dr. Bhuiyan obtained his B. Sc.
(Hons) and M. Sc degrees both with first class in
Applied Physics & Electronics from University
of Dhaka, Bangladesh. He completed his PhD
study in Information & Communication
Engineering from Osaka City University, Japan.
His teaching and research interests lie in areas
such as Image Processing, Computer Vision, Computer Graphics, Pattern
Recognition, Soft Computing, Artificial Intelligence, Robotics, etc. He has got
many articles published in both national and international journals.
A Randomized Fully Polynomial-time Approximation
Scheme for Weighted Perfect Matching in the Plane
Yasser M. Abd El-Latif
Faculty of Science,
Ain Shams University,
Cairo, Egypt
Salwa M. Ali
Faculty of Science,
Ain Shams University,
Cairo, Egypt
Hanaa A.E. Essa
Faculty of Science,
Tanta University,
Tanta, Egypt
Soheir M. Khamis
Faculty of Science,
Ain Shams University,
Cairo, Egypt
Abstract In the approximate Euclidean min-weighted perfect matching problem, a set V of 2n points in the plane and a real number ε > 0 are given. Usually, a solution of this problem is a partition of the points of V into n pairs such that the sum of the distances between the paired points is at most (1 + ε) times the optimal solution.
In this paper, the authors give a randomized algorithm which follows a Monte-Carlo method. This algorithm is a randomized fully polynomial-time approximation scheme for the given problem. Fortunately, the suggested algorithm is one that tackles the matching problem in both the Euclidean nonbipartite and bipartite cases.
The presented algorithm is outlined as follows. Repeating ⌈1/ε⌉ times, we choose a point from V to build a suitable pair satisfying the suggested condition on the distance. If this condition is achieved, the points of the constructed pair are removed from V and the pair is put in M (the output set of the solution). Then, a point and its nearest point among the remaining points in V are chosen to construct a pair, which is put in M; the two points of the constructed pair are removed from V, and this process is repeated until V becomes an empty set. Obviously, this method is very simple. Furthermore, our algorithm can be applied without any modification on complete weighted graphs K_m and complete weighted bipartite graphs K_{n,n}, where n, m > 1 and m is even.
Keywords- Perfect matching; approximation algorithm; Monte-Carlo technique; randomized fully polynomial-time approximation scheme; randomized algorithm.
I. INTRODUCTION
In this paper, the authors deal with the Euclidean min-weighted perfect matching problem. This problem and its
special cases are very important since they have several
applications in many fields such as operations research,
pattern recognition, shape matching, statistics, and VLSI, see
[1], [2], and [3].
The previous studies treated the underlying problem in several versions, e.g., (un-)weighted general graphs, bipartite graphs, and the case of a set of points in the Euclidean plane. For un-weighted bipartite graphs, Hopcroft and Karp showed that maximum-cardinality matchings can be computed in O(m√n) time [1]. In [4], Micali and Vazirani introduced an O(m√n) algorithm for computing maximum-cardinality matchings on un-weighted graphs. Goel et al. [5] presented an O(n log n) algorithm for computing perfect matchings on un-weighted regular bipartite graphs.
In 1955, Kuhn used the Hungarian method to solve the assignment problem; he introduced the first polynomial-time algorithm on weighted bipartite graphs with n vertices for computing a min-weighted perfect matching in O(n³) time [6]. The matching problem on complete weighted graphs with 2n vertices was solved in O(n⁴) time by Edmonds' algorithm [7]. In [8], Gabow improved Edmonds' algorithm to achieve O(n(m + n log n)) time, where m is the number of edges in the graph. For the Euclidean versions of the matching problem, Vaidya [9] showed that geometry can be exploited to get algorithms running in O(n^{5/2} log^{O(1)} n) time for both the bipartite and nonbipartite versions. Agarwal et al. [10] improved this running time for the bipartite case to O(n^{2+δ}), where δ > 0 is an arbitrarily small constant. For the nonbipartite case, the running time was improved to O(n^{3/2} polylog n); see details in [11].
Several researchers have made further efforts to find good approximation algorithms for optimal matching that are faster and simpler than the algorithms that obtain the optimal solutions (e.g., [12], [13]).
In [13], Mirjam and Roger gave an approximation algorithm for constructing a minimum-cost perfect matching on complete graphs whose cost functions satisfy the triangle inequality. The running time of that algorithm is O(n² log n) and its approximation ratio is log n. For the bipartite version on two disjoint n-point sets in the plane, Agarwal and Varadarajan [14] showed that an ε-approximate matching can be computed in O((n/ε)^{3/2} log⁵ n) time. In [15], Agarwal and Varadarajan proposed a Monte Carlo algorithm for computing an O(log(1/ε))-approximate matching in O(n^{1+ε}) time. Based on Agarwal's ideas, Indyk [16] presented an O(n log^{O(1)} n) algorithm with probability at least 1/2 and a cost at most O(1) times the cost of the optimal matching.
Sharathkumar and Agarwal [17] introduced a Monte-Carlo
algorithm that computes an ε-approximate matching in O((n/ε^{O(d)}) polylog n) time with high probability.
In this paper, a randomized approximation algorithm for the Euclidean min-weighted perfect matching problem is demonstrated. The algorithm follows the Monte-Carlo technique: it computes an ε-approximate Euclidean matching of a set of 2n points with probability at least 1/2, and this probability improves as the algorithm is run repeatedly. We show that the algorithm is an RFPTAS (Randomized Fully Polynomial-Time Approximation Scheme) for the underlying problem. We apply the algorithm to the Euclidean cases having 2n points and to general complete weighted graphs having 2n vertices.
The paper is organized as follows. In the next section,
some basic definitions and concepts needed for the description
of the topic are introduced. In section III, the description of
our randomized algorithm for the min-weighted perfect
matching problem in the plane is given. It is also proven that
the given algorithm is an RFPTAS. Moreover, the same section
contains some results of computational experiments on some
classes of graphs. Finally, a conclusion of the paper is
introduced in section IV.
II. BASIC DEFINITIONS AND CONCEPTS
Henceforth, let V be a set of 2n points in the plane. A matching of V is a collection M of pairs of V such that no point in V belongs to more than one pair in M [14]. A perfect matching of V is a matching of V in which every point in V belongs to exactly one pair of M; thus, a perfect matching of V has n pairs. Let d(u, v) denote the specified distance between u and v. The weight of a matching M is defined by summing the Euclidean distances between the paired points. For ε > 0, an ε-approximate perfect matching M is a perfect matching such that the weight of M is at most (1 + ε) times the weight of a minimum-weighted perfect matching [17]. In this paper, we consider the approximate Euclidean min-weighted perfect matching problem, which asks for an ε-approximate perfect matching of V.
Let U be an optimization problem. An algorithm A is called a Randomized Fully Polynomial-Time Approximation Scheme (RFPTAS) [18] for U if there exists a function p : N x R+ -> N such that, for every input (x, ε):
1. Pr(A(x, ε) ∈ M(x)) = 1 {for every random choice, A computes a feasible solution of U};
2. Pr(R(x, ε) ≤ 1 + ε) ≥ 1/2 {a feasible solution whose approximation ratio is at most (1 + ε) is produced with probability at least 1/2}; and
3. Time_A(x, ε) ≤ p(|x|, ε⁻¹), where p is a polynomial in both its arguments |x| and ε⁻¹.
In the approximate Euclidean bipartite min-weighted perfect matching problem, a real number ε > 0 and two disjoint sets V₁ and V₂, each of which has n points, are given. In this version, if a pair (u, v) belongs to the matching, then u ∈ V₁ and v ∈ V₂. The solution of this problem is an ε-approximate perfect bipartite matching of V₁ ∪ V₂.
In the following, the definition of an admissible pair is given; this definition is central to the main condition of our algorithm. We first introduce the meaning of a near point. Suppose V is a set of points. For any v ∈ V, a point u ∈ V is called a near point of v if d(v, u) ≤ d(v, u′) for all u′ ∈ V.
Definition 2.1. Let V be a set of points. For any u, v ∈ V, a pair (u, v) is called an admissible pair if d(u, v) ≤ (1 + ε) d(u, u′) or d(u, v) ≤ (1 + ε) d(v, v′); otherwise it is called an inadmissible pair, where u′ and v′ are the near points of u and v, respectively.
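To make the definitions concrete, the sketch below implements the near-point and admissible-pair tests, together with one reading of the two-phase procedure outlined in the abstract; it is an illustrative interpretation under these assumptions, not the authors' exact pseudocode.

```python
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def near_point(v, pts):
    """A near point of v: a closest point of the set other than v."""
    return min((p for p in pts if p != v), key=lambda p: dist(v, p))

def admissible(u, v, pts, eps):
    """Definition 2.1: d(u, v) is within a (1 + eps) factor of the
    distance from u (or from v) to its near point."""
    return (dist(u, v) <= (1 + eps) * dist(u, near_point(u, pts)) or
            dist(u, v) <= (1 + eps) * dist(v, near_point(v, pts)))

def approx_matching(points, eps):
    """Two-phase sketch: bounded randomized admissible-pair attempts,
    then greedy pairing of each point with its nearest neighbor."""
    V, M = list(points), []
    for _ in range(math.ceil(1 / eps)):      # randomized phase
        if len(V) < 2:
            break
        u, v = random.sample(V, 2)
        if admissible(u, v, V, eps):
            M.append((u, v)); V.remove(u); V.remove(v)
    while len(V) >= 2:                       # greedy nearest-point phase
        u = V[0]
        v = near_point(u, V)
        M.append((u, v)); V.remove(u); V.remove(v)
    return M
```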
Let V be a set of points with |V| = 2n. For each pair e = (u, v) with u, v ∈ V, we consider a random variable X_e defined by X_e = 1 if e is an inadmissible pair and X_e = 0 otherwise. Since each point u ∈ V has at least one point u′ ∈ V such that the pair e = (u, u′) is an admissible pair (e.g., u′ is the near point of u), we have Pr(X_e = 1) ≤ (2n − 2)/(2n)² = (n − 1)/(2n²) and Pr(X_e = 0) ≥ 1 − (n − 1)/(2n²). Let X be a random variable defined by X = Σ_{e∈M} X_e, where M is any matching of V. This means that X counts the number of inadmissible pairs in any matching M.
Let v ∈ V and V′ ⊂ V, and assume that v is a near point of every point in V′. The next lemma gives an upper bound on the cardinality of V′.
Lemma 2.1. Let V be a set of 2n points in the plane. If v ∈ V and V′ ⊂ V are such that v is a near point of every point in V′, then |V′| ≤ 6.
Proof. Assume that |V′| ≥ 7 and v is a near point of every point in V′. First, consider |V′| = 7, i.e., V′ = {u₁, u₂, ..., u₇}. In the plane, connect the point v to all points in V′ as shown in Fig. 1. Clearly, from computational geometry, there exists at least one angle α < 60°. Consider w.l.o.g.
α = ∠u₁vu₇. To complete the proof, connect u₁ to u₇. From Fig. 1, x = d(v, u₁), y = d(v, u₇), and z = d(u₁, u₇). Note that x ≤ y; since α < 60°, it follows that z < y, which contradicts the assumption that v is a near point of u₇.
(3)
where V_sat and L_sat denote the ground speed of the satellite and the coverage boundary length, respectively, and D_L(V_sat·t) denotes the linear density of nodes on the coverage boundary at time t.
B. PatHO-LEO: In the PatHO-LEO model, the local forwarding and paging scheme create some additional cost. The total cost of the PatHO-LEO model, C_PatHO-LEO(t), is

C_PatHO-LEO(t) = M·H_MN,LD + M·H_AR,AR·R_HO(t) + {M·H_AR,AR·(S−1) + M·S}·n(t)·(1−)   (4)

where H_AR,AR and S denote the number of hops between two adjacent satellites and the number of single-beam satellites that cover a single paging area, respectively; n(t) and  denote the total number of MNs per coverage area at time t and the ratio of active MNs to the total number of MNs, respectively. The rate of new connections to an MN is denoted as .
C. Proposed Work: In our proposed method we have
introduced the billboard manager (BM) so that we have
successfully removed the scanning cost; it does, however, introduce a messaging cost, which we evaluate now.
1. Messaging cost in a day: every channel periodically sends messages to the BM containing two pieces of information, the channel capacity and the signal strength. If n is the total number of satellites and t (in seconds) is the time interval between sending two successive messages, the total number of messages in a day will be

C_MSG,DAY = 24·{(M·H_Sat,BM·3600)/t}·n   (5)
2. Messaging cost per handover: when the signal strength and the signal-to-noise ratio decrease below the threshold level, every MN sends the handover request HR_REQ; after performing the BMBHO algorithm, the BM sends the IPs of the current satellite and the adjacent satellite to the MN. The required messaging cost between the MN and the BM, C_MN,BM, is

C_MN,BM = 2·M·H_MN,BM   (6)
The message transfer between the satellites and the BM also consists of two messages containing the IP address of the MN: one to the current satellite and another to the adjacent satellite. So the total messaging cost between the BM and the satellites, C_Sat,BM, is

C_Sat,BM = M·(H_CSat,BM + H_AdSat,BM)   (7)

where H_CSat,BM is the message transfer between the current satellite and the BM, and H_AdSat,BM is the message transfer between the adjacent satellite and the BM.
If the handover involves only one satellite, then H_AdSat,BM = 0. Let K be the total number of handovers in a day and L be the number of handovers involving only one satellite; then equation (7) reduces to

C_Sat,BM = M·{H_CSat,BM·(K−L) + H_AdSat,BM·L}   (8)
So the total messaging cost for handover, C_MSG,HO, is

C_MSG,HO = C_MN,BM + C_Sat,BM = 2·M·H_MN,BM + M·{H_CSat,BM·(K−L) + H_AdSat,BM·L}   (9)

and the total messaging cost, C_MSG, is

C_MSG = C_MSG,DAY + C_MSG,HO = 24·{(M·H_Sat,BM·3600)/t}·n + 2·M·H_MN,BM + M·{H_CSat,BM·(K−L) + H_AdSat,BM·L}   (10)

So the total cost of handover, C_Tot, is

C_Tot = (C_MSG + M·H_AR,AR)·R_HO(t)
      = ({24·{(M·H_Sat,BM·3600)/t}·n + 2·M·H_MN,BM + M·{H_CSat,BM·(K−L) + H_AdSat,BM·L}} + M·H_AR,AR)·R_HO(t)   (11)

Equation (11) represents the total cost of the algorithm-based BMBHO.
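For illustration, equations (5)-(11) translate directly into a short function; the argument names mirror the paper's symbols, and no parameter values are assumed.

```python
def total_handover_cost(M, H_sat_bm, H_mn_bm, H_csat_bm, H_adsat_bm,
                        H_ar_ar, n, t, K, L, R_ho):
    """Total cost of the algorithm-based BMBHO, per equations (5)-(11)."""
    c_msg_day = 24 * ((M * H_sat_bm * 3600) / t) * n        # Eq. (5)
    c_mn_bm = 2 * M * H_mn_bm                               # Eq. (6)
    c_sat_bm = M * (H_csat_bm * (K - L) + H_adsat_bm * L)   # Eq. (8)
    c_msg_ho = c_mn_bm + c_sat_bm                           # Eq. (9)
    c_msg = c_msg_day + c_msg_ho                            # Eq. (10)
    return (c_msg + M * H_ar_ar) * R_ho                     # Eq. (11)
```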
IV. SIMULATION RESULTS
In order to evaluate the performance of the new algorithm-based BMBHO, we compared it with MIP, SeaHO-LEO, and the previous BMBHO. Each algorithm is evaluated by analyzing the handoff delay, the forced call termination probability, the MNs' throughput, and the efficiency. The simulations were run in MATLAB 7.8 in a designed virtual environment.
Fig 5: Simulation results of MNs handover throughput
In figure 5 we compare the handover throughput of MIP, SeaHO-LEO, BMBHO, and the algorithm-based BMBHO during a handover. Due to the tunneling between the HA and the FA in the Mobile IP network, the throughput of the channel between MN1/CN and MN2/MN converges to zero during handover; once the handover is completed, the throughput reaches a reasonable value. SeaHO-LEO throughput is better than MIP during handover, as it does not drop to zero. In BMBHO the throughput is higher than in SeaHO-LEO because the handover takes very little time and the packets during handover are sent over the old link. The algorithm-based BMBHO is better still, as it has a specific algorithm to choose the best satellite for establishing the connection.
In figure 6 we compare the handoff latency of MIP, SeaHO-LEO, BMBHO, and the algorithm-based BMBHO. Comparing all the results, we can conclude that, owing to the omission of the scanning process, the handoff delay is much lower in BMBHO than in the other two schemes. The algorithm-based BMBHO is the BMBHO augmented with just an algorithm to select the satellite; its handoff latency is therefore lower than that of MIP and SeaHO-LEO, and since it finds the easiest path and takes less time to establish the connection by following the specific algorithm, its handoff latency is lower than that of the plain BMBHO as well.
(Figure 5 plot: handover throughput in packets versus time, with curves for MIP, SeaHO-LEO, BMBHO, and the algorithm-based BMBHO.)
Fig 6: Simulation results of handover latency
Figure7: Forced call termination probability of a handover call
In figure 7 we compare the forced call termination probability of MIP, SeaHO-LEO, and BMBHO with that of the algorithm-based BMBHO. Among these management models, BMBHO has the lowest forced call termination probability. In MIP there is channel allocation, so the MN has to wait, and if there is no free channel within the handoff time the call is terminated. In SeaHO-LEO the MN has to wait for the agent advertisement from a new satellite; if it is not received within the handoff time, the call is terminated. But in BMBHO the number of channels available in the satellites seen by the MN at the time of handoff is already known to the BM, so the BM selects for the MN a new satellite that has a free channel, and the forced call termination probability is reduced. In our approach, since there is a specific algorithm for choosing the satellite, the best satellite with the maximum channel capacity is chosen, so the connection is efficient, and so are the communication and data transfer. That is why the call termination probability is almost equal to zero: the results show that even for 2200 calls per minute the forced call termination probability is about 0.07, whereas for MIP it is 0.25, for SeaHO-LEO 0.21, and for BMBHO 0.17.
V. CONCLUSION
In this paper we have evaluated the total cost of the algorithm-based BMBHO management scheme, where we have shown that the specific algorithm can reduce handover latency, data loss, scanning time, cost, and forced call termination probability, as well as increase the MNs' throughput and efficiency. We first described what handover is and the handover process. We then described the standard handover mechanism MIP, as well as SeaHO-LEO, PatHO-LEO, and our BMBHO, together with their drawbacks, and presented the specific algorithm that reduces the drawbacks of BMBHO. Finally, we evaluated the total cost of the algorithm-based BMBHO. Relying on the simulation results, we showed that our proposed mechanism reduces handoff latency and data loss. The algorithm helps the BM choose the best satellite for handover, so that the call quality increases and the call dropping probability approaches zero. Our method is more efficient than the standard one at establishing the best connection.
FUTURE WORK
In the future we will focus on how to further reduce the cost of the algorithm-based BMBHO in LEO satellite networks.
AUTHORS PROFILE
Suman Kumar Sikdar is currently pursuing his PhD at Kalyani University. He completed his M.Tech in CSE from Jadavpur University in 2011 and his B.Tech in Computer Science & Engineering from Murshidabad College of Engineering and Technology under West Bengal University of Technology in 2007. His research interests include wireless communication and satellite communication. Email: [email protected]
Soumya Das, son of Mr. Subrata Das and Mrs. Swapna Das, is currently pursuing his B.Tech in Electronics & Communication Engineering at Bengal Institute of Technology under West Bengal University of Technology. His research interests include mobile communication and satellite communication.
Debabrata Sarddar (Assistant Professor at Kalyani University) is currently pursuing his PhD at Jadavpur University. He completed his M.Tech in Computer Science & Engineering from DAVV, Indore in 2006, and his B.Tech in Computer Science & Engineering from Regional Engineering College (NIT), Durgapur in 2001. His research interests include wireless and mobile communication. Email: [email protected]
An agent-based approach for simulating complex systems with spatial dynamics: application to land use planning
Fatimazahra BARRAMOU
Architecture System Team LISER
Hassan II University- AinChock ENSEM Casablanca,
Morocco
Malika ADDOU
Architecture System Team LISER
Hassania School of Public Works EHTP
Casablanca, Morocco
Abstract In this research, a new agent-based approach for simulating complex systems with spatial dynamics is presented. We propose an architecture based on the coupling of two systems: multi-agent systems and geographic information systems. We also propose a generic model of agent-oriented simulation that we apply to the field of land use planning. In fact, simulating the evolution of the urban system is key to helping decision makers anticipate the needs of the city in terms of installing new equipment and opening new urbanization areas to house the new population.
Keywords-Multi-agent system; Geographic information system;
Modeling; Simulation; Complex system; Land use planning.
I. INTRODUCTION
A great part of the challenge of modeling and simulating interactions between natural and social processes has to do with the fact that processes in these systems result in complex temporal-spatial behavior. To deal with this problem, we propose a general architecture based on two modules:
A GIS module: to instantiate the simulation engine;
A MAS module: to represent the interactions between the different agents of the system.
We also propose a generic agent-oriented model for
simulating complex systems with spatial dynamics.
We apply our model to the field of land use planning. In fact, we will study the urban system and simulate its long-term evolution against a backdrop of demographic change in a temporally and spatially heterogeneous environment. Our objective is to understand how public policy can influence the transformation of cities, especially with regard to the opening of new urban areas and the installation of new equipment.
II. STATE OF THE ART
In this section, we detail multi-agent systems, geographic information systems, complex systems, and agent-based modeling. These points are at the core of our simulation model design.
A. Multi agents approach
1) Notion of agent
An agent is a physical or virtual feature, which owns all or
part of the following:[1]
Located in an environment: means that the agent can
receive sensory input from its environment and can
perform actions that are likely to change this environment.
Independent: means that the agent is able to act without
the direct intervention of a human (or another agent) and
he has control of its actions and its internal state.
Flexible means that the agent is:
-Able to respond in time: it can sense its environment and
respond quickly to changes taking place.
- Proactive: it does not simply act in response to its
environment; it is also able to behave opportunistically,
led by its aims or its utility function, and take initiatives
when appropriate.
- Social: it is capable of interacting with other agents
(complete tasks or assist others to complete theirs)
2) Multi-agent system
A multi-agent system (MAS) is a system composed of a set of agents situated in some environment and interacting according to certain relations. There are four types of agent architecture [2]:
Reactive agent: responds to changes in the environment.
Deliberative agent: deliberates to choose its actions based on its goals.
Hybrid agent: includes a deliberative as well as a reactive component.
Learner agent: uses its perceptions not only to choose its actions, but also to improve its ability to act in the future.
B. Geographic Information System
1) Definition
According to the Environmental Systems Research
Institute (ESRI), a geographic information system (GIS)
integrates hardware, software, and data for capturing,
managing, analyzing, and displaying all forms of
geographically referenced information.
Figure 1. Geographic Information System.
2) Functions of GIS
The Functions of GIS describe the steps that have to be
taken to implement a GIS. These steps have to be followed in
order to obtain a systematic and efficient system. The steps
involved are:
Data Capture: Data used in GIS often come from many
sources. Data sources are mainly obtained from Manual
Digitization and Scanning of aerial photographs, paper
maps, and existing digital data sets. Remote-sensing
satellite imagery and GPS are promising data input
sources for GIS.
Data Compilation: Following the digitization of map
features, the user completes the compilation phase by
relating all spatial features to their respective attributes,
and by cleaning up and correcting errors introduced as a
result of the data conversion process.
Data Storage: Once the data have been digitally compiled,
digital map files in the GIS are stored on magnetic or
other digital media. Data storage is based on a Generic
Data Model that is used to convert map data into a digital
form. The two most common types of data models are
Raster and Vector.
Manipulation: Once data are stored in a GIS, many manipulation options are available to users. These functions are often available in the form of "toolkits."
Analysis: The heart of GIS is the analytical capability of the system. The analysis functions use the spatial and non-spatial attributes in the database to answer questions about the real world. Geographic analysis facilitates the study of real-world processes by developing and applying models; its results can be communicated with the help of maps, reports, or both.
C. Complex system
A complex system is a set consisting of a large number of interacting entities that prevent the observer from predicting its feedback, behavior, or evolution by calculation. It is characterized by two main properties [3]-[4]:
Emergence is the appearance of behavior that could not be
anticipated from the knowledge of the parts of the system
alone
Self-organization means that there is no external
controller or planner engineering the appearance of the
emergent features
Traditional modeling approaches for complex systems focus on either temporal or spatial variation, but not both. To understand dynamic systems, patterns in time and space need to be examined together.
The architecture proposed in this study meshes these fields together in an approach called Simulating Complex Systems with Spatial Dynamics.
D. Agent-Based Modeling of Complex Systems with spatial
dynamics
We note that Agent-Based Modeling (ABM) is well
suited to complex systems with spatial dynamics. In fact, the
various components of these systems can be represented by
agents.
We mention below four characteristics that show why
Agent-Based Modeling is well suited to complex systems with
spatial dynamics [4]-[5]:
Emergence: ABMs allow defining the low-level behavior of each individual agent and letting the agents interact (over time and space) to see whether some emergent property arises or not, and if it does, under which circumstances.
Self-Organization: ABMs do not have any kind of central
intelligence that governs all agents. On the contrary, the
sole interaction among agents along with their feedbacks
is what ultimately controls the system. This lack of a
centralized control is what enables (and enforces) its self-
organization.
Coupled Human-Natural Systems: ABMs allow considering together social organizations, with their human decision-making, and biophysical processes with natural resources. This conjunction of subsystems enables ABMs to explore the interrelations between them, allowing analysis of the consequences of one over the other.
Spatially Explicit: the feature of ABMs of being able to
spatially represent an agent or a resource is of particular
interest when communications and interactions among
neighbors is a key issue. This can either imply some kind
of internal representation of space or even the use of a
Geographical Information System (GIS) with real data.
This feature is of special interest in the case of agro-
ecosystems.
III. AGENT-ORIENTED SIMULATION VS CLASSIC
SIMULATION
A. Definitions
To simulate is to reproduce a phenomenon in order to [6]:
Test hypotheses to explain the phenomenon;
Predict the evolution of the phenomenon.
Figure 2. Simulation process.
A model is a simplified picture of reality that we use to understand the functioning of a system with respect to a given question.
B. Classic simulation
1) Continuous models
Continuous time models are characterized by the fact that
in a finite time interval, the system state variables change
value continuously [6]-[7].
Figure 3. Continuous models.
The simulation of continuous models encounters the
difficulty of reproducing the continuity of the system
dynamics due to the nature of the digital computer (solution:
use of numerical integration methods)
2) Discrete models
The state variables of the system evolve discretely, from a time t to t + dt [7].
Figure 4. Discrete models
The simulation of discrete models can be summarized in
two steps:
Determine the function that implements the system
dynamics.
Increment time by one unit.
In this type of simulation the choice of the time step is very important:
With a large time step: problems arise in managing concurrent events, and events occurring in rapid succession are not considered.
With a small time step: the problem of decomposing behavior into elementary behaviors arises.
3) Discrete event models
The time axis is generally continuous; it is represented by a real number. However, unlike continuous models, the system state variables change discretely at precise moments called events.
called events.
Figure 5. Discrete event models
In this type of simulation, there are three implementation policies:
Scheduling of events: during the simulation, the future events that must occur in the system are predetermined;
Analysis of activities: during the simulation, events are not planned but triggered when certain conditions are met (e.g., the collision of two vehicles);
Process interaction: this approach is the combination of the other two.
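As an illustration of the event-scheduling policy, a minimal discrete-event loop can be sketched as follows; this generic Python sketch is an assumption for exposition and is not taken from the cited works.

```python
import heapq
import itertools

def run_discrete_event(initial_events, horizon):
    """Minimal event-scheduling loop: the state changes only at event
    times; each action, called with the current time, may return new
    (time, action) events to schedule."""
    counter = itertools.count()                 # tie-breaker for equal times
    agenda = [(t, next(counter), a) for t, a in initial_events]
    heapq.heapify(agenda)
    while agenda:
        t, _, action = heapq.heappop(agenda)    # next scheduled event
        if t > horizon:
            break
        for nt, na in action(t):                # newly triggered events
            heapq.heappush(agenda, (nt, next(counter), na))
```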
C. Limits of Classical Simulation
In classical simulation, we find the following limits [6]:
Equation-based models with a large number of parameters;
Difficulty in moving between the micro and macro levels, and inability to represent different levels;
No representation of behaviors, only of their results;
Lack of realism (social sciences, ...);
Failure to explain the emergence of space-time structures;
Difficulty in modeling action.
D. Agent-Oriented Simulation
Agent-Oriented Simulation (AOS) is now used in a
growing number of sectors, where it gradually replaces the
various techniques of micro simulation and object-oriented
simulation. This is due to its fundamental characteristics
which are [6]-[9]:
Representation as computer agents (state, skills, abilities,
resources);
Representation of possible interactions between agents;
Representation of space-time environment (agent is
located).
This apparent versatility makes AOS the best choice for simulating complex systems, and it is spreading to a growing number of areas: sociology, biology, physics, chemistry, ecology, economics, land use planning, etc.
The four aspects of a simulation model are:
Figure 6. Agent oriented simulation model
Behavior module: concerns the modeling of the agents' deliberative processes;
Environment module: the problem is to define the various physical objects of the world and the dynamics of the environment;
Scheduling module: concerns the modeling of the flow of time and the definition of the scheduling used;
Interaction module: relates specifically to modeling the result of the actions and the interactions they entail at time t.
IV. PROPOSED APPROACH
A. System Architecture
The simultaneous use of a geographic information system and a multi-agent system introduces an additional level for modeling complex systems with a spatial component. We propose the following architecture:
Figure 7. Proposed architecture.
The module "GIS" is a descriptive representation of the
reality that is used to initialize the multi-agent model, it
contains the data to instantiate each one of the agents;
The module MAS is used to model the real system and
generate simulations involving agents that react and
interact and so it can test scenarios.
B. Generic model of agent-oriented simulation
To simulate complex systems with a spatial component, we propose the following generic model:
Figure 8. Generic model for agent-oriented simulation.
The environment is heterogeneous, composed of objects and spatial agents;
Spatial agent: the basic entity that represents the spatial part of our simulator;
Manager agent: the agent that creates and acts on all the agents of the environment. It interacts with the scheduler agent to determine the scheduling of its actions on the agents, and it is responsible for the initialization of the simulation;
Scheduler agent: its role is to schedule the actions of the agents on the environment;
Decision-maker agent: it holds several rules that must be applied to the environment.
V. APPLICATION
A. Problematic of land use planning
We propose to apply our architecture to the field of land use planning. Our goal is to understand how public policy can influence the transformation of cities, especially with regard to the opening of new urban areas and the installation of new equipment. The simulation will determine when, during the dynamics of the city, public policy can act and have the most impact. As the modeling context we chose the city of Casablanca, and more specifically the district of Dar Bouazza.
1) Population dynamic
The population of Casablanca was established by the 2004 census at approximately 3.63 million residents, 3,325,000 in urban areas and 305,000 in rural areas, with an annual growth of 50,500 persons per year. The district of Dar Bouazza has a population of 115,367 residents with a growth rate of 1.8% [10].
2) Spatial dynamic
Not enough housing: housing remains a subject of major concern for the government. In fact, the insufficient supply of housing is a major problem in the field of land use planning.
Lack of equipment in some regions: the district of Dar Bouazza is experiencing an apparent lack of equipment compared to the normative grid of equipment to provide:
TABLE I. NORMATIVE GRID OF EQUIPMENTS
Nature of equipment | Users | Area (m²)
Teaching: primary school | 1/8000 | 4000
Teaching: school | 1/16000 | 8000
Teaching: high school | 1/32000 | 10000
Health: urban health center | 1/30000 | 500-1000
Health: local hospital | 1/100000 | 15000-30000
Sports: sports area | 1/45000 | 10000
Sports: sports center | 1/150000 | 50000
Public services: marketplace | 1/50000 | 2500-4000
Administrative: police station | 1/45000 | 2000
Administrative: civil protection | 1/100000 | 10000
Administrative services: undefined | 1/45000 | 3000
Cultural: mosque | 1/15000 | 3500
Lack of green spaces: Casablanca suffers from a deficiency of green spaces. The average is less than 1 m² of public green space per resident, compared to the standard of 10 m² per resident of the World Health Organization.
B. Agent-oriented simulation model
We apply our agent-oriented model in the context of land use planning; the system consists of an Agent Manager, an Agent Scheduler, a LandUseAgent, and two decision-maker agents.
Figure 9. Agent-oriented model for simulation applied to the field of land use planning.
Agent Manager: it initializes the simulation and creates the displays. At each iteration, it executes the actions scheduled by the Agent Scheduler;
Agent Scheduler: at each step, it calls the actions of the decision makers in a synchronous manner. The step in our model is one year;
LandUseAgent: this is the entity that constitutes the spatial environment. It has an attribute "State", which contains information about the mode of land use and can take the following values: bare land, built land, industrial area, green space. It also has attributes "potentiel_urban", "potentiel_equip", and "potentiel_environnement", which vary between 0 and 3 and indicate the potential of the cell to be designated for urbanization, to receive equipment, or to be conserved as green space. The cell potential is calculated based on the master plan of Casablanca, which indicates the overall destination of the soil over the next 30 years [10].
UrbanAgent: its objective is to drive urban sprawl for the installation of the new population, expressed as a surface demand. At each cycle, the agent performs the following operations:
Get the value of the new population;
Multiply this value by a stretch ratio set by the user;
Obtain a surface demand.
According to the constraints defined by the user, the agent changes the state of the cells by taking them from its target list (starting from the cells with the highest urban potential).
EquipementAgent: its aim is to provide a number of
facilities to suit the new population; its demand
computation is sketched below. At each cycle, the agent
performs the following operations:
Get the value of the new population;
Calculate the number of existing facilities;
Calculate the amount of equipment required for the
installation of the new population;
Extract a demand.
According to the constraints defined by the user,
EquipementAgent changes the state of the cells by taking
them from its target list (starting from the cells with the
highest potential).
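Likewise, a hedged sketch of the EquipementAgent demand computation, assuming
the "one facility per N residents" reading of Table I (the grid excerpt and
names below are illustrative, not taken from the authors' code):

    import math

    # Illustrative excerpt of the normative grid (facility: residents per unit).
    NORMATIVE_GRID = {"primary school": 8000, "school": 16000,
                      "high school": 32000, "mosque": 15000}

    def equipment_demand(total_population, existing_counts):
        """Number of new facilities required to serve the population."""
        demand = {}
        for kind, residents_per_unit in NORMATIVE_GRID.items():
            required = math.ceil(total_population / residents_per_unit)
            demand[kind] = max(0, required - existing_counts.get(kind, 0))
        return demand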
C. Realization
To implement our model, we chose the multi-agent
simulation platform Repast Simphony 2.0 [11] and the GIS
software ArcGIS.
Our work environment is:
Figure 10. Work environment.
The user can specify the rate of urban growth and the ratio
of urban sprawl through this interface:
Figure 11. Interface for setting the simulation parameters.
The initial state of the simulator is:
Figure 12. Initial state of the simulator.
After twenty iterations, the population of Dar Bouazza
has increased from 115,367 to 156,899 residents. To install the
new population, the UrbanAgent opens new areas to
urbanization. It calculates the demand and changes a number
of cells of the LandUseAgent to built land, starting from the
cells with the highest potential; among cells of equal potential,
the choice is made randomly. We obtain the following result:
Figure 13. Results of the simulation after twenty iterations.
After 20 iterations, we see that the number of facilities to
create at Dar Bouazza is about twenty, as detailed in the
diagram below:
Figure 14. Need for equipment.
The installation of the new population will require the
construction of:
Eleven school facilities, distributed as follows: six
primary schools, three schools, one high school.
Figure 15. Need for school equipment.
Ten cultual facilities (mosques).
Figure 16. Need for cultual equipment.
Two health facilities, distributed as follows:
Figure 17. Need for health equipment.
One sports facility, distributed as follows:
Figure 18. Need for sports equipment.
Administrative services may not need to be created, as
shown below:
Figure 19. Need for administrative services.
One public service:
Figure 20. Need for public services.
VI. CONCLUSION
In this paper, we propose a new approach for simulating
complex systems with spatial dynamics. First, we presented an
architecture based on the coupling of two systems, GIS and
MAS. Then, we proposed an agent-oriented simulation model
and applied it to the field of land use planning. Our objective
is to simulate the evolution of the city based on two dynamics:
population dynamics and spatial dynamics. The results of the
simulation show that after twenty iterations, it will be
necessary to open about 40 ha to urbanization and to construct
about twenty facilities.
As perspectives for our work, we will enrich the knowledge
base of the decision-maker agent. The more this model is
enriched by domain experts (city planners, statisticians,
sociologists, ...), the more it can help decision makers explore
several simulation scenarios. We can also detail the agent
scheduler to show how the scheduling system can influence
the outcome of the simulation.
REFERENCES
[1] B. Chaib-Draa, I. Jarras, B. Moulin, "Systèmes multi-agents : principes
généraux et applications," Hermès, 2001.
[2] J. Ferber, "Les systèmes multi-agents, vers une intelligence collective,"
InterEditions, 1995.
[3] J. Corral Aren, "Agent-based methodology for developing agroecosystems
simulations," Universidad de la República, 2011.
[4] CSIRO, "Complex or just complicated: what is a complex system?,"
csiro.au [Online]. Available:
www.csiro.au/resources/AboutComplexSystems.html [Accessed: Sept. 2011].
[5] T. Moncion, "Modélisation de la complexité et de la dynamique des
simulations multi-agents. Application pour l'analyse des phénomènes
émergents," PhD thesis, Université d'Evry-Val-d'Essonne, 2008.
[6] J. Ferber, "Agent-based simulation," LIRMM, Université Montpellier II,
2009.
[7] D. Davis, "Prospective territoriale par simulation orientée agent," PhD
thesis, La Réunion University, 2011.
[8] D. C. Parker, A. Hessl, S. C. Davis, "Complexity, land-use modeling, and
the human dimension: Fundamental challenges for mapping unknown
outcome spaces," Geoforum, Vol. 39, Issue 2, pp. 789-804, 2008.
[9] F. Bousquet, C. Le Page, "Multi-agent simulations and ecosystem
management: a review," Elsevier, 2004.
[10] IAURIF, "Plan de développement stratégique et schéma directeur de
Casablanca," AUC, 2008.
[11] Repast documentation, repast.sourceforge.net [Online]. Available:
http://repast.sourceforge.net/docs.html [Accessed: October 2012].
Fatimazahra Barramou received her engineering degree in Geographic
Information Systems in 2009 from the Hassania School of Public Works (EHTP:
Ecole Hassania des Travaux Publics), Casablanca, Morocco. In 2010 she
joined the system architecture team of the National High School of
Electricity and Mechanics (ENSEM: Ecole Nationale Supérieure d'Electricité et
de Mécanique), Casablanca, Morocco. Her main research interests
concern modeling and simulating complex systems based on multi-agent
systems and GIS.
Ms. Barramou is currently a Software Engineer at the Urban Agency of
Casablanca.
Malika Addou received her Ph.D. in Artificial Intelligence from the
University of Liège, Liège, Belgium, in 1992. She received her engineering
degree in Computer Systems from the Mohammadia School of Engineers (EMI:
Ecole Mohammadia des Ingénieurs), Rabat, Morocco, in 1982. She has been a
Professor of Computer Science at the Hassania School of Public Works (EHTP:
Ecole Hassania des Travaux Publics), Casablanca, since 1982. Her research
focuses on Software Engineering (methods and technologies for design and
development), Information Systems (Distributed Systems) and Artificial
Intelligence (especially Multi-Agent Systems technologies).
Improving Web Page Prediction Using Default Rule
Selection
Thanakorn Pamutha, Chom Kimpan
Faculty of Information Technology
Rangsit University, RSU
Phathumthani, THAILAND
Siriporn Chimplee
Faculty of Science and Technology
Suan Dusit Rajabhat University, SDU
Bangkok, THAILAND
Parinya Sanguansat
Faculty of Engineering and Technology
Panyapiwat Institute of Management, PIM
Nonthaburi, THAILAND
Abstract—Mining user patterns from web log files can provide
significant and useful knowledge. A large amount of
research has been done in trying to correctly predict the pages a
user will most likely request next. Markov models are the most
commonly used approach for this type of web access prediction.
Web page prediction requires the development of models that can
predict a user's next access to a web server. Many researchers
have proposed novel approaches that integrate Markov models,
association rules and clustering for web site access predictability.
Low-order Markov models provide higher coverage, but they
are couched in ambiguous rules. In this paper, we introduce the
use of a default rule for resolving ambiguous web access predictions.
This method can provide better prediction than using the
individual traditional models. The results show that the
default rule increases the accuracy and model-accuracy of web
page access predictions. It also applies to association rules and
the other combined models.
Keywords—web mining; web usage mining; user navigation session;
Markov model; association rules; web page prediction; rule-
selection methods.
I. INTRODUCTION
The rapid expansion of the World Wide Web has created an
unprecedented opportunity to disseminate and gather
information online. There is an increasing need to study web
user behavior to improve the service to web users and increase
the value of the enterprise. One important data source for this
study is the web-server log data that traces the users' web
browsing actions [1]. Predicting web users' behavior and their
next movement has been recognized and discussed by many
researchers lately. The need to predict the next web page to be
accessed by the users is apparent in most web applications
today, whether they are search engines, e-commerce solutions
or mere marketing sites. Web applications today are designed
to provide a more personalized experience for their users [2].
Accurate predictions can be used for recommending products
to customers, suggesting useful links, as well as pre-sending,
pre-fetching and caching of web pages to reduce access
latency [1]. There are various ways to make such a prediction,
but the most common approaches are Markov models and
association rules. Markov models identify the next page to be
accessed by the web site user based on the sequence of their
previously accessed pages. Association rules can be used to
decide the next likely web page requests based on significant
statistical correlations.
Yang, Li and Wang [1] have studied different association-
rule based methods for web request predictions. In this work,
they have examined two important dimensions in building
prediction models, namely, the type of antecedents of rules and
the criteria for selecting prediction rules. In one dimension,
they have a spectrum of rule representation methods which
are: subset rules, subsequence rules, latest subsequence rules,
substring rules and latest substring rules. In the second
dimension, they have rule-selection methods namely: longest-
match, most-confident and pessimistic selection. They have
concluded that the latest substring representation, coupled with
the pessimistic-selection method, gives the best prediction
performance. The authors of [2-5] have applied the latest
substring representation with the most-confident selection
method to build association-rule based prediction models
from web-log data. In Markov models, the transition
probability matrix is built, making predictions for web
sessions straightforward; the target is to build prediction
models that predict the web pages likely to be requested next.
Consequently, only the highest conditional probability is
considered, and the prediction rule with the highest
conditional probability is chosen.
In the past, researchers have proposed different methods for
building association-rule based prediction models from web
logs, but none has yielded significant results. In this paper, we
propose the default rule for resolving ambiguous predictions.
This method can provide better prediction than using the
traditional models individually.
The rest of this paper is organized as follows:
Sec. 2 Related Works
Sec. 3 Markov Models
Sec. 4 Ambiguous Rules
Sec. 5 Experimental Setup and Results
Sec. 6 Conclusion
II. RELATED WORK
There are wide application areas in analyzing the user web
navigation behaviors in web usage mining [6]. The analysis of
user web navigation behavior can help improve the
organization of web sites and web performance by pre-
fetching and caching the most probable next web page access.
Web Personalization and Adaptive web sites are some of the
applications often used in web usage mining. Web usage
mining can provide guidelines in improving e-commerce to
handle business specific issues like customer satisfaction,
brand loyalty and marketing awareness.
The most widely used approach is web usage mining that
includes many models like the Markov models, association
rules and clustering [5]. However, there are some challenges
with the current state of the art solutions when it comes to
accuracy, coverage and performance. A Markov model is a
popular approach to predict what pages are likely to be
accessed next. Many authors have proposed models for
modeling the user web navigation sessions. Yang, Li and Wang
[1] have studied five different representations of Association
rules which are: subset rules, subsequent rules, latest
subsequence rules, substring rules and latest substring rules.
As a result of the experiments, performed by the authors
concerning the precision of these five association rules
representations using different selection methods, the latest
substring rules were proven to have the highest precision.
Deshpande and Karypis [7] have developed techniques for
intelligently combining different-order Markov models,
resulting in lower state complexity and improved prediction
accuracy while retaining the coverage of the All-Kth-Order
Markov model. Three approaches have been widely used to
reduce the state space complexity, namely frequency-pruning,
error-pruning and support-pruning. Khalil, Li and Wang [2]
have proposed a new framework for predicting the next web
page access. They used lower-order all-kth Markov models to
predict the next page to be accessed. If the Markov model is
not able to predict the next page access, then association
rules are used to predict the next web page. They have also
proposed solutions for prediction
ambiguities. Chimphlee [5] has proposed a hybrid prediction
model (HyMFM) that integrates Markov model, Association
rules and Fuzzy Adaptive Resonance Theory (Fuzzy ART)
clustering all together. These three approaches are integrated
to maximize their strengths. This model could provide better
prediction than using an individual approach. In fact, Khalil,
Li and Wang [4] had introduced the Integration Prediction
Model (IPM) by combining the Markov model, association
rules and clustering algorithm together. The prediction is
performed on the cluster sets rather than the actual sessions.
A. Anitha [8] has proposed to integrate Markov-model-based
sequential pattern mining with clustering. A variant of the
Markov model, the dynamic-support pruned all-kth-order
model, is used to reduce the state space complexity. The
proposed model provides accurate recommendations with
reduced state space complexity. Mayil [9] has proposed to
model users' navigation records as inferred from log data with
a Markov model; an algorithm scans the model to find the
higher-probability trails, which correspond to the users'
preferred web navigation trails.
Predicting a user's next access on a website has attracted a
lot of research work lately due to the positive impact of
predictions on different areas of web-based applications [4].
First, many papers proposed using association rules or Markov
models for next-page prediction [1, 7, 9-11]. Second, many
papers have addressed combining both methodologies [2-5].
Third, many papers have addressed the integration of Markov
models, association rules and clustering methods; this
combination can provide better predictions than using each
approach individually [5]. It can be inferred that most
researchers use Markov models for this type of prediction,
which is thus the most preferred approach.
In this paper, we study Markov models for predicting a
user's next web request using a default rule. The prediction
models are based on web log data that correspond to users'
behavior; they are used to make predictions for the general
user that are more fitting for a particular client. This prediction
requires the discovery of a web user's sequential access
patterns, and the use of these patterns makes predictions of
users' future accesses more accurate. We can then incorporate
these predictions into a web pre-fetching system in an attempt
to enhance performance.
III. MARKOV MODELS
Markov models are a commonly used method for modeling
stochastic sequences with underlying finite-state structures
and have been shown to be well suited for modeling and
predicting a user's browsing behavior on a web site [7]. The
identification of the next web page to be accessed by a user is
calculated based on the sequence of previously accessed pages.
In general, the input for this problem is the sequence of
web pages that were accessed by a user, and it is assumed that
this sequence satisfies the Markov property: the past is
irrelevant for predicting the future given knowledge of the
present. Let P = {p_1, p_2, ..., p_m} be the set of pages in a
web site, and let W be a user session consisting of the sequence
of pages visited by the user in a visit. Assuming that the user
has visited l pages, P(p_i | W) is the probability that the user
visits page p_i next. Thus:

$$p_{l+1} = \arg\max_{p \in P} \{P(P_{l+1} = p \mid W)\} = \arg\max_{p \in P} \{P(P_{l+1} = p \mid P_l, P_{l-1}, \ldots, P_1)\} \qquad (1)$$
Essentially, this approach computes, for each page p_i, its
probability of being accessed next and then selects the web
page with the highest probability. The probability P(p_i | W)
is estimated using all of W. Naturally, the longer l and the
larger W are, the more accurate the estimate of P(p_i | W).
However, it is not feasible to accurately determine these
conditional probabilities because the sequences may be
arbitrarily long (large l), and the size of the training set is
often much smaller than that required to accurately estimate
the various conditional probabilities for long sequences (large
W). For this reason, the conditional probabilities are commonly
estimated by assuming that the process generating the sequences
of web pages visited by users follows a Markov process. That
is, the probability of visiting a web page p_i does not depend
on all the pages in the web session, but only on a small set of
k preceding pages,
where k << l. Using the Markov process assumption, the web
page p_{l+1} generated next is given by

$$p_{l+1} = \arg\max_{p \in P} \{P(P_{l+1} = p \mid P_l, P_{l-1}, \ldots, P_{l-(k-1)})\} \qquad (2)$$

where k denotes the number of preceding pages and identifies
the order of the Markov model. The resulting model of this
equation is called the k-th order Markov model. To use the
k-th order Markov model, P_{l+1} must be learned for each
sequence of k web pages. Let s = <p_{l-(k-1)}, ..., p_{l-1},
p_l> denote such a sequence. Using the maximum likelihood
principle [5], the conditional probability P(P_{l+1} = p | s)
is estimated from the frequency of the sequence s followed by
p and the frequency of s itself; the conditional probability is
the ratio of these two frequencies:

$$P(P_{l+1} = p \mid s) = \frac{\mathrm{freq}(\langle p_{l-(k-1)}, \ldots, p_l, p \rangle)}{\mathrm{freq}(\langle p_{l-(k-1)}, \ldots, p_l \rangle)} \qquad (3)$$
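For illustration, a minimal Python sketch of this maximum-likelihood
estimation for a first-order model (k = 1); this is our own illustration,
not the authors' MATLAB implementation:

    from collections import Counter, defaultdict

    def first_order_model(sessions):
        """Estimate P(next | current) from page-id sessions (k = 1, Eq. 3)."""
        pair_freq = Counter()   # freq(<current, next>)
        page_freq = Counter()   # freq(<current>) as an antecedent
        for s in sessions:
            for cur, nxt in zip(s, s[1:]):
                pair_freq[(cur, nxt)] += 1
                page_freq[cur] += 1
        # Conditional probability = ratio of the two frequencies (Eq. 3).
        model = defaultdict(dict)
        for (cur, nxt), f in pair_freq.items():
            model[cur][nxt] = f / page_freq[cur]
        return model

    # Example with the six sessions of Table I below:
    sessions = [[900, 586, 594, 618], [900, 868, 594], [868, 586, 594, 618],
                [594, 619, 618], [868, 594, 900, 618], [868, 586, 618, 594, 619]]
    model = first_order_model(sessions)
    print(model[586])   # {594: 0.666..., 618: 0.333...}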
IV. AMBIGUOUS RULES
The main problem with association rules on large item sets
is the discovery of a large number of rules and the difficulty
of identifying those that lead to correct predictions. Non-Markov
models lack web page prediction accuracy because they do not
use enough history of web accesses, whereas Markov models
have a high state space complexity [2-5].
There is an apparent direct relationship between Markov
model and association rule techniques, as shown by the Markov
model pruning methods presented by Deshpande and Karypis
[7] and the association rule selection methods presented by
Yang, Li and Wang [1].
Researchers have proposed different methods for building
association-rule based prediction models from web logs, but
ambiguous rules still exist. In order to solve this problem, we
propose the use of a default rule to keep both the low state
complexity and the high accuracy of the results. We use the
following example to show the idea of creating default rules.
Consider the set of user sessions in Table I; note that numbers
are assigned to web page names. Table I lists the following
six user sessions:
TABLE I. USER SESSIONS
S1 900, 586, 594, 618
S2 900, 868, 594
S3 868, 586, 594, 618
S4 594, 619, 618
S5 868, 594, 900, 618
S6 868, 586, 618, 594, 619
Table II shows an example of counting the support of
extracted web access sequences; the useful sequences are
highlighted. Each row represents the previously visited page
and each column the next visited page. Each field in the
matrix is produced by counting the number of times the web
page on the row is followed by the web page on the column.
In this example, web pages 586 and 594 co-occur in sessions
S1 and S3, and therefore the sequence (586, 594) has a
support of 2.
TABLE II. AN EXAMPLE OF EXTRACTED WEB ACCESS SEQUENCES AND
THEIR SUPPORT COUNTS FOR FIRST-ORDER MARKOV MODELS

First item     Second item in sequence (support count)
in sequence    586    594    618    619    868
586             -      2      1      -      -
594             -      -      2      2      -
618             -      1      -      -      -
619             -      -      1      -      -
868             2      2      -      -      -
900             1      -      1      -      1
The next step is to generate rules from these remaining
sequences. All remaining sequences construct prediction rules
using the maximum likelihood principle. The conditional
probabilities of all sequences are calculated and ranked.
For example, given the antecedent web page 586, the
conditional probability of 586 -> 594 is 66.7%. This is
calculated by dividing the support of the sequence (586, 594)
by the support of web page 586 (2/3 = 66.7%). From the
training data, if the antecedent web page is 586, the single
consequent can be 594 or 618, with confidence levels of 66.7%
and 33.3%, respectively. In this case, 586 -> 594 has the
highest probability value, so the prediction rule with the
highest conditional probability is chosen.
Applying the first-order Markov model to the above
training user sessions, we notice that the most frequent state
is <594>, which appears 6 times. Here

$$p_{l+1} = \arg\max_{p} \{P(p \mid 594)\} = 618 \text{ or } 619$$

Using Markov models, we can determine that there is a
50/50 chance that the next page to be accessed by the user
after page 594 is either 618 or 619. Obviously, this information
alone does not provide a correct prediction of the next page to
be accessed, since both pages 618 and 619 attain the highest
conditional probability.
To break the tie and find out which page leads to the
most accurate prediction, we choose the rule whose right-hand
side (RHS) is the most popular page in the training web log
(page 618 appears 5 times, page 619 appears 2 times). Thus,
we choose the rule 594 -> 618. We refer to this rule as the
default rule.
In this paper, we introduce the default rule for resolving
ambiguous predictions. It can be applied to all Markov models
and association rules. This method avoids the complexity of
high-order Markov models and improves the efficiency of web
access predictions. The algorithm is summarized as follows:
Training:
    Generate rules using the first-order Markov model
Test:
    FOR each session of the test set
        Take the latest substring of the session (giving LHS -> RHS)
        Compare it with the applicable rules
        IF the matching rule provides a non-ambiguous prediction THEN
            The prediction is made by that state
        ELSE  // an ambiguity occurs
            Select the rule whose RHS is the most frequent page
            in the training web log, THEN make the prediction
        ENDIF
    ENDFOR
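A compact Python sketch of this test loop (again our illustration, reusing
the first_order_model helper and the sessions list from the sketch in
Section III):

    from collections import Counter

    def predict_with_default_rule(model, page_freq, session):
        """Predict the next page; break ties with the default rule."""
        current = session[-1]            # latest antecedent (LHS)
        rules = model.get(current)
        if not rules:
            return None                  # no applicable rule: no prediction
        best = max(rules.values())
        candidates = [p for p, prob in rules.items() if prob == best]
        if len(candidates) == 1:
            return candidates[0]         # non-ambiguous prediction
        # Ambiguity: the default rule picks the RHS most popular
        # in the training web log.
        return max(candidates, key=lambda p: page_freq.get(p, 0))

    # Example: after page 594, candidates 618 and 619 tie; page 618 is
    # more frequent overall, so the default rule predicts 618.
    page_freq = Counter(p for s in sessions for p in s)
    print(predict_with_default_rule(model, page_freq, [868, 594]))  # 618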
In this work, we define an ambiguous prediction as one in
which two or more candidate pages have the same conditional
probability under a Markov model, so that a single predictive
page cannot be chosen. An ambiguous prediction potentially
has other definitions, for example, that the certainty of a
prediction is below a threshold; we nonetheless did not explore
those other options in this paper [2].
V. EXPERIMENTAL SETUP AND RESULTS
The experiments used web data collected from the web
server at the NASA Kennedy Space Center from 00:00:00
Aug 1, 1995 through 23:59:59 Aug 31, 1995, a total of
31 days. During this period, 1,569,898 requests were recorded
in the log file (see the example in Fig. 1).
Before doing the experiments, we removed all dynamically
generated content such as CGI scripts. We also filtered out
requests with unsuccessful HTTP response codes [1, 12]; see
the web log pre-processing details in [12].
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET
/history/apollo/ HTTP/1.0" 200 6245
Figure 1. A fragment from the server logs file
The next step was to identify the user sessions. Session
identification splits the pages accessed by each IP address,
which serves as a unique identity, using a timeout: if the time
between page requests exceeds a certain limit, it is assumed
that the user has started a new session.
A default timeout of 30 minutes is used. Figure 2 shows
a fragment of the session identification result. There were a
total of 71,242 unique visiting IP addresses, 935 unique pages
and 130,976 sessions. The frequency of each page visited by
the users was also calculated. The page access frequency is
shown in Figure 3, which reveals that page number 295 is the
most frequently accessed page, with 41,109 accesses.
S1: 634, 391, 396, 408
S2: 393, 392, 400, 398, 396, 408, 37, 53
S3: 91, 124, 206, 101, 42, 287, 277
S4: 634, 391, 396, 631
S5: 124, 125, 127, 123, 130, 126, 131, 128, 129, 83
S6: 391, 634, 633, 295
S7: 295, 277, 91, 919
S8: 935, 391, 631, 634
S9: 755, 295, 810
S10: 637, 391, 918
Figure 2. A fragment from session identification result
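A minimal Python sketch of this sessionization step (our illustration; it
assumes the log records have already been parsed into (ip, unix_time, page)
triples):

    from collections import defaultdict

    TIMEOUT = 30 * 60  # 30-minute default timeout, in seconds

    def sessionize(requests):
        """Split (ip, unix_time, page) records into per-user sessions."""
        by_ip = defaultdict(list)
        for ip, t, page in sorted(requests, key=lambda r: r[1]):
            by_ip[ip].append((t, page))
        sessions = []
        for ip, hits in by_ip.items():
            current = [hits[0][1]]
            for (t_prev, _), (t, page) in zip(hits, hits[1:]):
                if t - t_prev > TIMEOUT:   # gap too long: start a new session
                    sessions.append(current)
                    current = []
                current.append(page)
            sessions.append(current)
        return sessions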
Figure 3. Page access frequency chart.
All experiments were conducted on an Intel Core i7 2.20 GHz
PC with 8 GB DDR RAM, running Windows 7 Home Premium.
The algorithms were implemented using MATLAB.
The model is then evaluated for accuracy on a previously
unseen set of web sessions, called the test set. The trained
model is applied to the test sequences by hiding the last
symbol of each test session and using the model to make a
prediction for this trimmed session; the hidden last symbol is
then used to compute the accuracy of the model. If the
prediction model was unable to make a prediction for a
particular web session, it was counted as a wrong prediction.
In the experiments, the proposed method is compared using
prediction-model based evaluation that measures accuracy,
coverage, and model-accuracy [5]. Accuracy is defined as the
ratio of the number of sequences for which the model correctly
predicts the hidden symbol to the total number of sequences
in the test set. The second measure is coverage, defined as the
ratio of the number of sequences for which the model contains
the information required to make a prediction to the total
number of sequences in the test set. The third is model-accuracy;
it is calculated only on the web user sessions for which a
prediction was made. If the prediction model was unable to
make a prediction for a particular web session, that session is
ignored in the model-accuracy calculation.
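These three measures can be sketched in a few lines of Python (our
illustration, reusing the predict_with_default_rule helper above; sessions
of length one are assumed to have been filtered out):

    def evaluate(model, page_freq, test_sessions):
        """Compute accuracy, coverage and model-accuracy on a test set."""
        total = len(test_sessions)
        covered = correct = 0
        for s in test_sessions:
            trimmed, hidden = s[:-1], s[-1]   # hide the last symbol
            pred = predict_with_default_rule(model, page_freq, trimmed)
            if pred is not None:
                covered += 1
                if pred == hidden:
                    correct += 1
        accuracy = correct / total            # unpredicted counts as wrong
        coverage = covered / total
        model_accuracy = correct / covered if covered else 0.0
        return accuracy, coverage, model_accuracy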
The evaluation was performed using 10-fold cross-validation.
The data was split into ten equal sets: nine sets are used as
training data for rule construction, and the remaining set is
used as testing data for evaluation. The test set then rotates
until each of the ten sets has been used once for testing with
the rest for training. The reported results are the averages over
the ten tests. We experimentally evaluated the performance of
the proposed approach with the first-order Markov model and
constructed the predictive model.
Figure 4 shows the percentage of ambiguous rules. It shows
that as the minimum support threshold varies (2-8), ambiguous
rules occur at every support threshold.
Figure 4. Number of ambiguous rules (%) at different minimum
support thresholds.
The results are plotted in Figures 5-7. They show that as the
support threshold increases, the coverage of the model
decreases, accompanied by a decrease in accuracy. However,
the model-accuracy of the model continues to increase.
Figure 5. Comparison of the coverage achieved by the Markov models and
the Markov models using default rules at different minimum support
thresholds.
Figure 6. Comparison of the accuracy achieved by the Markov models and
the Markov models using default rules at different minimum support
thresholds.
Figure 7. Comparison of the model-accuracy achieved by the Markov
models and the Markov models using default rules at different minimum
support thresholds.
As can be seen from Figures 5-7, the observations can be
summarized as follows: Markov models using the default rule
improve the overall accuracy and model-accuracy while
keeping their higher coverage ability.
VI. CONCLUSION
Markov models are a popular approach to predicting which
web pages are likely to be accessed next. Many researchers
have proposed using Markov models to predict a web user's
navigation session, but no solution to the problem of ambiguous
rules had been proposed so far. In this paper, we propose the
use of default rules for resolving ambiguous predictions in
first-order Markov models. This method can provide better web
navigation prediction than merely using the traditional models
individually. It can also be applied to all Markov models,
association rules and the other combined Markov models.
ACKNOWLEDGMENT
The authors gratefully thank the anonymous referees and
collaborators for their substantive suggestions. We also
acknowledge research support from Rangsit University at
Bangkok, Thailand.
REFERENCES
[1] Q. Yang, T. Li and K. Wang, "Building Association-Rule Based
Sequential Classifiers for Web-Document Prediction," Journal of Data
Mining and Knowledge Discovery, Vol. 8, 2004.
[2] F. Khalil, J. Li and H. Wang, "A Framework of Combining Markov Model
with Association Rules for Predicting Web Page Accesses," Conferences in
Research and Practice in Information Technology (CRPIT), Vol. 61, 2006.
[3] S. Chimphlee, N. Salim, M.S.B. Ngadiman, W. Chimphlee and S. Srinoy,
"Predicting Next Page Access by Markov Models and Association Rules
on Web Log Data," The International Journal of Applied Management and
Technology, Vol. 4, 2006.
[4] F. Khalil, J. Li and H. Wang, "Integrating recommendation models for
improved web page prediction accuracy," Thirty-First Australasian
Computer Science Conference (ACSC 2008), 2008.
[5] S. Chimphlee, N. Salim, M.S.B. Ngadiman and W. Chimphlee, "Hybrid
Web Page Prediction Model for Predicting a User's Next Access,"
Information Technology Journal, Vol. 9, No. 4, 2010.
[6] B. Nigam et al., "Mining Association Rules from Web Logs by
Incorporating Structural Knowledge of Website," International Journal of
Computer Applications (0975-8887), Vol. 42, No. 11, pp. 17-23, 2012.
[7] M. Deshpande and G. Karypis, "Selective Markov Models for Predicting
Web Page Accesses," ACM Transactions on Internet Technology, Vol. 4,
No. 2, 2004.
[8] A. Anitha, "A New Web Usage Mining Approach for Next Page Access
Prediction," International Journal of Computer Applications (0975-8887),
Vol. 8, No. 11, 2010.
[9] V.V. Mayil, "Web Navigation Path Pattern Prediction using First Order
Markov Model and Depth first Evaluation," International Journal of
Computer Applications (0975-8887), Vol. 45, No. 16, 2012.
[10] S. Venkateswari and R.M. Suresh, "Association Rule Mining in E-
commerce: A Survey," International Journal of Engineering Science and
Technology (IJEST), Vol. 3, No. 4, 2011.
[11] N.K. Tyagi and A.K. Solanki, "Prediction of Users Behavior through
Correlation Rules," International Journal of Advanced Computer Science
and Applications (IJACSA), Vol. 2, No. 9, 2011.
[12] T. Pamutha, S. Chimphlee, Ch. Kimpan and P. Sanguansat, "Data
Preprocessing on Web Server Log Files for Mining Users Access
Patterns," International Journal of Research and Reviews in Wireless
Communications (IJRRWC), Vol. 2, No. 2, 2012.
AUTHORS PROFILE
Thanakorn Pamutha received the M.Sc. degree in
Computer Science from Prince of Songkla University,
Songkhla, Thailand, in 2000. He has been an assistant
professor at Yala Rajabhat University, Thailand, since
2003. He is now a Ph.D. student in Information
Technology, Faculty of Information Technology, Rangsit
University, Thailand. He is interested in database systems,
artificial intelligence, data mining, and web usage mining.
Siriporn Chimplee received the M.Sc. degree in Applied
Statistics from the National Institute of Development
Administration (NIDA), Thailand, and the Ph.D. degree in
Computer Science from Universiti Teknologi Malaysia,
Malaysia. She is a lecturer at the Computer Science
Department, Faculty of Science and Technology, Suan
Dusit Rajabhat University, Thailand. She is interested in
data mining, web mining, web usage mining, statistics, and soft
computing.
Chom Kimpan received his D.Eng. in Electrical and
Computer Engineering from King Mongkut's Institute of
Technology Ladkrabang, his M.Sc. in Electrical Engineering
from Nihon University, Japan, and his bachelor's degree in
Electrical Engineering from King Mongkut's Institute of
Technology, Thailand. He is now an Associate
Professor at the Department of Informatics, Faculty of
Information Technology, Rangsit University, Thailand. He is interested
in pattern recognition, image retrieval, speech recognition, and swarm
intelligence.
Parinya Sanguansat received the B.Eng., M.Eng. and
Ph.D. degrees in Electrical Engineering from
Chulalongkorn University, Thailand, in 2001, 2004 and
2007, respectively. He is an assistant professor in the
Faculty of Engineering and Technology, Panyapiwat
Institute of Management, Thailand. His research areas are
digital signal processing in pattern recognition, including
on-line handwriting recognition, face recognition and automatic target
recognition.
Multilayer Neural Networks and Nearest Neighbor
Classifier Performances for Image Annotation
Mustapha OUJAOURA
Laboratory of Information
Processing and Telecommunications,
Computer Science Department,
Faculty of Science and Technology,
Sultan Moulay Slimane University,
PO Box. 523, Bni Mellal, Morocco.
Brahim MINAOUI
Laboratory of Information
Processing and Telecommunications,
Computer Science Department,
Faculty of Science and Technology,
Sultan Moulay Slimane University,
PO Box. 523, Bni Mellal, Morocco.
Mohammed FAKIR
Laboratory of Information
Processing and Telecommunications,
Computer Science Department,
Faculty of Science and Technology,
Sultan Moulay Slimane University,
PO Box. 523, Bni Mellal, Morocco.
Abstract—The explosive growth of image data has led to the
research and development of image content searching and
indexing systems. Image annotation systems aim at automatically
annotating an image with controlled keywords that can
be used for indexing and retrieval of images. This paper presents
a comparative evaluation of an image content annotation system
using multilayer neural networks and the nearest
neighbour classifier. Region growing segmentation is used to
separate objects, and Hu moments, Legendre moments and
Zernike moments are used as feature descriptors for
image content characterization and annotation. The ETH-80
image database is used in the experiments. The best
annotation rate is achieved by using Legendre moments as the
feature extraction method and the multilayer neural network as
the classifier.
Keywords—image annotation; region growing segmentation;
multilayer neural network classifier; nearest neighbour classifier;
Zernike moments; Legendre moments; Hu moments; ETH-80
database.
I. INTRODUCTION
As online resources have become a vital part of everyday
life, developing automatic methods for managing large volumes
of digital information is increasingly important. Among these
methods, automatic indexing of multimedia data remains an
important challenge, especially image annotation [1]. Annotated
images play a very important role in information processing.
They are useful for image retrieval based on keywords and
image content [2]. Manual annotation is not only tedious but
also impractical in many cases due to the abundance of
information. Most images are, therefore, available without
adequate annotation. Automatic image content annotation has
become an active research interest [3]. It attempts to explore
the visual characteristics of images and associate them with
image contents and semantics.
Many classifiers have been used for image classification
and annotation without any performance evaluation. In this
paper, we use the neural network classifier for image
annotation, and we compare its performance with the nearest
neighbour classifier. We use the same type of moments as the
feature extraction method for each classifier in order to make
an objective comparison between the performances of the
considered classifiers. To this end, we use a system that
extracts objects from each image and finds the annotation
terms that describe its individual content.
The rest of the paper is organised as follows. Section 2
presents the adopted annotation system, while Section 3
discusses the primordial tasks of any annotation and
recognition system: the image segmentation and feature
extraction problems, together with a brief formulation of Hu,
Legendre and Zernike moments as feature extraction methods.
Section 4 covers image classification and annotation using the
neural network or the nearest neighbour classifier. Finally,
Section 5 presents the experimental annotation results and
compares the performance of each classifier.
II. ANNOTATION SYSTEM
Automatic image content annotation techniques attempt to
explore the visual features of images that describe image
content and associate them with its semantics. It is an
effective technology to improve image indexing and searching
in the large volume of information available in the media. The
algorithms and systems used for image annotation are
commonly divided into these tasks [4]:
- Segmentation and feature extraction;
- Classification and annotation.
Figure 1. The block diagram of the image annotation system.
The annotation system adopted in this work is shown in
Fig. 1. The system has a reference database that contains
keywords and feature descriptors of images that have already
been annotated by experts (manual offline annotation). This
database is used for modelling and training the classifier in
order to choose the appropriate keywords. To achieve this goal,
the input image is first segmented into regions that represent
objects in the image using region growing segmentation. The
feature vectors of each region are then computed and extracted
from the image. Those features are finally fed into the input of
the classifier, which can be the multilayer neural network or
the nearest neighbour classifier, in order to decide and choose
the appropriate keywords for the annotation task. The image
content is thus annotated and the performance of each classifier
can be evaluated.
III. IMAGE SEGMENTATION AND FEATURES EXTRACTION
A feature vector extracted from the entire image loses local
information, so it is necessary to segment an image into
regions or objects of interest and use local characteristics.
Image segmentation is the process of partitioning a digital
image into multiple segments. It is very important in many
image processing applications, and it still remains a challenge
for scientists and researchers; efforts are still being made to
improve segmentation techniques. With the improvement of
computer processing capabilities, several segmentation
techniques are available: thresholding, region growing, active
contours, level sets, etc. [5]. Among these methods, region
growing is well suited because of its simplicity and has been
successfully used several times as a segmentation technique
for digital images.
The regions are iteratively grown by comparing all
unallocated neighbouring pixels to the regions. The difference
between a pixel's intensity and the region's mean is used as a
measure of similarity; it is the predicate that controls the
evolution of the segmentation process. The pixel with the
smallest difference measured this way is allocated to the
respective region. This process continues until all pixels are
allocated to a region.
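To make this predicate concrete, here is a minimal Python sketch of growing
a single region (our illustration, with a hypothetical intensity threshold;
the paper seeds the growth from the image corners):

    import numpy as np

    def region_growing(img, seed, threshold):
        """Grow one region from `seed`; a sketch of the Fig. 2 algorithm."""
        h, w = img.shape
        labels = np.zeros((h, w), dtype=bool)
        labels[seed] = True
        region = [seed]
        mean = float(img[seed])
        frontier = [seed]
        while frontier:
            y, x = frontier.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and not labels[ny, nx]:
                    if abs(float(img[ny, nx]) - mean) < threshold:
                        labels[ny, nx] = True
                        region.append((ny, nx))
                        # Update the running mean of the region.
                        mean += (float(img[ny, nx]) - mean) / len(region)
                        frontier.append((ny, nx))
        return labels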
Figure 2. Region growing segmentation algorithm.
By assuming that the objects are localised at the center of
images, the region growing segmentation is started from the
corners of the image to isolate the objects in the center. The
region growing segmentation algorithm used in this paper is
presented in Fig. 2, and an example of image segmentation is
shown in Fig. 3.
Figure 3. Example of image segmentation result.
After dividing the original image into several distinct
regions that correspond to objects in a scene, the feature vector
can be extracted from each region and can be considered as a
representation of an object in the entire image.
The feature extraction task transforms the rich content and
large input data of images into a reduced representation set of
features in order to decrease the processing time. Not only
does it enhance the retrieval and annotation accuracy, but also
the annotation speed, since a large image database can be
organized according to the classification rule so that search
can be performed efficiently [6].
In the feature extraction method, the representation of the
image content must be invariant to situations such as
translation, rotation and change of scale. This is the reason
that justifies the use of moments as the feature extraction
method from the segmented image.
The use of moments for image analysis and pattern
recognition was inspired by Hu [7] and Alt [10]. In this paper,
the moments used are:
- Hu moments.
- Legendre moments.
- Zernike moments.
A. Hu moments
For a discrete image of M x N pixels with intensity
function f(x, y), Hu [7] defined the following seven moments,
which are invariant to change of scale, translation and
rotation:
$$\phi_1 = \eta_{20} + \eta_{02} \qquad (1)$$

$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \qquad (2)$$
While all the pixels in the image are not visited:
1. Choose an unlabeled pixel p_k;
2. Set the region's mean to the intensity of pixel p_k;
3. Consider the unlabeled neighboring pixels p_kj;
   If (pixel's intensity - region's mean) < threshold:
     a. Assign the pixel p_kj to the region labeled by k;
     b. Update the region's mean and go back to 3;
   Else, set k = k + 1 and go back to 1.
   End If
End While
$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \qquad (3)$$

$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \qquad (4)$$

$$\begin{aligned}
\phi_5 ={}& (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] \\
&+ (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]
\end{aligned} \qquad (5)$$

$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \qquad (6)$$

$$\begin{aligned}
\phi_7 ={}& (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] \\
&- (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]
\end{aligned} \qquad (7)$$

where the normalized central moments are

$$\eta_{pq} = \frac{\displaystyle\sum_{x=0}^{N-1}\sum_{y=0}^{M-1}(x - \bar{x})^p (y - \bar{y})^q f(x,y)}{\left(\displaystyle\sum_{x=0}^{N-1}\sum_{y=0}^{M-1} f(x,y)\right)^{\frac{p+q}{2}+1}} \qquad (8)$$

and the centroid coordinates are

$$\bar{x} = \frac{\sum_{x}\sum_{y} x\, f(x,y)}{\sum_{x}\sum_{y} f(x,y)}, \qquad \bar{y} = \frac{\sum_{x}\sum_{y} y\, f(x,y)}{\sum_{x}\sum_{y} f(x,y)} \qquad (9)$$
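A compact Python sketch of Eqs. (8)-(9) and the first two invariants (our
illustration, not the authors' implementation):

    import numpy as np

    def hu_invariants(img):
        """phi_1 and phi_2 from normalized central moments (Eqs. 1-2, 8-9)."""
        y, x = np.mgrid[:img.shape[0], :img.shape[1]]
        m00 = img.sum()
        xbar, ybar = (x * img).sum() / m00, (y * img).sum() / m00

        def eta(p, q):  # normalized central moment, Eq. (8)
            mu = ((x - xbar) ** p * (y - ybar) ** q * img).sum()
            return mu / m00 ** ((p + q) / 2 + 1)

        phi1 = eta(2, 0) + eta(0, 2)
        phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
        return phi1, phi2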
B. Legendre moments
Legendre moments were first introduced by Teague [8]
and have been used in several pattern recognition applications
[9]. The orthogonality of Legendre polynomials implies no
redundancy or overlap of information between the moments
of different orders. This property enables the contribution of
each moment to be unique and independent of the other
information in an image [10]. The Legendre moments for a
discrete image of M x N pixels with intensity function f(x, y)
are the following [13]:
$$L_{pq} = \lambda_{pq} \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} P_p(x_i)\, P_q(y_j)\, f(x, y) \qquad (10)$$

where $\lambda_{pq} = \frac{(2p+1)(2q+1)}{MN}$, and $x_i$ and $y_j$ denote the
normalized pixel coordinates in the range [-1, +1], given by

$$x_i = \frac{2x - M + 1}{M - 1}, \qquad y_j = \frac{2y - N + 1}{N - 1} \qquad (11)$$
$P_p(x)$ is the $p$-th order Legendre polynomial defined by

$$P_p(x) = \sum_{\substack{k=0 \\ p-k \text{ even}}}^{p} (-1)^{\frac{p-k}{2}}\, \frac{1}{2^p}\, \frac{(p+k)!}{\left(\frac{p-k}{2}\right)!\left(\frac{p+k}{2}\right)!\, k!}\, x^k \qquad (12)$$
And the recurrence formula for the Legendre polynomials is:

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_p(x) = \frac{(2p-1)\, x\, P_{p-1}(x) - (p-1)\, P_{p-2}(x)}{p} \qquad (13)$$
In this work, the recurrence formula is used to calculate the
Legendre polynomials in order to increase the computation
speed.
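A short NumPy sketch of Eqs. (10)-(13) using the recurrence (our
illustration; it assumes the first array axis of the image corresponds
to x):

    import numpy as np

    def legendre_poly(p, x):
        """Legendre polynomials up to order p via the Eq. (13) recurrence."""
        P = [np.ones_like(x), x]
        for n in range(2, p + 1):
            P.append(((2 * n - 1) * x * P[n - 1] - (n - 1) * P[n - 2]) / n)
        return P  # P[k] holds P_k evaluated at every point of x

    def legendre_moment(img, p, q):
        """Legendre moment L_pq of Eq. (10) for an M x N image."""
        M, N = img.shape
        xi = (2 * np.arange(M) - M + 1) / (M - 1)   # Eq. (11)
        yj = (2 * np.arange(N) - N + 1) / (N - 1)
        lam = (2 * p + 1) * (2 * q + 1) / (M * N)
        Px, Qy = legendre_poly(p, xi)[p], legendre_poly(q, yj)[q]
        return lam * np.einsum('i,j,ij->', Px, Qy, img)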
C. Zernike moments
Zernike moments are the projection of an image onto a set
of complex Zernike polynomials. As these Zernike
polynomials are orthogonal to each other, Zernike moments
can represent the properties of an image with no redundancy
or overlap of information between the moments [11]. Due to
these characteristics, Zernike moments have been used as
feature sets in many applications [12].
The discrete form of the Zernike moments of an image of size
M x N represented by f(x, y), expressed over the unit disk
$x^2 + y^2 \le 1$, is as follows [13]:

$$Z_{pq} = \frac{p+1}{\lambda_N} \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} R_{pq}(r_{xy})\, e^{-jq\theta_{xy}}\, f(x, y) \qquad (14)$$

where the radial polynomial is
$$R_{pq}(r) = \sum_{s=0}^{\frac{p-|q|}{2}} (-1)^s\, \frac{(p-s)!}{s!\left(\frac{p+|q|}{2}-s\right)!\left(\frac{p-|q|}{2}-s\right)!}\, r^{p-2s}, \quad p-|q| \text{ even},\ |q| \le p,\ p \ge 0 \qquad (15)$$
$\lambda_N$ is the number of pixels located in the unit circle, and
the transformed phase $\theta_{xy}$ and distance $r_{xy}$ at the pixel of
coordinates (x, y) are [13]:

$$\theta_{xy} = \tan^{-1}\!\left(\frac{(2y - N + 1)/(N - 1)}{(2x - M + 1)/(M - 1)}\right), \qquad r_{xy} = \sqrt{\left(\frac{2x - M + 1}{M - 1}\right)^2 + \left(\frac{2y - N + 1}{N - 1}\right)^2} \qquad (16)$$
Most of the time taken in the computation of Zernike
moments is due to the computation of the radial polynomials.
Therefore, researchers have proposed faster methods that
avoid the factorial terms by using recurrence relations on the
radial polynomials. In this paper, we computed the Zernike
moments using the direct method.
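For illustration, a direct-method NumPy sketch of Eqs. (14)-(16) (ours; it
assumes p - |q| is even and |q| <= p, and that only the magnitudes of the
complex moments would be used as features, as the paper does):

    import math
    import numpy as np

    def radial_poly(p, q, r):
        """Direct evaluation of R_pq(r), Eq. (15)."""
        q = abs(q)
        R = np.zeros_like(r)
        for s in range((p - q) // 2 + 1):
            c = ((-1) ** s * math.factorial(p - s)
                 / (math.factorial(s)
                    * math.factorial((p + q) // 2 - s)
                    * math.factorial((p - q) // 2 - s)))
            R += c * r ** (p - 2 * s)
        return R

    def zernike_moment(img, p, q):
        """Z_pq of Eq. (14) over the pixels inside the unit circle."""
        M, N = img.shape
        x = (2 * np.arange(M) - M + 1) / (M - 1)
        y = (2 * np.arange(N) - N + 1) / (N - 1)
        X, Y = np.meshgrid(x, y, indexing='ij')
        r, theta = np.hypot(X, Y), np.arctan2(Y, X)
        inside = r <= 1.0                 # keep only the unit disk
        lam = inside.sum()                # number of pixels in the circle
        kernel = radial_poly(p, q, r) * np.exp(-1j * q * theta)
        return (p + 1) / lam * (kernel * img * inside).sum()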
IV. IMAGES CLASSIFICATION AND ANNOTATION
Once the features are extracted, a suitable classifier must be
chosen. Many classifiers exist, and each is found suitable for
classifying a particular kind of feature vector depending on its
characteristics.
Some of these classifiers are based on supervised learning,
which requires an intensive learning and training phase for the
classifier parameters (parameters of Support Vector Machines
[14], Boosting [15], parametric generative models [16],
decision trees [17], and Neural Networks [18]); they are also
known as parametric classifiers. Other classifiers base their
classification decision directly on the data and require no
learning or training of parameters; these methods are known
as nonparametric classifiers. The most common nonparametric
classifier, based on distance estimation, is the nearest
neighbour classifier.
Based on their ability to detect complex nonlinear
relationships between dependent and independent variables,
multilayer neural networks are used in this paper to classify
and annotate image content. Their performance is compared to
that of the nearest neighbour classifier.
A. Multilayer Neural Network classifier
A multilayer neural network consists of an input layer
including a set of input nodes, one or more hidden layers of
nodes, and an output layer of nodes [19]. Fig. 4 shows an
example of the three-layer network used in this paper, having
an input layer of m nodes, one hidden layer of 20 nodes, and
an output layer of n nodes. This neural network is trained to
classify inputs according to target classes.
The training input data are loaded from the reference
database, while the target data consist of a vector of all zero
values except for the element i that represents the appropriate
class.
Figure 4. The three layer neural network.
After random initialisation of the biases and connection
weights, the training of the neural network follows a loop of
steps. It starts by propagating the inputs from the input layer
to the output layer. The error is then calculated and
back-propagated through each layer of the neural network in
order to update the biases and connection weights. Once the
biases and connection weights have been changed, the
propagation of the inputs is repeated until the error between
outputs and targets reaches a minimum. This is the stopping
criterion of the neural network training.
The inputs $Y_i$ are presented to the input layer and
propagated to the hidden layer using the following formula:

$$Y_j = f\left(\sum_{i=1}^{m} w_i Y_i + b_i\right) \qquad (17)$$

Then from the hidden layer to the output layer:

$$Y_k = f\left(\sum_{j=1}^{20} w_j Y_j + b_j\right) \qquad (18)$$

Finally, the outputs are:

$$O_k = f\left(\sum_{k=1}^{n} w_k Y_k + b_k\right) \qquad (19)$$

where f is the activation function (hyperbolic tangent sigmoid)
used in the three-layer neural network, defined by:

$$f(x) = \frac{2}{1 + e^{-2x}} - 1 \qquad (20)$$

At the output layer, the error between the desired output $T_k$
and the actual output $O_k$ is calculated by:

$$E_k = \left(1 - O_k^2\right)\left(T_k - O_k\right) \qquad (21)$$

The calculated error is propagated to the hidden layer using
the following formula:

$$E_j = \left(1 - Y_j^2\right)\sum_{k=1}^{n} w_k E_k \qquad (22)$$

Then the back-propagated error from the hidden layer to the
input layer is:

$$E_i = \left(1 - Y_i^2\right)\sum_{j=1}^{20} w_j E_j \qquad (23)$$

The biases and connection weights of the input layer i, the
hidden layer j and the output layer k are adjusted by:

$$\Delta w_l = \alpha\, Y_l E_l, \qquad \Delta b_l = \alpha\, E_l, \qquad \text{for } l = i, j, k \qquad (24)$$

where $\alpha$ denotes the learning rate.
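A minimal NumPy sketch of one training iteration of Eqs. (17)-(24) (our
illustration; weight matrices stand in for the paper's per-node weights):

    import numpy as np

    def tansig(x):
        return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0     # Eq. (20)

    def train_step(x, t, W1, b1, W2, b2, lr=0.01):
        """One forward/backward pass for a 1-hidden-layer network."""
        y_h = tansig(W1 @ x + b1)              # input -> hidden, Eq. (17)
        o = tansig(W2 @ y_h + b2)              # hidden -> output, Eqs. (18)-(19)
        e_o = (1 - o ** 2) * (t - o)           # output error, Eq. (21)
        e_h = (1 - y_h ** 2) * (W2.T @ e_o)    # back-propagated error, Eq. (22)
        W2 += lr * np.outer(e_o, y_h); b2 += lr * e_o   # updates, Eq. (24)
        W1 += lr * np.outer(e_h, x);   b1 += lr * e_h
        return o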
B. Nearest Neighbor classifier
The nearest neighbour classifier compares the feature
vector of the input image with the feature vectors stored in
the database. The class is found by measuring the distance
between the feature vector of the input image and the feature
vectors of the images in the reference database. The Euclidean
distance is used in this paper, but other distance measures can
also be used [14].
Let $X_1, X_2, \ldots, X_k$ be the k class feature vectors in the
database and $X_q$ the feature vector of the query image. The
feature vector with the minimum distance is found to be the
closest matching vector, given by:

$$d(X_q, X_j) = \min_{j \in \{1, 2, \ldots, k\}} \sqrt{\sum_i \left(x_q(i) - x_j(i)\right)^2} \qquad (25)$$
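In code, this rule amounts to a few lines (our sketch; db_features is a
k x d array of reference vectors, db_labels the matching keywords):

    import numpy as np

    def nearest_neighbour(query, db_features, db_labels):
        """Return the keyword of the closest reference vector (Eq. 25)."""
        dists = np.linalg.norm(db_features - query, axis=1)  # Euclidean
        return db_labels[int(np.argmin(dists))]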
C. Image Annotation
For image annotation, low-level feature vectors are
calculated iteratively for each region in the image, using
either Hu moments, Legendre moments or Zernike moments.
These feature vectors are presented either to the nearest
neighbour classifier or to the multilayer neural network that
was already trained.
When the feature vectors are presented to the nearest
neighbour classifier to test matching with the feature values in
the reference database, the label or keyword of the image class
with the minimum distance is selected.
When the feature vectors are fed to the input layer of the
multilayer neural network, each of the input neurons or nodes
corresponds to one element of the feature vector, and the
output neurons of the neural network represent the class labels
of the images to be classified and annotated. Each region is
then annotated with the corresponding label found by the
classifier.
The input layer of the neural network has a variable
number of input nodes: seven input nodes when adopting Hu
moments, nine when adopting Zernike moments, and ten when
using Legendre moments as the feature extraction method.
However, the number of input nodes can be increased when
using Zernike and Legendre moments as the feature extraction
method, in order to increase the accuracy of the annotation
system. The features can also be combined together and fed to
the multilayer neural network or the nearest neighbour
classifier.
V. EXPERIMENTS AND RESULTS
A. Experiments
In our experiments, for each region that represents an object
in the input image, the number of input features extracted
using the Hu invariants is 7 (hu1, hu2, hu3, hu4, hu5, hu6,
hu7), while the number of input features extracted using
Zernike moments up to order 4 is 9 (Z00, Z11, Z20, Z22, Z31,
Z33, Z40, Z42, Z44) and the number of input features
extracted using Legendre moments up to order 3 is 10 (L00,
L01, L02, L03, L10, L11, L12, L20, L21, L30). These inputs
are fed to the multilayer neural network or the nearest
neighbour classifier in order to match them with the feature
values in the reference database. The appropriate keywords are
then selected and used for the annotation of the input image.
Fig. 5 shows some examples of image objects from the
ETH-80 image database used in our experiments. The
experiments are based on eight classes of objects (Apple, Car,
Cow, Cup, Dog, Horse, Pears, and Tomato). The number of
prototypes per class is 5.
Figure 5. Some examples of objects from the ETH-80 database.
The accuracy of the image annotation system is evaluated
by the precision rate, which is the number of correct results
divided by the number of all returned results.
All the experiments are conducted using the ETH-80
database containing a set of 8 different object images [20].
The proposed system has been implemented and tested on a
Core 2 Duo personal computer using Matlab software.
B. Results
For both of the used classifiers (the neural network
classifier and Nearest Neighbour classifier), the annotation
rates of Hu, Zernike and Legendre Moments, as the feature
extraction method, are given for each object in Table 1.
TABLE I. OBJECT ANNOTATION RATES OF HU, ZERNIKE AND LEGENDRE
MOMENTS USING THE NEURAL NETWORK AND NEAREST NEIGHBOUR CLASSIFIERS

            Neural network classifier      Nearest neighbour classifier
Object      Hu       Zernike   Legendre    Hu       Zernike   Legendre
Apple       71.20%   78.40%    92.13%      69.12%   76.84%    90.93%
Car         68.53%   74.37%    83.11%      67.03%   72.87%    81.91%
Cow         41.32%   50.00%    61.37%      39.82%   49.13%    60.10%
Cup         63.27%   71.20%    81.20%      62.07%   70.37%    80.02%
Dog         62.57%   72.30%    80.41%      60.87%   70.93%    79.10%
Horse       48.60%   56.40%    65.28%      46.80%   55.10%    63.80%
Pears       69.29%   78.40%    89.38%      66.80%   76.94%    87.85%
Tomato      67.49%   76.52%    86.40%      65.90%   75.10%    84.70%
Average     61.53%   69.70%    79.91%      59.80%   68.41%    78.55%
The overall annotation rates and error rates of the neural
network classifier and the nearest neighbour classifier based on
Hu moments, Zernike moments and Legendre moments as
feature extraction methods are given in Table II. The
experimental results show that the annotation rate of the
neural network classifier based on Legendre, Zernike and Hu
moments is higher than that of the nearest neighbour classifier
based on the same moments.
TABLE II. ANNOTATION AND ERROR RATES OF HU, ZERNIKE AND LEGENDRE
MOMENTS USING THE NEURAL NETWORK AND NEAREST NEIGHBOUR CLASSIFIERS

Classifier          Feature extraction method   Annotation rate   Error rate
Neural network      Hu moments                  61.53%            38.47%
                    Zernike moments             69.70%            30.30%
                    Legendre moments            79.91%            20.09%
Nearest neighbour   Hu moments                  59.80%            40.20%
                    Zernike moments             68.41%            31.59%
                    Legendre moments            78.55%            21.45%
The best annotation rate is achieved when we use the
Legendre moments as the features extraction method and the
multilayer neural networks as a classifier. However, for the
computation time, the Nearest Neighbour classifier based on
Hu moments is the best.
The annotation rate of Zernike moments is lower than that
of Legendre moments because Zernike moments are computed
only from the pixels of an image that fall inside the unit circle
obtained by the mapping transformation. On the one hand,
pixels outside the unit circle are not considered; on the other
hand, only the absolute values of the extracted complex
Zernike moments are fed into the classifier. These are the main
reasons for the modest results obtained with Zernike moments.
Figure 6. Comparison of general annotation rates.
Fig. 6 compares the general annotation rates for each
feature extraction method and each classifier.
Figure 7. Example of the annotation results using Zernike moment features
and (a) the multilayer neural network classifier (b) the Nearest Neighbour
classifier.
Fig. 7 presents an example of the annotation results
obtained using the presented system with Zernike moments as the
feature extraction method. The Graphical User Interface is
illustrated in Fig. 8.
Figure 8. Graphical User Interface (GUI).
The results are also affected by the accuracy of the image
segmentation method. In most cases it is very difficult to obtain
an ideal automatic segmentation, and this decreases the
annotation rates. Therefore, any annotation attempt must treat
image segmentation as an important step, not only for automatic
image annotation systems, but also for any other system that
requires it. The multilayer neural network classifier based on
Legendre and Zernike moments is very expensive in terms of
feature computation time in addition to the training time of the
classifier, so using them in a real-time, online image annotation
system would be difficult and impracticable.
VI. CONCLUSION
In this paper, we evaluated the image annotation
performance of the neural network classifier and the nearest
neighbour classifier based on Hu, Legendre and Zernike
moments as feature extraction methods. The performance of
each classifier and each feature extraction method has been
experimentally analyzed. The experimental results show that
the proposed image annotation system based on the multilayer
neural network classifier gives the best results for images that
are well and properly segmented. However, the processing
time and the image segmentation remain challenges that need
more attention in order to increase the precision and accuracy
of the image annotation system. The gap between the low-level
features and the semantic content of an image must also be
reduced and considered to improve the accuracy of any image
annotation system. Other image segmentation methods and
other classifiers may be considered in future work on other
image databases.
REFERENCES
[1] N. Vasconcelos and M. Kunt, Content-Based Retrieval from Image
Databases: Current Solutions and Future Directions, Proc. Intl Conf.
Image Processing, 2001.
[2] A. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-
Based Image Retrieval: The End of the Early Years, IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-
1380, Dec. 2000.
[3] Lei Ye, Philip Ogunbona and Jianqiang Wang, Image Content
Annotation Based on Visual Features, Proceedings of the Eighth IEEE
International Symposium on Multimedia (ISM'06), IEEE computer
society, San Diego, USA, 11-13 December 2006.
[4] Ryszard S. Choras, Image Feature Extraction Techniques and Their
Applications for CBIR and Biometrics Systems, International Journal Of
Biology And Biomedical Engineering, Issue 1, Vol. 1, pp. 6-16, 2007.
[5] Frank Y. Shih, Shouxian Cheng, Automatic seeded region growing for
color image segmentation, Image and Vision Computing 23, pp. 877-
886, 2005.
[6] Zijun Yang and C.-C. Jay Kuo, Survey on Image Content Analysis,
Indexing, and Retrieval Techniques and Status Report of MPEG-7,
Tamkang Journal of Science and Engineering, Vol. 2, No. 3, pp. 101-
118, 1999.
[7] M. K. Hu, Visual Pattern Recognition by Moment Invariants, IRE
Trans. Inform. Theory, Vol. IT-8, pp. 179-187, Feb. 1962.
[8] M.R. Teague, Image analysis via the general theory of moments, J. Opt.
Soc. Amer. 70, pp. 920-930, 1980.
[9] Chee-Way Chong, P. Raveendran and R. Mukundan, Translation and
scale invariants of Legendre moments, Pattern Recognition 37, pp. 119-
129, 2004.
[10] F. L. Alt, Digital Pattern Recognition by Moments, J. Assoc. Computing
Machinery, Vol. 9, pp. 240-258, 1962.
[11] Sun-Kyoo Hwang, Whoi-Yul Kim, A novel approach to the fast
computation of Zernike moments, Pattern Recognition 39, pp. 2065-
2076, 2006.
[12] R. Sinan Tumen, M. Emre Acer and T. Metin Sezgin, Feature Extraction
and Classifier Combination for Image-based Sketch Recognition,
Eurographics Symposium on Sketch-Based Interfaces and Modeling, pp.
1-8, 2010.
[13] Mustapha Oujaoura, Brahim Minaoui and Mohammed Fakir. Article:
Image Annotation using Moments and Multilayer Neural Networks.
IJCA Special Issue on Software Engineering, Databases and Expert
Systems SEDEX(1):46-55, September 2012. Published by Foundation of
Computer Science, New York, USA.
[14] Oren Boiman, Eli Shechtman and Michal Irani, In Defense of Nearest-
Neighbor Based Image Classification, IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), June 2008.
[15] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and
boosting for generic object detection and recognition, In ECCV, 2004.
[16] G. Wang, Y. Zhang, and L. Fei-Fei. Using dependent regions for object
categorization in a generative framework. In CVPR, 2006.
[17] A. Bosch, A. Zisserman, and X. Munoz. Image classification using
random forests and ferns. In ICCV, 2007.
[18] P. Simard, D. Steinkraus, J. C. Platt, Best Practices for Convolutional
Neural Networks Applied to Visual Document Analysis, ICDAR, 2003,
pp. 958-962.
[19] S. Haykin, Neural networks: a comprehensive foundation, 2nd Edition,
Prentice Hall, New Jersey, 1999.
[20] ETH-80 database image. [Online]. Available: http://www.d2.mpi-
inf.mpg.de/Datasets/ETH80
Minimization of Call Blocking Probability using
Mobile Node velocity
Suman Kumar Sikdar
Department of Computer Science & Engineering, University of Kalyani,
Kalyani-741235, India
Uttam Kumar Kundu
M.Tech, Department of Electronics & Communication Engineering,
W.B. University of Technology, India
Debabrata Sarddar (Asst. Professor)
Department of Computer Science & Engineering, University of Kalyani,
Kalyani-741235
Abstract: Due to rapid growth in IEEE 802.11 based Wireless
Local Area Networks (WLAN), handoff has become a burning
issue. A mobile Node (MN) requires handoff when it travels out
of the coverage area of its current access point (AP) and tries to
associate with another AP. But handoff delays provide a serious
barrier for such services to be made available to mobile
platforms. Throughout the last few years there has been plenty of
research aimed towards reducing the handoff delay incurred in
the various levels of wireless communication. In this article, we
propose a received signal strength measurement based handoff
technique to improve handoff probability. By calculating the
speed of MN (Mobile Node) and signaling delay information we
try to take the right decision of handoff initiation time. Our
theoretical analysis and simulation results show that by taking
the proper decision for handoff we can effectively reduce false
handoff initiation probability and unnecessary traffic load
causing packet loss and call blocking.
Keywords: Next Generation Wireless Systems (NGWS); Handoff;
BS (Base Station); MN (Mobile Node); RSS (Received Signal
Strength); IEEE 802.11
I. INTRODUCTION
Handoff has become an essential criterion in mobile
communication systems, especially in urban areas, owing to the
limited coverage area of Access Points (AP). Whenever an MN
moves from its current AP to a new AP, it requires a handoff. For
successful implementation of seamless Voice over IP
communications, the handoff latency should not exceed 50ms.
But measurements indicate MAC-layer handoff latencies in the
range of 400ms, which is completely unacceptable and must be
reduced for wireless networking to fulfil its potential.
With the advent of real time applications, the latency and
packet loss caused by mobility became an important issue in
Mobile Networks. The most relevant topic of discussion is to
reduce the IEEE 802.11 link-layer handoff latency. IEEE
802.11 MAC specification [1] defines two operation modes:
ad hoc and infrastructure mode. In the ad hoc mode, two or
more stations (STAs) recognize each other through beacons
and hence establish a peer-to-peer relationship. In
infrastructure mode, an AP provides network connectivity to
its associated STAs to form a Basic Service Set (BSS).
Multiple APs form an Extended Service Set (ESS) that
constitutes a single wireless network.
A. Channel distribution
IEEE 802.11b and IEEE 802.11g operate in the 2.4GHz
ISM band and use 11 of the maximum 14 channels available,
and are hence compatible due to their use of the same frequency
channels. The channels (numbered 1 to 14) are spaced 5MHz
apart with a bandwidth of 22MHz, 11MHz above and below the
centre of the channel. In addition there is a guard band of
1MHz at the base to accommodate out-of-band emissions
below 2.4GHz. Thus a transmitter set to channel 1 transmits its
signal from 2.401GHz to 2.423GHz, and so on, giving the
standard channel frequency distribution shown in Fig. 2.
Figure 2. Standard channel frequency distribution.
It should be noted that, due to the overlapping of frequencies,
there can be significant interference between adjacent APs.
Thus, in a well-configured network, most of the APs will
operate on the non-overlapping channels numbered 1, 6 and
11.
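The channel arithmetic above is easy to check in code. The sketch below uses the 5 MHz spacing and 22 MHz bandwidth stated in the text (restricted to channels 1-13, since channel 14 is offset differently in some regulatory domains); the helper names are hypothetical.

def channel_span_ghz(ch: int) -> tuple[float, float]:
    """Lower and upper band edges of 2.4 GHz channel ch (1..13)."""
    centre = 2.407 + 0.005 * ch            # 5 MHz spacing; channel 1 centred at 2.412 GHz
    return centre - 0.011, centre + 0.011  # 22 MHz bandwidth: 11 MHz either side

def channels_overlap(a: int, b: int) -> bool:
    a_lo, a_hi = channel_span_ghz(a)
    b_lo, b_hi = channel_span_ghz(b)
    return a_lo < b_hi and b_lo < a_hi

print(channel_span_ghz(1))     # (2.401, 2.423), as in the text
print(channels_overlap(1, 2))  # True: adjacent channels interfere
print(channels_overlap(1, 6))  # False: 1, 6 and 11 are the non-overlapping set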
B. Handoff:
When an MS moves out of reach of its current AP, it must be
reconnected to a new AP to continue its operation. The search
for a new AP and the subsequent registration under it constitute
the handoff process, which takes enough time (called the
handoff latency) to interfere with the proper functioning of many
applications.
Figure 2. Handoff process.
Three strategies have been proposed to detect the need for
handoff [1]:
1) Mobile-controlled handoff (MCHO): The mobile
station (MS) continuously monitors the signals of the
surrounding base stations (BS) and initiates the handoff
process when some handoff criteria are met.
2) Network-controlled handoff (NCHO): The surrounding
BSs measure the signal from the MS, and the network initiates
the handoff process when some handoff criteria are met.
3) Mobile-assisted handoff (MAHO): The network asks the
MS to measure the signal from the surrounding BSs and makes
the handoff decision based on the reports from the MS.
Handoff can be of many types.
Hard & soft handoff: Originally hard handoff was used,
in which a station must break its connection with the old AP
before joining the new AP, resulting in large handoff delays.
In soft handoff, however, the old connection is maintained
until a new one is established, which significantly reduces
packet loss.
The handoff procedure consists of three logical phases,
during which all communication between the mobile station
undergoing handoff and the APs concerned is controlled by
IEEE 802.11 management frames, as shown in Fig. 3.
C. Scanning:
When a mobile station is moving away from its current
AP, it initiates the handoff process when the received signal
strength and the signal-to-noise ratio have decreased
significantly. The STA then begins MAC-layer scanning to
find new APs. It can either opt for a passive scan (where it
listens for beacon frames periodically sent out by APs) or
choose a faster active scanning mechanism, wherein it regularly
sends out probe request frames and waits for responses for
Tmin (min channel time), continuing to scan until Tmax (max
channel time) if at least one response has been heard within
Tmin. Thus it takes between n*Tmin and n*Tmax to scan n
channels. The information gathered is then processed so that
the STA can decide which AP to join next. The total time
required until this point constitutes 90% of the handoff delay.
Figure 3. Management frame exchange between the STA and the APs
during handoff: Probe Request/Response, Authentication
Request/Response and Re-association Request/Response.
D. Authentication:
Authentication is necessary to associate the link with the
new AP. Authentication must either immediately precede
association or immediately follow a channel scan cycle.
In pre-authentication schemes, the MN authenticates with the
new AP immediately after the scan cycle finishes. IEEE
802.11 defines two subtypes of authentication service: Open
System, which is a null authentication algorithm, and Shared
Key, which is a four-way authentication mechanism. If Inter
Access Point Protocol (IAPP) is used, only null authentication
frames need to be exchanged in the re-authentication phase.
Exchanging null authentication frames takes about 1-2 ms.
E. Re-Association:
Re-association is the process of transferring an association
from the old AP to the new one. Once the STA has been
authenticated with the new AP, re-association can be started.
Previous work has shown the re-association delay to be around
1-2 ms. The range of the scanning delay is given by:
N * Tmin <= Tscan <= N * Tmax,
where N is the total number of channels according to the
spectrum released by a country, Tmin is the min channel time,
Tscan is the total measured scanning delay, and Tmax is the
max channel time. Here we focus on reducing the scanning
delay by minimizing the total number of scans performed.
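A worked instance of this bound; the timer values are illustrative assumptions, since the standard leaves Tmin and Tmax implementation-dependent:

def scan_delay_bounds_ms(n_channels: int, t_min_ms: float, t_max_ms: float):
    """Bounds on the scanning delay from N*Tmin <= Tscan <= N*Tmax."""
    return n_channels * t_min_ms, n_channels * t_max_ms

# 11 usable channels with assumed Tmin = 5 ms and Tmax = 11 ms:
lo, hi = scan_delay_bounds_ms(11, 5.0, 11.0)
print(f"Tscan lies between {lo:.0f} ms and {hi:.0f} ms")  # 55 ms and 121 ms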
The rest of this paper is organized as follows: Section 2
describes the related work, Section 3 describes our proposed
approach, the simulation results are given in Section 4, and
concluding remarks are presented in Section 5.
II. RELATED WORKS
A great deal of research has been dedicated to enhancing the
performance of handover in Next Generation heterogeneous
Wireless Networks. Recently a number of cross-layer
protocols and algorithms have been proposed to support
seamless handoff in NGWS.
A number of different schemes have been proposed to
reduce handoff latency in IEEE 802.11 wireless LANs. IEEE
802.11b based wireless and mobile networks [5], commercially
known as Wi-Fi, are experiencing a very fast growth
upsurge and are being widely deployed to provide a variety
of services, as they are cheap and allow anytime, anywhere
access to network data.
The new-age applications require seamless handover,
while the small coverage of individual APs has increased the
number of handoffs taking place. Thus reducing the handoff
latency has become a burning issue, and much work has been
done to achieve this; see [6] for an overall review of popular
methods.
Shin et al. [7] introduced a selective scanning algorithm
that uses a channel masking technique coupled with a caching
mechanism to significantly reduce the handoff delay. However,
it still scans excess APs even after the new AP may already
have been found, and thus leaves room for further
improvement.
Handoff, an inherent problem with wireless networks,
particularly for real-time applications, has not been well
addressed in IEEE 802.11, which takes a hard handoff approach [8].
In [9] the authors introduced a novel caching process
using neighbor graphs, pre-scanning neighbor APs to collect
their respective channel information. The concept of neighbor
graphs can be utilized in different ways and has become very
popular in this field. In [10] a pre-authentication mechanism is
introduced to facilitate seamless handover. [11] presents a novel
approach towards reducing handover latency in AP-dense
networks.
In [12] a cross-layer handoff management protocol has
been proposed. The authors tried to enhance the handoff
performance by analyzing the speed of the mobile node, the
handoff signaling delay, the relative signal strengths of the old
and new base stations, and their relation to the handoff failure
probability.
A novel mobility management system is proposed in [13]
for vertical handoff between WWAN and WLAN. The system
integrates a connection manager (CM) that intelligently detects
changes in the wireless network and a virtual connectivity
manager (VCM) that maintains connectivity using the
end-to-end principle.
The authors of [14] propose solutions towards enabling and
supporting all types of mobility in heterogeneous networks.
The proposed approach does not support real-time applications
through the network mobility functionality; this keeps the
application unaware of network mobility and works as a
backup for real-time applications.
Handoff using the received signal strength (RSS) of the BS
has also been proposed to reduce handoff latency in NGWS.
In [15], the authors proposed a handoff algorithm in which
the received pilot signal strength is averaged to diminish the
undesirable effect of the fast-fading component. Unfortunately,
the averaging process can substantially alter the characteristics
of the path loss and shadowing components, causing increased
handoff delay.
In [16], a handoff algorithm using multi-level thresholds is
proposed. The performance results obtained show that an 8-
level threshold algorithm performs better than a single-
threshold algorithm in terms of forced termination and call
blocking probabilities.
In [17] the signal-to-interference ratio between the old base
station and the neighboring base stations is calculated to make
the handoff initiation decision for next generation wireless
systems or 4G networks.
In [23], a handoff algorithm using a distance measurement
method is used, where the distance of the MN from each BS is
calculated and, by comparing these distances, the best neighbor
AP can be found.
In [24], a handoff algorithm using a cell sectoring method is
used, where the hexagonal cell is divided into 3 sectors and each
sector is assigned two neighbor APs.
In [25], a handoff algorithm using a prescanning technique is
used, where neighbor graphs are used to decide the handoff.
In [26], the carrier-to-interference ratio is calculated to find the
best neighbor AP to perform the handoff.
A new handoff scheme is proposed in [27], where a curve-
fitting equation is used to predict the direction of motion of the
MN and thus perform the handoff, reducing the scanning
time by a considerable amount.
In [28] and [29], two schemes are proposed to reduce the
handoff failure probability by introducing a new cell coverage
area in the handoff region and by using different channel
allocation techniques.
In [30] a vector analysis method is used to reduce the
handoff latency, where the velocity of the MN is treated as a
vector quantity.
Another new handoff scheme is proposed in [31], where a
tangent analysis method is used to predict the direction of
motion of the MN and thus perform the handoff, reducing the
scanning time by a considerable amount.
III. PROPOSED WORK
In this paper we propose a handoff management protocol
to support seamless handoff in Next Generation Wireless
Systems. We consider the mobile node's speed, the relative
signal strength of the base station, the handoff signaling delay
information and a threshold distance from the cell boundary to
reduce the false handoff initiation probability, which creates
unnecessary traffic load and sometimes call blocking.
For our proposed work we model the coverage area of
the base stations (BS) as regular hexagonal cells. We take two
base stations into account to explain our proposed approach:
the OBS, where the call originates, and the NBS, the next
destination of the MN. When the MN tends to move out of the
coverage area of the OBS, it needs a handoff to the NBS to
continue the call. Fig. 4 depicts the handoff scenario between
the OBS and the NBS, and the notation used there is introduced
below. The proposed protocol comprises four steps:
1) Speed Estimation; 2) Threshold Distance Measurement;
3) RSS Measurement; 4) Handoff Management.
B. Speed Estimation:
The speed of the MN is estimated by using the Doppler
effect. Let fr denote the received signal frequency, fc the
carrier signal frequency, and v the speed of the MN; the
Doppler relation then gives
fr = fc (1 - (v/c) cos θ), (1)
where c is the speed of light and θ is the angle between the
direction of motion of the MN and the outward radial direction
from the BS.
Figure 4. Handoff scenario between the OBS and the NBS.
This equation helps to estimate the speed of the MN.
When the MN moves out of the coverage area of the OBS in
the radially outward direction, cos θ = 1, and thus
v = c (1 - fr/fc). (2)
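Under this reading of (1) and (2), the speed of a radially departing MN follows directly from the measured frequency shift. A minimal sketch, where the 160 Hz shift on a 2.4 GHz carrier is an invented example value:

C = 3.0e8  # speed of light in m/s

def mn_speed_mps(f_carrier_hz: float, f_received_hz: float) -> float:
    """Speed of an MN moving radially away from the BS, from
    f_r = f_c * (1 - v/c)  =>  v = c * (1 - f_r / f_c)."""
    return C * (1.0 - f_received_hz / f_carrier_hz)

v = mn_speed_mps(2.4e9, 2.4e9 - 160.0)
print(f"estimated MN speed: {v:.1f} m/s")  # 20.0 m/s (72 km/h)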
Figure 5
C. Measurement of threshold distance d from the cell
(OBS) boundary:
Case 1: Let τ be the handoff signaling delay. When the MN
moves out of the coverage area of the OBS in the radially
outward direction (Fig. 5), the MN needs to initiate HMIP
registration at a distance d from the cell boundary for a
successful handoff; otherwise the MN does not get enough
time to complete the total handoff procedure. So for a
successful handoff (handoff success probability maximum)
in this case
d ≥ v τ. (3)
Case 2: When the MN reaches the point P, the RSS starts to
drop below the threshold in the overlap region of the OBS and
the NBS, so the MN can initiate handoff without any hesitation
from there. In this case PA = PB, and for handoff
... (4)
In this case the handoff failure probability is maximum.
Combining (3) and (4) we get
... (5)
For this condition the handoff success probability lies between
zero and one.
D. RSS and Threshold Measurement:
The OBS antenna transmits a signal which gets weaker as
the MN moves away from it. The transmitted signal power is
maximum at the centre of the cell and gradually decreases as
the distance from the centre increases. The received signal
strength (RSS) at a distance r is given by
Pr = Pt (λ / 4πr)², (6)
where λ is the wavelength of the transmitted signal and Pt is
the transmitting power. The threshold value of the received
signal strength is
RSSth = Pt (λ / 4π(R - d))², (7)
where R is the cell radius and d is the threshold distance from
the boundary. We take this value because we calculated the
limiting value for d previously; beyond this distance the MN
can start handoff initiation. After call setup the MN periodically
checks the RSS; when the RSS drops below the calculated
threshold value, the MN tries to initiate HMIP registration.
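Combining (3), (6) and (7), the initiation test can be sketched as follows. This assumes free-space propagation and our reading that the threshold RSS is evaluated at distance R - d from the BS; every parameter value is a placeholder.

import math

def rss_watts(p_tx_w: float, wavelength_m: float, r_m: float) -> float:
    """Free-space received power, eq. (6): Pr = Pt * (lambda / (4*pi*r))^2."""
    return p_tx_w * (wavelength_m / (4 * math.pi * r_m)) ** 2

def should_initiate_handoff(rss_now_w: float, p_tx_w: float, wavelength_m: float,
                            cell_radius_m: float, speed_mps: float,
                            signaling_delay_s: float) -> bool:
    """Trigger HMIP registration once the RSS falls below the threshold taken
    at distance R - d from the BS, with d = v * tau from eq. (3)."""
    d = speed_mps * signaling_delay_s                        # threshold distance
    threshold = rss_watts(p_tx_w, wavelength_m, cell_radius_m - d)
    return rss_now_w <= threshold

# Placeholder numbers: 1 W transmitter at 2.4 GHz (lambda ~ 0.125 m),
# 500 m cell, MN at 20 m/s, 1 s signaling delay.
print(should_initiate_handoff(rss_now_w=1e-9, p_tx_w=1.0, wavelength_m=0.125,
                              cell_radius_m=500.0, speed_mps=20.0,
                              signaling_delay_s=1.0))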
Wave propagation in a multipath channel depends on the
actual environment, including factors such as the antenna
height, the profiles of buildings and roads, and geo-
morphological conditions (hills, terrain), etc. These factors
cause propagation loss in the channel; hence the received
signal strength may be weaker than the free-space value. The
propagation loss L is characterized by three aspects: path loss,
slow fading (shadowing) and fast fading. In a typical urban
area the propagation loss is measured as in [11].
E. Handoff Management:
Handoff Initiation: When the MN is about to move out of its
old BS, its first challenge is to estimate the right time to initiate
the HMIP registration. The handoff trigger unit (Fig. 6) uses
the estimated speed of the MN, its direction of motion and the
signaling delay information to determine the threshold RSS
(
environment, if the pheromone concentration of the eight grids
is the same, the ant will move towards the original orientation.
Due to the grid spaces of high pheromone concentration, the
random movement of ants is avoided, which accelerates the
convergence of the algorithm. The state of an individual ant
is expressed by a phase variable containing its position r
and orientation.
Ghosh [6] proposed a method based on aggregation pheromone.
The ants move with the aim of creating homogeneous groups.
The amount of movement of an ant towards a point is governed
by the intensity of the aggregation pheromone deposited by all
other ants at that point, and this gradual movement of ants in
due course results in the formation of groups or clusters. The
amount of pheromone deposited by an ant at a particular
location depends on its distance from that location: the smaller
the distance, the greater the deposition. The aggregation
pheromone density at a particular location thus depends on the
number of ants in its close proximity: the more ants nearby,
the higher the density. The ants are allowed to move in the
search space to find the points with higher pheromone density,
and the movement of an ant is governed by the amount of
pheromone deposited at different points of the search space:
the more pheromone deposited, the greater the aggregation of
ants. This leads to the formation of homogeneous groups or
clusters. The number of clusters so formed might exceed the
desired number, so in a second step an agglomerative
average-linkage algorithm is applied to obtain the desired
number of clusters.
Liu [7] proposed an incremental method where each agent
computes the information entropy or pheromone concentration
of the area surrounding it, and clusters objects by picking them
up, dropping them and moving.
Ramos [9] modified ant-based clustering by changing
the movement paradigm: his ants move according to a
trail of pheromones left on clustering formations. This
reduces the exploration of empty areas, where the pheromone
would eventually evaporate. This algorithm was called
ACLUSTER.
Chircop [5] proposed the Multiple Pheromone Algorithm for
Cluster Analysis. Ants detect individual features of objects in
space and deposit pheromone traces that guide towards these
features. Each ant starts off by looking for a particular feature
but they can combine with ants looking for other features if the
match of their paths is above a given threshold. This enables
ants to detect and deposit pheromone corresponding to feature
combinations and provides the colony with more powerful
cluster analysis and classification tools.
IV. PROPOSED METHOD
The existing methods for ant-based clustering are mainly
based on memory: whatever an ant finds in its workspace it
stores in memory, and the pick/drop decisions are based on a
probability calculated from that memory. A few methods have
used the pheromone concept from real ants to turn the random
walk into a walk biased by pheromone.
The proposed method uses a random walk for movement
and the pheromone concept to avoid the memory. Accordingly,
the ants move randomly in the workspace, a 2-D grid, but the
pick and drop actions depend on the pheromone on a particular
cell. The pheromone on a particular cell depends on the object
on that cell and the objects lying on its neighbor cells; here we
consider 9 cells in total, a cell and its 8 neighbor cells.
The pheromone on a particular cell is incremented by a
fixed amount for every similar object in the surrounding cells
and decremented by the same amount for every dissimilar
object. Empty cells do not contribute to the pheromone, and no
empty cell carries any pheromone. Pick/drop actions are
performed according to a threshold that changes its value as
the clusters are constructed. An ant that encounters an object on
a cell picks it up if the pheromone on that cell is less than the
threshold. If a loaded ant encounters an object of the same kind
as its load, it drops the object in the neighborhood if the
pheromone on the cell is greater than or equal to the threshold.
The clustering process, presented below, consists of three steps:
A. Initial pheromone laying:
This is the initialization step. Every location (i, j) with an
object on the grid is assigned a pheromone value τij based on
its surroundings. Let Δτ be the amount of pheromone change.
The presence of a similar object in the surroundings increases
the pheromone trail on the location by Δτ, and a dissimilar
object decreases the trail by Δτ.
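A minimal sketch of this initialization, assuming a grid where grid[i][j] holds an object label or None and similar(a, b) is a user-supplied predicate; the magnitude of the pheromone change DELTA is arbitrary here:

import numpy as np

DELTA = 1.0  # assumed amount of pheromone change per neighboring object

def initial_pheromone(grid, similar) -> np.ndarray:
    """Step A: assign each occupied cell a pheromone value based on its
    8-neighborhood: +DELTA per similar object, -DELTA per dissimilar one.
    Empty cells carry no pheromone."""
    rows, cols = len(grid), len(grid[0])
    tau = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] is None:
                continue                       # empty cells contribute nothing
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di, dj) != (0, 0) and 0 <= ni < rows and 0 <= nj < cols \
                            and grid[ni][nj] is not None:
                        tau[i, j] += DELTA if similar(grid[i][j], grid[ni][nj]) else -DELTA
    return tau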
B. Cluster construction:
Ants move randomly on the grid. If an unloaded ant meets
an object and finds the pheromone on that location below the
threshold value, it picks the object up. If a loaded ant comes to
a location with a pheromone value greater than the threshold
and its load matches the object on that location, it drops its
load in the neighborhood of that location with a drop
probability.
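The pick/drop rule of this step reduces to a small decision function, sketched below under the same assumptions; threshold adaptation and the drop probability are left out for brevity:

def pick_or_drop(ant_load, cell_obj, tau_ij: float, threshold: float, similar) -> str:
    """Step B decision at one grid location: pick when an unloaded ant finds
    an object on a weakly marked cell; drop when a loaded ant finds a
    matching object on a strongly marked cell; otherwise keep walking."""
    if ant_load is None and cell_obj is not None and tau_ij < threshold:
        return "pick"
    if ant_load is not None and cell_obj is not None \
            and similar(ant_load, cell_obj) and tau_ij >= threshold:
        return "drop"  # dropped in the neighborhood of this location
    return "move"      # continue the random walk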
C. Pheromone updation:
On a pick/drop action, the pheromone on that location and
the surrounding locations will be updated. On Pickup, the
pheromone in the surrounding cells containing similar objects
will be decreased and that in cells containing dissimilar objects
will be increased. On Drop,