Ee 2018
Abstract—Natural language processing is a widely used technique by which systems can understand instructions for manipulating text or speech. In the present paper, a text-to-speech (TTS) synthesizer is developed that converts text into spoken words by analysing and processing it using Natural Language Processing (NLP) and then using Digital Signal Processing (DSP) technology to convert the processed text into a synthesized speech representation of the text. The synthesizer takes the form of a simple application that converts input text into synthesized speech, reads it out to the user, and allows the result to be saved as an mp3 file.
Keywords—Text-to-speech; Natural language processing; Speech synthesizer; Speech recognition; Signal transformation
I. INTRODUCTION
Speech is the primary mode of communication in a Human Intelligent System (HIS), where NLP plays a role in the many aspects of the field that deal with the linguistic nature of computation. NLP is an area of research and application that explains how a system (mainly a computer) can be used to understand, identify and manipulate a natural language. TTS is an automatic conversion that draws on the concepts of speech recognition, speech analysis, speech synthesis, speech tuning, speech alteration, etc. Here TTS is used to convert a text into speech that resembles, as closely as possible, a native speaker of the language reading that text. TTS is the technology by which a computer can speak to the user and deliver the computed information. A TTS system acquires text as input; a computer algorithm called the TTS engine then analyses the text, pre-processes it and synthesizes the speech using mathematical models. The TTS engine usually generates sound data in an audio format as its output. This TTS system also works upon a Natural Language Generator (NLG).
Kanisha [1] explained an innovative way for STS for visually impaired people through voice signal. An optical character recognition process is introduced into a TTS system by Thu [2] with the intention of developing an image-to-speech conversion system. The pros and cons of interactive voice response systems, which are used in different day-to-day applications, are reviewed by researchers [3]. Different methods are also compared [4] for online systems in terms of user efficiency. The concatenation method is further developed for a local language [5] with better accuracy, as an extension of the work of Kanisha [1]. Different speech synthesis processes are also discussed by Htun and co-workers [6]. Text inside an image has recently been tracked by camera and converted to speech by Patil et al. [7] for multiple languages. It is pointed out that this method is very useful for blind persons for detecting currency notes [8]. A finger-mounted camera provides great relief in these cases [9].
TTS synthesis is a procedure in which the text is first analysed and the speech waveform is then generated. It converts the phonetic and prosodic information into a waveform based upon an approximation formula. The amplitude of each signal that forms the speech waves is measured to create the proper speech. Those speech waves are a linguistic representation of the text, although such forms can be either linguistic or non-linguistic in nature. In the present paper, a synthesizer is developed which, apart from TTS conversion, saves the output as a file in mp3 format.
II. OVERVIEW OF SPEECH SYNTHESIS
Speech synthesis is the artificial production of the human voice by computation. A TTS system converts any text that follows the grammar of a language into speech. Synthesized speech is a collection of small pieces of recorded speech which are stored in a knowledge base (KB). The KB differs in size depending on the stored speech units. The system also maintains speech quality through its algorithm, which analyses the tree of speech units for better clarity. Alternatively, a synthesizer can be proper in
The application is divided into two main modules. The main application module includes the basic GUI components, which handle the basic operations of the application such as the input of parameters for conversion, either via a file or by direct keyboard input.
The second module, the main conversion engine, is integrated into the main module; it accepts the data and performs the conversion.
TTS Gramaty (TTSG) converts text to speech either when the text is typed into the text field provided, or when it is copied from an external document on the local machine and pasted into the text field of the application. It also provides functionality that allows the user to browse for and open a text document on the machine. TTSG then loads the document’s text into the text area of the application and the reading procedure starts automatically.
TTSG contains an exceptional function that gives the user the choice of saving the already converted text to any location on the local machine in an audio format; this allows the user to copy the audio file to any of his/her audio devices and henceforth treat it as an audio book.
The following figure depicts the loading procedure of TTS Gramaty.
Fig. 4. Import of file using TTS Gramaty
Let’s see how the import occurs using the browse button:
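As an illustrative aside, the workflow described for the application (text accepted from a file or from direct input, passed to a conversion engine, and saved to an audio file) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and not the implementation of TTS Gramaty: the names load_text, placeholder_engine, save_audio and convert are hypothetical, the sample rate is an assumed value, and the engine is a stand-in that emits one short tone per character rather than real speech. Because the Python standard library has no mp3 encoder, the sketch writes a WAV file instead.

```python
import math
import struct
import wave
from pathlib import Path

SAMPLE_RATE = 16000  # samples per second (assumed value, not from the paper)


def load_text(path):
    """Input side: read the text of a document chosen by the user."""
    return Path(path).read_text(encoding="utf-8")


def placeholder_engine(text, unit_seconds=0.05):
    """Stand-in for the conversion engine. This is NOT speech synthesis.

    Each non-space character becomes a short sine tone and each whitespace
    character a short silence, so the rest of the workflow has samples to
    handle. A real TTS engine would replace this function.
    """
    n = int(SAMPLE_RATE * unit_seconds)  # samples per character
    samples = []
    for ch in text:
        if ch.isspace():
            samples.extend([0] * n)
        else:
            freq = 200 + (ord(ch) % 40) * 10  # arbitrary pitch per character
            samples.extend(
                int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
                for i in range(n)
            )
    return samples


def save_audio(samples, path):
    """Save 16-bit mono PCM samples as a WAV file (mp3 needs an external encoder)."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


def convert(source, out_path, from_file=False):
    """Main module: accept typed text or a file path, convert it, and save it."""
    text = load_text(source) if from_file else source
    samples = placeholder_engine(text)
    save_audio(samples, out_path)
    return len(samples)
```

Calling convert("Hello", "hello.wav") handles typed text, while convert("notes.txt", "notes.wav", from_file=True) mirrors the browse-and-open path; both file names are placeholders. Keeping the engine behind a single function reflects the two-module split described above, since the input handling does not need to change when the engine does.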