Automatic Speech Recognition Using Deep Neural Networks
1st Shivam Kushwaha, Computer Science Engineering, Chandigarh University, Mohali, India, [email protected]
2nd Piyush Deep, Computer Science Engineering, Chandigarh University, Mohali, India, [email protected]
3rd Mohd Muaz, Computer Science Engineering, Chandigarh University, Mohali, India, [email protected]
4th Er. Shafalii Sharma, Computer Science Engineering, Chandigarh University, Mohali, India, [email protected]

Abstract - This paper surveys the evolution and present state of Automatic Speech Recognition (ASR) systems built on Deep Neural Networks (DNNs). It covers model designs, training techniques, evaluation of model performance, and emerging trends specific to deep neural networks embedded in ASR models. It also discusses the challenges faced while building and deploying speech recognition models, including limited data availability and adaptability. We examine how deep neural networks have transformed automatic speech recognition and offer insights for improving speech recognition technology across applications ranging from healthcare to smart devices.

INDEX TERMS - Automatic Speech Recognition, Deep Neural Networks, Language Modeling, Robustness to Noise, Speech Modelling

I. INTRODUCTION

A. Significance of the model
Automatic Speech Recognition (ASR) models play a vital role in converting spoken language into text, which enables seamless interaction between humans and machines. They have changed how people communicate with machines by enabling voice-controlled devices, language translation, and accessibility tools for the hearing impaired. ASR models have applications in healthcare, entertainment, education, productivity enhancement, and customer service.

B. Objectives of the research
The primary aim of research in Automatic Speech Recognition with Deep Neural Networks is to improve the precision, effectiveness, and reliability of speech recognition models. Deep neural network structures help to capture complex speech patterns, context, and nuances, and to cope with noise. This allows speech recognition models to handle different languages, accents, and environmental factors with higher accuracy. Performance is further improved through innovative neural network designs and data augmentation approaches.

II. LITERATURE REVIEW

A number of studies on automatic speech recognition using deep neural networks have resulted in significant advancements in communication and human-machine interaction. We reviewed several literature surveys before conducting this research on automatic speech recognition using DNNs, covering the evolution, methodologies, challenges, and future directions of the field. The history of ASR shows a significant shift from rule-based models to statistical models and then to the adoption of neural networks. The ability of Deep Neural Networks to model complex patterns has become of great significance in ASR research. They can handle large datasets and
© 2023, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM27292 | Page 1
International Journal of Scientific Research in Engineering and Management (IJSREM)
Volume: 07 Issue: 12 | December - 2023 SJIF Rating: 8.176 ISSN: 2582-3930
Let X,
B. Feature Extraction:
D. Input Layer:
E. Output Layer:
outperform the traditional systems. It has also shown robustness to various background noises and other environmental factors.

Noise reduction techniques and acoustic modelling help to develop more reliable models. Further work on neural networks will also help to improve the customization and personalization of ASR systems.
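
The paper does not say how robustness to background noise is obtained. One common approach, assumed here purely for illustration, is to augment clean training speech with noise mixed in at a chosen signal-to-noise ratio (SNR). The sketch below shows that idea using only the Python standard library. The function names `mean_power` and `mix_at_snr`, the synthetic tone standing in for speech, and the 10 dB target are all hypothetical choices, not details taken from the paper.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to reach snr_db."""
    # Repeat the noise clip if it is shorter than the speech, then trim it.
    repeats = math.ceil(len(speech) / len(noise))
    noise = (noise * repeats)[:len(speech)]
    # Pick scale so that P_speech / (scale^2 * P_noise) = 10^(snr_db / 10).
    target_ratio = 10 ** (snr_db / 10)
    scale = math.sqrt(mean_power(speech) / (mean_power(noise) * target_ratio))
    return [s + scale * n for s, n in zip(speech, noise)]


# Synthetic stand-ins (assumptions): a 220 Hz tone plays the role of speech,
# and Gaussian noise plays the role of background noise.
random.seed(0)
sample_rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 220 * i / sample_rate)
          for i in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate // 2)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)

# Check the result: the SNR measured from the mixture should match the target.
residual = [y - s for y, s in zip(noisy, speech)]
measured_snr_db = 10 * math.log10(mean_power(speech) / mean_power(residual))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

Training on copies of each utterance mixed at several SNR levels is one way such augmentation might be applied. With real recordings, the lists here would typically be replaced by arrays loaded from audio files.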
IX. CONCLUSION