Spotting Multilingual Consonant-Vowel Units of Speech Using Neural Network Models

Gangashetty, Suryakanth V.; Sekhar, C. Chandra; Yegnanarayana, B.

doi:10.1007/11613107_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3817)

Included in the following conference series:

International Conference on Nonlinear Analyses and Algorithms for Speech Processing

Abstract

A multilingual speech recognition system is required for tasks that use several languages in a single speech recognition application. In this paper, we propose an approach to multilingual speech recognition based on spotting consonant-vowel (CV) units. The important features of the spotting approach are that it needs no automatic segmentation of speech and no models of higher-level units to recognise the CV units. The main issues in spotting multilingual CV units are locating anchor points and labelling the regions around these anchor points using suitable classifiers. Vowel onset points (VOPs) are used as the anchor points. The distribution-capturing ability of autoassociative neural network (AANN) models is explored for detecting VOPs in continuous speech. We explore classification models such as support vector machines (SVMs), which are capable of discriminating confusable classes of CV units and of generalising from a limited amount of training data. Data for similar CV units are shared across languages to train the classifiers that recognise CV units of speech in multiple languages. We study the spotting approach for recognition of a large number of CV units in a broadcast news corpus of three Indian languages.
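The spotting idea in the abstract (find anchor points, then label a fixed region around each one, with no prior segmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame `evidence` values stand in for the output of some VOP detector such as an AANN-based one, `classify` is a hypothetical callable standing in for a trained classifier such as an SVM, and the function names, threshold, minimum gap, and window sizes are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def pick_anchor_points(evidence: Sequence[float], threshold: float = 0.5,
                       min_gap: int = 5) -> List[int]:
    """Return frame indices of local peaks in `evidence` that reach
    `threshold` and lie at least `min_gap` frames apart."""
    anchors: List[int] = []
    for i in range(1, len(evidence) - 1):
        is_peak = evidence[i - 1] < evidence[i] >= evidence[i + 1]
        if not is_peak or evidence[i] < threshold:
            continue
        if anchors and i - anchors[-1] < min_gap:
            # Too close to the previous anchor: keep only the stronger peak.
            if evidence[i] > evidence[anchors[-1]]:
                anchors[-1] = i
            continue
        anchors.append(i)
    return anchors


def window_around(frames: Sequence, anchor: int,
                  left: int = 2, right: int = 2) -> list:
    """Return a fixed-length window of frames around `anchor`.
    Indices past either end are clamped, so edge frames repeat."""
    last = len(frames) - 1
    return [frames[min(max(j, 0), last)]
            for j in range(anchor - left, anchor + right + 1)]


def spot_units(frames: Sequence, evidence: Sequence[float],
               classify: Callable[[list], str],
               threshold: float = 0.5, min_gap: int = 5,
               left: int = 2, right: int = 2) -> List[Tuple[int, str]]:
    """Label the region around every anchor point; no segmentation step."""
    return [(a, classify(window_around(frames, a, left, right)))
            for a in pick_anchor_points(evidence, threshold, min_gap)]


if __name__ == "__main__":
    # Toy data: integers stand in for per-frame feature vectors.
    frames = list(range(11))
    evidence = [0.0, 0.1, 0.9, 0.2, 0.1, 0.0, 0.1, 0.2, 0.8, 0.3, 0.0]
    # Hypothetical classifier: a real system would use trained models.
    toy_classifier = lambda w: "ka" if sum(w) < 20 else "ti"
    print(spot_units(frames, evidence, toy_classifier))
```

Because every window has the same length, any fixed-input classifier can be plugged in as `classify`. Sharing data for similar CV units across languages would happen when that classifier is trained, which this sketch does not cover.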

Author information

Authors and Affiliations

Speech and Vision Laboratory, Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, 600 036, India
Suryakanth V. Gangashetty, C. Chandra Sekhar & B. Yegnanarayana

Editor information

Editors and Affiliations

Escola Universitària Politècnica de Mataró, UPC, Spain
Marcos Faundez-Zanuy
Escola Universitària Politècnica de Mataró, Spain
Léonard Janer & Antonio Satue-Villar
Department of Psychology, Second University of Naples, and IIASS, Via Pellegrino 19, 84019, Vietri sul Mare, (SA), Italy
Anna Esposito
The Auton Lab, Carnegie Mellon University, Pittsburgh, PA, USA
Josep Roure
Escola Universitària Politècnica de Mataró (UPC), Barcelona, Spain
Virginia Espinosa-Duro

About this paper

Cite this paper

Gangashetty, S.V., Sekhar, C.C., Yegnanarayana, B. (2006). Spotting Multilingual Consonant-Vowel Units of Speech Using Neural Network Models. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_27

DOI: https://doi.org/10.1007/11613107_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31257-4
Online ISBN: 978-3-540-32586-4
eBook Packages: Computer Science, Computer Science (R0)
