We present a new application of spiking neurons: audio-visual speech recognition. The features extracted from the audio (cepstral coefficients) and from the video (mouth height, mouth width, and the percentages of black and white pixels inside the mouth) are simple enough to allow real-time integration of the complete system. A generic preprocessing stage converts these features into a spike sequence that is processed by the neural network, which performs the classification. Training is done in one pass: the user pronounces every word of the dictionary once. Tests on the European M2VTS database show the value of such a system for audio-visual speech recognition. In the presence of noise in particular, audio-visual recognition is much better than recognition based on the audio modality alone. © Springer-Verlag Berlin Heidelberg 2002.
Séguier, R., & Mercier, D. (2002). Audio-visual speech recognition one pass learning with spiking neurons. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2415, 1207–1212. https://doi.org/10.1007/3-540-46084-5_195