Toward optimizing stream fusion in multistream recognition of speech

  • Mesgarani N
  • Thomas S
  • Hermansky H


Abstract

A multistream phoneme recognition framework is proposed in which streams are formed from different spectrotemporal modulations of speech. Phoneme posterior probabilities are estimated from each stream separately and combined at the output level. A statistical model of the final estimated posterior probabilities is used to characterize system performance. During operation, the fusion architecture is chosen automatically to maximize the similarity of the output statistics to those obtained in clean conditions. Results on phoneme recognition from noisy speech indicate the effectiveness of the proposed method.
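
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, the candidate architectures, or the statistical model, so everything concrete here is an assumption made for illustration: candidate fusions are taken to be all non-empty subsets of streams, fusion is a frame-wise average of posteriors, the output statistic is the second-moment matrix of the posterior vectors, and similarity is measured by Frobenius distance to a reference computed on clean data. The function names `fuse`, `output_statistic`, and `select_fusion` are hypothetical.

```python
from itertools import combinations

import numpy as np


def fuse(posteriors):
    """Average per-stream posteriors (each T x K) and renormalize each frame.

    Averaging is an assumed fusion rule, not one taken from the article.
    """
    fused = np.mean(posteriors, axis=0)
    return fused / fused.sum(axis=1, keepdims=True)


def output_statistic(post):
    """K x K second-moment matrix of the posterior vectors.

    An illustrative stand-in for the article's statistical model.
    """
    return post.T @ post / post.shape[0]


def select_fusion(stream_posteriors, clean_stat):
    """Return the stream subset whose fused output statistic is closest
    (in Frobenius norm) to the clean-condition reference, and that distance."""
    n = len(stream_posteriors)
    best_subset, best_dist = None, np.inf
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            fused = fuse([stream_posteriors[i] for i in subset])
            dist = np.linalg.norm(output_statistic(fused) - clean_stat)
            if dist < best_dist:
                best_subset, best_dist = subset, dist
    return best_subset, best_dist
```

In this sketch, `clean_stat` would be computed once from posteriors obtained on clean speech, and `select_fusion` would be called at test time on the posteriors of each stream. Exhaustive search over subsets grows exponentially with the number of streams, so it is only practical for a small stream count.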

Cite

Mesgarani, N., Thomas, S., & Hermansky, H. (2011). Toward optimizing stream fusion in multistream recognition of speech. The Journal of the Acoustical Society of America, 130(1), EL14–EL18. https://doi.org/10.1121/1.3595744
