The paper presents the Position Specific Posterior Lattice (PSPL), a novel representation of automatic speech recognition (ASR) lattices that naturally lends itself to efficient indexing of position information and subsequent proximity-based relevance ranking of spoken documents. In experiments performed on a collection of lecture recordings (MIT iCampus data), the spoken document ranking accuracy improved by 20% relative over the commonly used baseline of indexing the 1-best output from an automatic speech recognizer. The Mean Average Precision (MAP) increased from 0.53 when using 1-best output to 0.62 when using the new lattice representation. The reference used for evaluation is the output of a standard retrieval engine working on the manual transcription of the speech collection. Albeit lossy, the PSPL lattice is also much more compact than the ASR 3-gram lattice from which it is computed, which also translates into a smaller inverted index, at virtually no degradation in word-error-rate performance. Since new paths are introduced in the lattice, the ORACLE accuracy increases over the original ASR lattice. © 2005 Association for Computational Linguistics.
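As an illustration of how position-specific posteriors can feed an inverted index, the following is a minimal sketch and not the authors' implementation. It assumes each document is a list of position bins, where each bin maps candidate words to posterior probabilities. The toy data and the names `toy_pspl`, `build_inverted_index`, and `expected_count` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): each document is a list of
# position bins; each bin maps a candidate word to its posterior probability.
toy_pspl = {
    "lecture_a": [
        {"speech": 0.9, "peach": 0.1},
        {"recognition": 0.7, "wreck": 0.3},
    ],
    "lecture_b": [
        {"speech": 0.4, "beach": 0.6},
    ],
}


def build_inverted_index(pspl_by_doc):
    """Map each word to a list of (doc_id, position, posterior) hits."""
    index = defaultdict(list)
    for doc_id, bins in pspl_by_doc.items():
        for position, word_posteriors in enumerate(bins):
            for word, posterior in word_posteriors.items():
                index[word].append((doc_id, position, posterior))
    return index


def expected_count(index, word, doc_id):
    """Sum the posteriors of `word` in `doc_id`, i.e. a soft term count."""
    return sum(p for d, _, p in index.get(word, []) if d == doc_id)


index = build_inverted_index(toy_pspl)
```

Because every hit keeps its position, a ranking function could in principle reward query words that occur at adjacent positions; the soft count above is only the simplest use of the stored posteriors.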
CITATION STYLE
Chelba, C., & Acero, A. (2005). Position Specific Posterior Lattices for indexing speech. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 443–450). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1219840.1219895