On the use of automatic speech recognition for spoken information retrieval from video databases

Luis R. Salgado-Garza; Juan A. Nolazco-Flores

Journal Article (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3287 381-385

DOI: 10.1007/978-3-540-30463-0_47

Abstract

This document describes the implementation of a spoken information retrieval system and its application to word search in an indexed video database. The system uses automatic speech recognition (ASR) software to convert the audio signal of a video file into a transcript file, and then a document indexing tool to index the transcribed file. A spoken query, uttered by any user, is then presented to the ASR, which decodes the audio signal and proposes a hypothesis that is used to formulate a query to the indexed database. The final output of the system is a list of video frame tags containing the audio corresponding to the spoken query. The speech recognition system achieved a word error rate (WER) below 15%, and its combined operation with the document indexing system showed outstanding performance with spoken queries. © Springer-Verlag 2004.
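The retrieval flow described in the abstract (transcribe, index, decode the spoken query, look up frame tags) can be illustrated with a minimal sketch. This is not the authors' implementation: the `(frame_tag, word)` transcript format, the function names `build_index`, `search`, and `word_error_rate`, and the use of a plain in-memory inverted index are all assumptions made for illustration, and the ASR step itself is replaced by already-decoded text. The WER function uses the common definition, word-level edit distance divided by the number of reference words.

```python
from collections import defaultdict


def build_index(transcript):
    """Map each lowercased word to the frame tags where it was spoken.

    `transcript` is a list of (frame_tag, word) pairs. This is a hypothetical
    format standing in for ASR output aligned to video frames.
    Frame tags are kept in transcript order.
    """
    index = defaultdict(list)
    for frame_tag, word in transcript:
        index[word.lower()].append(frame_tag)
    return index


def search(index, query_hypothesis):
    """Look up every word of the decoded spoken query in the index.

    `query_hypothesis` is the text the ASR proposed for the spoken query.
    Words that never occur in the transcript map to an empty list.
    """
    return {w: index.get(w, []) for w in query_hypothesis.lower().split()}


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With a toy transcript such as `[(120, "speech"), (450, "video"), (900, "speech")]`, calling `search(build_index(transcript), "speech")` returns the frame tags 120 and 900 for the word "speech". A real system would substitute a dedicated document indexing tool for the dictionary used here.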

Cite

Salgado-Garza, L. R., & Nolazco-Flores, J. A. (2004). On the use of automatic speech recognition for spoken information retrieval from video databases. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3287, 381–385. https://doi.org/10.1007/978-3-540-30463-0_47
