On the use of automatic speech recognition for spoken information retrieval from video databases

Abstract

This document describes the realization of a spoken information retrieval system and its application to word search in an indexed video database. The system uses automatic speech recognition (ASR) software to convert the audio signal of a video file into a transcript file, and then a document indexing tool to index that transcript. A spoken query, uttered by any user, is then presented to the ASR, which decodes the audio signal and proposes a hypothesis that is used to formulate a query against the indexed database. The final outcome of the system is a list of video frame tags whose audio corresponds to the spoken query. The speech recognition system achieved a Word Error Rate (WER) below 15%, and its combined operation with the document indexing system showed outstanding performance with spoken queries. © Springer-Verlag 2004.
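The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the ASR software or the indexing tool, so the ASR step is assumed to have already produced text, and a plain in-memory inverted index stands in for the document indexing tool. The function names, the `(frame_tag, text)` segment format, and the sample sentences are all hypothetical. The sketch also shows how a WER figure such as the one reported is conventionally computed, as word-level edit distance divided by the reference length.

```python
from collections import defaultdict


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def build_index(segments):
    """Map each transcript word to the set of frame tags where it occurs.

    `segments` is an assumed format: (frame_tag, transcript_text) pairs,
    where the text is whatever the ASR produced for that stretch of audio.
    """
    index = defaultdict(set)
    for frame_tag, text in segments:
        for word in text.lower().split():
            index[word].add(frame_tag)
    return index


def search(index, query_hypothesis):
    """Return the frame tags whose transcript contains every query word."""
    words = query_hypothesis.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical ASR output for three stretches of a video, tagged in seconds.
segments = [
    (0.0, "the president spoke about the economy"),
    (12.5, "weather forecast for tomorrow"),
    (30.0, "the economy grew last quarter"),
]
index = build_index(segments)

# The spoken query is assumed to have been decoded by the ASR already.
print(search(index, "the economy"))  # [0.0, 30.0]
print(word_error_rate("the economy grew last quarter",
                      "the economy grow last quarter"))  # 0.2
```

Requiring every query word to match (a boolean AND) is only one possible design; a real indexing tool would typically rank partial matches, which also makes retrieval more tolerant of recognition errors in the query.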

Cite

Salgado-Garza, L. R., & Nolazco-Flores, J. A. (2004). On the use of automatic speech recognition for spoken information retrieval from video databases. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3287, 381–385. https://doi.org/10.1007/978-3-540-30463-0_47
