Robust named entity extraction from large spoken archives

Benoît Favre; Frédéric Béchet; Pascal Nocéra

Conference Proceedings, Open Access

HLT/EMNLP 2005 - Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (2005) 491-498

DOI: 10.3115/1220575.1220637

50 Citations

84 Readers

Abstract

Traditional approaches to Information Extraction (IE) from speech input simply apply text-based methods to the output of an Automatic Speech Recognition (ASR) system. Although this works well on transcripts with a low Word Error Rate (WER), we believe that a tighter integration of the IE and ASR modules can improve IE performance in more difficult conditions. More specifically, this paper focuses on the robust extraction of Named Entities from speech input when there is a temporal mismatch between the training and test corpora. We describe a Named Entity Recognition (NER) system, developed within the French Rich Broadcast News Transcription program ESTER, which is specifically optimized to process ASR transcripts and can be integrated into the search process of the ASR modules. Finally, we show how metadata can be collected to adapt NER and ASR models to new conditions, and how they can be used in Named Entity indexing of spoken archives. © 2005 Association for Computational Linguistics.

Citation (APA)

Favre, B., Béchet, F., & Nocéra, P. (2005). Robust named entity extraction from large spoken archives. In HLT/EMNLP 2005 - Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 491–498). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220575.1220637
