Abstract
Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automated transcripts into phonetic substrings to help ameliorate transcription errors; and by combining automatic transcriptions with manually assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR.
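The idea of decomposing transcripts into phonetic substrings can be illustrated with a minimal sketch. It assumes the transcript has already been converted to a phoneme sequence and that overlapping n-grams are the indexing unit. The function name `phonetic_ngrams`, the default `n = 4`, the `_` separator, and the sample phoneme sequences are all hypothetical choices for illustration; the abstract does not specify them.

```python
from typing import List


def phonetic_ngrams(phonemes: List[str], n: int = 4) -> List[str]:
    """Return overlapping n-grams over a phoneme sequence (illustrative only)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not phonemes:
        return []
    if len(phonemes) < n:
        # Sequence shorter than n: keep it as a single unit.
        return ["_".join(phonemes)]
    return ["_".join(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


# Hypothetical phoneme sequences: a spoken word and a misrecognized variant
# that differs in a single vowel.
reference = ["k", "ae", "t", "ax", "l", "ao", "g"]
recognized = ["k", "ae", "t", "ax", "l", "aa", "g"]

# Substrings untouched by the error still match, so a query term can
# partially match a document term despite the transcription error.
shared = set(phonetic_ngrams(reference)) & set(phonetic_ngrams(recognized))
```

In this toy case the two sequences share no whole-word match, yet two of their four 4-grams coincide, which is the kind of partial overlap that substring indexing is meant to exploit.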
Cite
Inkpen, D., Alzghool, M., Jones, G. J. F., & Oard, D. W. (2006). Investigating cross-language speech retrieval for a spontaneous conversational speech collection. In HLT-NAACL 2006 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Short Papers (pp. 61–64). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1614049.1614065