A Dirichlet-smoothed bigram model for retrieving spontaneous speech

Abstract

We present two simple but effective smoothing techniques for the standard language model (LM) approach to information retrieval [12]. First, we extend the unigram Dirichlet smoothing technique popular in IR [17] to bigram modeling [16]. Second, we propose a method of collection expansion for more robust estimation of the LM prior, particularly intended for sparse collections. Retrieval experiments on the MALACH archive [9] of automatically transcribed and manually summarized spontaneous speech interviews demonstrate strong overall system performance and the relative contribution of our extensions. © 2008 Springer-Verlag Berlin Heidelberg.
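
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the widely used unigram Dirichlet estimate, P(w | D) = (c(w; D) + mu * P(w | C)) / (|D| + mu), and one plausible bigram analogue that backs off from document bigram counts to that smoothed unigram estimate. The function names, the back-off form, and the `mu` values are all hypothetical choices made for illustration.

```python
from collections import Counter


def dirichlet_unigram(w, doc_tokens, coll_tokens, mu=1000.0):
    """Unigram Dirichlet smoothing: document counts plus mu pseudo-counts
    distributed according to the collection model P(w | C)."""
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(coll_tokens)
    p_coll = coll_counts[w] / len(coll_tokens)  # maximum-likelihood P(w | C)
    return (doc_counts[w] + mu * p_coll) / (len(doc_tokens) + mu)


def dirichlet_bigram(prev, w, doc_tokens, coll_tokens, mu_uni=1000.0, mu_bi=100.0):
    """Assumed bigram analogue (not taken from the paper):
    P(w | prev, D) = (c(prev w; D) + mu_bi * P(w | D)) / (c(prev; D) + mu_bi),
    where P(w | D) is the smoothed unigram estimate above."""
    bigram_counts = Counter(zip(doc_tokens, doc_tokens[1:]))
    # Count prev only where it starts a bigram, so the estimate normalizes over w.
    prev_count = sum(1 for t in doc_tokens[:-1] if t == prev)
    p_uni = dirichlet_unigram(w, doc_tokens, coll_tokens, mu_uni)
    return (bigram_counts[(prev, w)] + mu_bi * p_uni) / (prev_count + mu_bi)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    doc = "a b a".split()
    coll = "a b a c b a".split()
    for word in sorted(set(coll)):
        print(word, dirichlet_bigram("a", word, doc, coll))
```

Under these assumptions a word pair never seen in the document (here "a c") still receives a nonzero probability, and the estimates for a fixed history sum to one over the collection vocabulary.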

Cite

Lease, M., & Charniak, E. (2008). A Dirichlet-smoothed bigram model for retrieving spontaneous speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5152 LNCS, pp. 687–694). Springer Verlag. https://doi.org/10.1007/978-3-540-85760-0_87
