Language modeling using PLSA-based topic HMM

Atsushi Sako; Tetsuya Takiguchi; Yasuo Ariki

Journal Article, Open Access

IEICE Transactions on Information and Systems (2008) E91-D(3) 522-528

DOI: 10.1093/ietisy/e91-d.3.522

Abstract

In this paper, we propose a PLSA-based language model for sports-related live speech. The model is implemented with unigram rescaling, a technique that combines a topic model with an n-gram model. In the conventional method, unigram rescaling uses a topic distribution estimated from the history of recognized transcriptions. This improves recognition performance but cannot express topic transitions. Incorporating topic transitions is expected to improve recognition performance further. The proposed method therefore employs a "Topic HMM" instead of the history to estimate the topic distribution. The Topic HMM is an ergodic HMM that expresses typical topic distributions as well as topic transition probabilities. In our experiments, the proposed method achieved higher word accuracy than both a trigram model and the conventional PLSA-based method that uses the recognized history. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.
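
The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. Unigram rescaling is written in its commonly used form, P(w | h) ∝ P_ngram(w | h) · P_topic(w) / P_unigram(w). The Topic HMM is approximated as an ergodic HMM whose states each hold a topic mixture and whose state posterior is updated by one forward step per observed word; the paper may estimate the topic distribution differently. All names (`forward_step`, `topic_unigram`, `unigram_rescale`) and all numbers are hypothetical toy values, not taken from the paper.

```python
import numpy as np

def forward_step(state_post, trans, p_z_given_state, p_w_given_z, word_id):
    """One HMM forward update of the state posterior after observing a word.

    Assumption: each state emits words through its own PLSA topic mixture.
    """
    predicted = state_post @ trans                       # shape (S,)
    emit = (p_z_given_state @ p_w_given_z)[:, word_id]   # P(word | state), shape (S,)
    updated = predicted * emit
    return updated / updated.sum()

def topic_unigram(state_post, trans, p_z_given_state, p_w_given_z):
    """Topic-dependent unigram for the NEXT word, using the topic transition."""
    next_state = state_post @ trans          # predicted state distribution
    p_z = next_state @ p_z_given_state       # mixed topic distribution, shape (Z,)
    return p_z @ p_w_given_z                 # shape (V,)

def unigram_rescale(p_ngram, p_topic, p_unigram):
    """Scale n-gram probabilities by P_topic(w) / P_unigram(w), then renormalize."""
    scaled = p_ngram * (p_topic / p_unigram)
    return scaled / scaled.sum()

# Toy setup (hypothetical): 3 words, 2 topics, 2 HMM states.
p_w_given_z = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.2, 0.7]])
p_z_given_state = np.array([[0.9, 0.1],
                            [0.1, 0.9]])
trans = np.array([[0.8, 0.2],
                  [0.2, 0.8]])        # ergodic: every state reaches every state
p_unigram = np.array([0.4, 0.2, 0.4])  # background unigram
p_ngram = np.array([0.3, 0.4, 0.3])    # n-gram prediction for the next word

state_post = np.array([0.5, 0.5])
# Observe word 0, which is typical of topic 0.
state_post = forward_step(state_post, trans, p_z_given_state, p_w_given_z, 0)
p_topic = topic_unigram(state_post, trans, p_z_given_state, p_w_given_z)
rescaled = unigram_rescale(p_ngram, p_topic, p_unigram)
print(state_post, rescaled)
```

In this toy run, observing a word typical of topic 0 shifts the state posterior toward the state dominated by topic 0, and the rescaled distribution then assigns that word more probability than the plain n-gram did. The transition matrix is what lets the topic estimate move between states, which a static history-based estimate cannot express.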

Cite

Sako, A., Takiguchi, T., & Ariki, Y. (2008). Language modeling using PLSA-based topic HMM. IEICE Transactions on Information and Systems, E91-D(3), 522–528. https://doi.org/10.1093/ietisy/e91-d.3.522
