Abstract
We have established a phonotactic language model as the solution to spoken language identification (LID). In this framework, we define a single set of acoustic tokens to represent the acoustic activities in the world's spoken languages. A voice tokenizer converts a spoken document into a text-like document of acoustic tokens. Thus a spoken document can be represented by a count vector of acoustic tokens and token n-grams in the vector space. We apply latent semantic analysis to the vectors, in the same way that it is applied in information retrieval, in order to capture salient phonotactics present in spoken documents. The vector space modeling of spoken utterances constitutes a paradigm shift in LID technology and has proven to be very successful. It presents a 12.4% error rate reduction over one of the best reported results on the 1996 NIST Language Recognition Evaluation database.
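To make the pipeline described above concrete, here is a minimal sketch of the vector-space idea: acoustic-token sequences are turned into n-gram count vectors and projected with latent semantic analysis (truncated SVD), after which utterances can be compared by cosine similarity. This is not the authors' implementation; the token labels, n-gram order, number of latent dimensions, and use of scikit-learn are all illustrative assumptions.

```python
# Sketch only: token strings "t1", "t2", ... are hypothetical acoustic-token labels.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Each "spoken document" is already a text-like sequence of acoustic tokens
# produced by a voice tokenizer.
spoken_docs = [
    "t3 t7 t7 t1 t9 t3 t7",   # exemplar utterance for language A
    "t2 t5 t2 t8 t5 t2 t6",   # exemplar utterance for language B
    "t3 t7 t1 t9 t3 t7 t7",   # test utterance to identify
]

# Count token unigrams and bigrams -> term-document count matrix.
vectorizer = CountVectorizer(ngram_range=(1, 2), token_pattern=r"\S+")
counts = vectorizer.fit_transform(spoken_docs)

# Latent semantic analysis: project the count vectors into a low-rank space
# so that co-occurring (salient) phonotactic patterns are captured.
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_vectors = lsa.fit_transform(counts)

# Identify the test utterance by cosine similarity to the language exemplars.
sims = cosine_similarity(doc_vectors[2:3], doc_vectors[:2])
print("similarity to each language exemplar:", sims)
```

In practice the paper's count vectors would come from a shared acoustic-token inventory over many training utterances per language; the three toy documents here stand in for that data only to show the count-vector and LSA steps.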
Citation
Li, H., & Ma, B. (2005). A phonotactic language model for spoken language identification. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 515–522). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1219840.1219904