Detection of Language (Model) Errors

1 citation · 70 Mendeley readers

Abstract

Bigram language models are popular in many language-processing applications, for both Indo-European and Asian languages. However, when a Chinese language model is applied in a novel domain, accuracy drops significantly, from 96% to 78% in our evaluation. We apply pattern recognition techniques (Bayesian, decision tree, and neural network classifiers) to detect language model errors. We examined two general types of features: model-based and language-specific features. In our evaluation, the Bayesian classifiers achieved the best recall (80%) but low precision (60%). The neural network achieved good recall (75%) and good precision (80%), but both the Bayesian and neural network classifiers have a low skip ratio (65%). The decision tree classifier achieved the best precision (81%) and skip ratio (76%), but its recall is the lowest (73%).
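As a minimal illustration of the kind of bigram language model the abstract refers to: a maximum-likelihood bigram model scores a sentence as the product of conditional probabilities P(w_i | w_{i-1}) estimated from corpus counts. The toy English corpus, function names, and lack of smoothing below are illustrative assumptions, not the paper's actual (Chinese, smoothed) model.

```python
from collections import Counter

# Hypothetical toy corpus; the paper's actual Chinese training data is not reproduced here.
corpus = ["the cat sat", "the dog sat", "the cat ran"]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]   # sentence-boundary markers
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(sentence):
    """Score a sentence as the product of its bigram probabilities (no smoothing)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= bigram_prob(prev, word)
    return p

print(bigram_prob("the", "cat"))      # 2/3: "the cat" occurs twice, "the" three times
print(sentence_prob("the cat sat"))   # product of the four bigram probabilities
```

A real system would add smoothing (e.g. backoff) so unseen bigrams do not zero out the whole sentence score; low-probability bigrams are then candidate positions for the error-detection classifiers described above.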


APA

Hung, K. Y., Luk, R. W. P., Yeung, D., Chung, K. F. L., & Shu, W. (2000). Detection of Language (Model) Errors. In Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, SIGDAT-EMNLP 2000 - Held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, ACL 2000 (pp. 87–94). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1117794.1117805
