All-word prediction as the ultimate confusable disambiguation

Abstract

We present a classification-based word prediction model built on IGTREE, a decision-tree induction algorithm with favorable scaling abilities and a functional equivalence to n-gram models with back-off smoothing. In a first series of experiments, in which we train on Reuters newswire text and test either on the same type of data or on general or fictional text, we demonstrate that the system exhibits log-linear increases in prediction accuracy with increasing numbers of training examples. Trained on 30 million words of newswire text, prediction accuracies range from 12.6% on fictional text to 42.2% on newswire text. In a second series of experiments we compare all-words prediction with confusable prediction, i.e., the same task specialized to predicting among limited sets of words. Confusable prediction yields high accuracies on nine example confusable sets in all genres of text. The confusable approach outperforms the all-words approach, but the difference decreases as more training data become available.
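
As a rough illustration of the classification setup the abstract describes, the sketch below casts word prediction as supervised learning over fixed-width left-context features. It is an assumption-laden stand-in, not the paper's implementation: scikit-learn's DecisionTreeClassifier replaces IGTREE, and the two-word context window, feature names, and toy corpus are illustrative choices rather than the paper's parameters.

```python
# Minimal sketch: word prediction as classification over left-context
# features. DecisionTreeClassifier is a stand-in for IGTREE; the window
# size, feature names, and toy corpus are illustrative assumptions.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

def make_instances(tokens, n=2, targets=None):
    """Map a token stream to (left-context features, target word) pairs.
    With `targets` set, only instances whose target word is in that set
    are kept (confusable prediction); otherwise every position in the
    text becomes a training instance (all-words prediction)."""
    X, y = [], []
    for i in range(n, len(tokens)):
        if targets is not None and tokens[i] not in targets:
            continue
        X.append({f"w-{j}": tokens[i - j] for j in range(1, n + 1)})
        y.append(tokens[i])
    return X, y

tokens = "the man saw the dog and then the dog saw the man".split()

# All-words prediction: predict any word from its two left neighbours.
X, y = make_instances(tokens, n=2)
model = make_pipeline(DictVectorizer(), DecisionTreeClassifier(random_state=0))
model.fit(X, y)
print(model.predict([{"w-1": "the", "w-2": "saw"}]))  # 'dog' or 'man'

# Confusable prediction: same features, but the class inventory is
# restricted to a confusable set; on real data the same pipeline would
# simply be fit on Xc, yc.
Xc, yc = make_instances(tokens, n=2, targets={"then", "than"})
print(list(zip(Xc, yc)))
```

Restricting the training instances to a small target set, as in the last two lines, is what turns all-words prediction into confusable prediction: the task definition stays the same and only the class inventory shrinks. Note that the abstract's claimed functional equivalence to back-off n-gram smoothing rests on IGTREE's specific behaviour (features ordered by information gain, with default predictions at unseen contexts); the generic decision tree above only approximates that.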

Citation (APA)

van den Bosch, A. (2006). All-word prediction as the ultimate confusable disambiguation. In HLT-NAACL 2006 - Computationally Hard Problems and Joint Inference in Speech and Language Processing, Proceedings of the Workshop (pp. 25–32). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1631828.1631832
