Exploiting unannotated natural language data is hard largely because unsupervised parameter estimation is hard. We describe deterministic annealing (DA; Rose et al., 1990) as an appealing alternative to the Expectation-Maximization (EM) algorithm (Dempster et al., 1977). Seeking to avoid search error, DA begins by globally maximizing an easy concave function and maintains a local maximum as it gradually morphs the function into the desired non-concave likelihood function. Applying DA to parsing and tagging models is shown to be straightforward; significant improvements over EM are shown on a part-of-speech tagging task. We describe a variant, skewed DA, which can incorporate a good initializer when one is available, and show significant improvements over EM on a grammar induction task.
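To make the annealing idea concrete, here is a minimal sketch of a DA-style EM loop on a toy two-component 1-D Gaussian mixture. This is not the paper's tagging or parsing model, only an illustration of the general recipe: flatten the E-step posteriors by an inverse-temperature parameter beta, start beta near 0 (where the objective is nearly concave and the posteriors nearly uniform), and raise it toward 1, where standard EM and the true non-concave likelihood are recovered. All names (da_em, beta0, rate) and the schedule are illustrative assumptions, not from the paper.

```python
import math
import random

def da_em(data, n_iters=200, beta0=0.1, rate=1.1):
    # Crude initialization: one mean at each extreme, uniform weights,
    # fixed unit variance to keep the sketch short.
    mu = [min(data), max(data)]
    pi = [0.5, 0.5]
    sigma2 = 1.0
    beta = beta0  # inverse temperature; beta = 1 is standard EM
    for _ in range(n_iters):
        # E-step at temperature 1/beta: each component's joint score is
        # raised to the power beta before normalizing, which flattens
        # the posteriors when beta is small.
        resp = []
        for x in data:
            scores = [
                (pi[k] * math.exp(-(x - mu[k]) ** 2 / (2 * sigma2))) ** beta
                for k in range(2)
            ]
            z = sum(scores)
            resp.append([s / z for s in scores])
        # M-step: ordinary maximum-likelihood updates from the
        # (flattened) expected counts.
        for k in range(2):
            total = sum(r[k] for r in resp)
            pi[k] = total / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / total
        # Annealing schedule: gradually morph the easy (near-concave)
        # objective into the true likelihood by raising beta toward 1.
        beta = min(1.0, beta * rate)
    return pi, mu

if __name__ == "__main__":
    random.seed(0)
    data = ([random.gauss(-2, 1) for _ in range(200)]
            + [random.gauss(3, 1) for _ in range(200)])
    print(da_em(data))
```

The paper's skewed DA variant would additionally bias the early, flat posteriors toward a good initializer rather than toward the uniform distribution; that refinement is omitted here for brevity.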
Smith, N. A., & Eisner, J. (2004). Annealing techniques for unsupervised statistical language learning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 486–493). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1218955.1219017