Unsupervised induction of labeled parse trees by clustering with syntactic features

Abstract

We present an algorithm for unsupervised induction of labeled parse trees. The algorithm has three stages: bracketing, initial labeling, and label clustering. Bracketing is done from raw text using an unsupervised incremental parser. Initial labeling is done using a merging model that aims to minimize the grammar description length. Finally, the labels are clustered down to a desired number of labels using syntactic features extracted from the initially labeled trees. The algorithm obtains a 59% labeled F-score on the WSJ10 corpus, compared with 35% in previous work, and a substantial error reduction over a random baseline. We report results for English, German, and Chinese corpora, using two label mapping methods and two label set sizes. © 2008. Licensed under the Creative Commons.
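
To make the evaluation figure concrete, the following is a minimal sketch of how a labeled F-score could be computed for induced trees. It is not taken from the paper: the abstract does not name its two label mapping methods, so a many-to-one mapping is assumed here as one common choice, and the function names `many_to_one_mapping` and `labeled_f_score`, the `(label, start, end)` constituent representation, and the toy data are all hypothetical.

```python
from collections import Counter

def many_to_one_mapping(induced, gold):
    """Map each induced label to the gold label it most often shares a span with.

    Assumption: constituents are (label, start, end) triples and each gold
    span carries a single label (unary chains are ignored in this sketch).
    """
    gold_by_span = {(start, end): label for (label, start, end) in gold}
    counts = {}
    for label, start, end in induced:
        gold_label = gold_by_span.get((start, end))
        if gold_label is not None:
            counts.setdefault(label, Counter())[gold_label] += 1
    return {label: c.most_common(1)[0][0] for label, c in counts.items()}

def labeled_f_score(induced, gold, mapping):
    """Harmonic mean of labeled precision and recall over constituents."""
    # Induced labels with no mapping are left as-is and so never match gold.
    mapped = {(mapping.get(label, label), start, end)
              for (label, start, end) in induced}
    gold_set = set(gold)
    correct = len(mapped & gold_set)
    if correct == 0:
        return 0.0
    precision = correct / len(mapped)
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)

# Toy three-word sentence: gold constituents and induced cluster labels.
gold = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 3)]
induced = [("c1", 0, 3), ("c2", 0, 1), ("c3", 1, 3), ("c2", 2, 3), ("c2", 1, 2)]

mapping = many_to_one_mapping(induced, gold)
score = labeled_f_score(induced, gold, mapping)
print(mapping, round(score, 3))
```

In this toy case four of the five induced constituents match a gold constituent after mapping, so precision is 0.8 and recall is 1.0. A real evaluation would compute the mapping and the counts over a whole corpus rather than a single sentence.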

Cite

Reichart, R., & Rappoport, A. (2008). Unsupervised induction of labeled parse trees by clustering with syntactic features. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 721–728). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1599081.1599172