An analysis of tree topological features in classifier-based unlexicalized parsing

Samuel W.K. Chan; Mickey W.C. Chong; Lawrence Y.L. Cheung

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6608 LNCS(PART 1) 155-170

DOI: 10.1007/978-3-642-19400-9_13

Abstract

A novel set of "tree topological features" (TTFs) is investigated for improving a classifier-based unlexicalized parser. The features capture the location and shape of subtrees in the treebank. Four main categories of TTFs are proposed and compared. Experimental results show that each of the four categories independently improves parsing accuracy significantly over the baseline model. When the categories are combined using an ensemble technique, the best unlexicalized parser achieves 84% accuracy without any extra language resources, matching the performance of early lexicalized parsers. Linguistically, TTFs approximate notions such as grammatical weight, branching property and structural parallelism. This is illustrated by studying how the features capture structural parallelism in the processing of coordinate structures. © 2011 Springer-Verlag.

Cite

Chan, S. W. K., Chong, M. W. C., & Cheung, L. Y. L. (2011). An analysis of tree topological features in classifier-based unlexicalized parsing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6608 LNCS, pp. 155–170). https://doi.org/10.1007/978-3-642-19400-9_13
