Resolving coordinate structures for Chinese constituent parsing

Abstract

Coordinate structures are linguistic structures consisting of two or more conjuncts, which usually combine into a larger constituent that behaves as a single unit. However, the boundary of each conjunct is difficult to identify, which in turn makes it difficult to parse the coordinate structure itself and the larger structures that contain it. In labeled data such as the Penn Chinese Treebank (CTB), coordinate structures are not labeled explicitly, which further complicates the problem. In this paper, we treat resolving coordinate structures as an independent sub-problem of parsing. We first define coordinate structures explicitly and design rules to extract them from labeled CTB data. We then propose a specifically designed grammar for parsing coordinate structures automatically, along with two groups of new features that better model coordinate structures in a shift-reduce parsing framework. Our approach achieves a 15% improvement in F1 score on resolving coordinate structures.
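The abstract does not spell out the extraction rules, so the following is only a rough illustration of what rule-based extraction from bracketed treebank data could look like. It assumes a single, much simpler rule than the paper's: a constituent counts as a coordinate structure if one of its direct children carries the CTB tag `CC` (coordinating conjunction), and its remaining children are taken as the conjuncts. The names `Node`, `parse`, `words`, and `coordinations` are hypothetical, and the example tree is made up rather than taken from CTB.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # child Nodes; empty for a leaf
    word: str = ""                                # set only on leaves


def parse(text):
    """Parse one bracketed tree such as "(NP (NR a) (CC b) (NR c))"."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                       # skip "("
        node = Node(tokens[pos])       # constituent or POS label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())
            else:
                node.word = tokens[pos]
                pos += 1
        pos += 1                       # skip ")"
        return node

    return read()


def words(node):
    """Return the words covered by a node, left to right."""
    if not node.children:
        return [node.word]
    return [w for child in node.children for w in words(child)]


def coordinations(node, markers=("CC",)):
    """Collect (label, conjunct strings) for nodes with a marker child.

    Assumed rule, not the paper's: only explicit CC children mark a
    coordination; punctuation-based or unmarked coordination is ignored.
    """
    found = []
    if any(child.label in markers for child in node.children):
        conjuncts = ["".join(words(child))
                     for child in node.children
                     if child.label not in markers]
        found.append((node.label, conjuncts))
    for child in node.children:
        found.extend(coordinations(child, markers))
    return found


tree = parse("(IP (NP (NR 北京) (CC 和) (NR 上海)) (VP (VV 发展)))")
print(coordinations(tree))  # [('NP', ['北京', '上海'])]
```

A rule this simple misses coordination marked only by punctuation or by juxtaposition, which is part of why the abstract describes conjunct boundaries as hard to identify.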

Cite

Zhou, Y., Huang, S., Dai, X., & Chen, J. (2015). Resolving coordinate structures for Chinese constituent parsing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9362, pp. 353–361). Springer Verlag. https://doi.org/10.1007/978-3-319-25207-0_30
