Complex conjunctions and determiners are often considered as pretokenized units in parsing. This is not always realistic, since they can be ambiguous. We propose a model for joint dependency parsing and multiword expressions identification, in which complex function words are represented as individual tokens linked with morphological dependencies. Our graphbased parser includes standard secondorder features and verbal subcategorization features derived from a syntactic lexicon. We train it on a modified version of the French Treebank enriched with morphological dependencies. It recognizes 81.79% of ADV+que conjunctions with 91.57% precision, and 82.74% of de+DET determiners with 86.70% precision.
CITATION STYLE
Nasr, A., Ramisch, C., Deulofeu, J., & Valli, A. (2015). Joint dependency parsing and multiword expression tokenisation. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 1116–1126). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1108
Mendeley helps you to discover research relevant for your work.