Is Context All You Need? Non-contextual vs Contextual Multiword Expressions Detection

Abstract

Effective methods for detecting multiword expressions are important for many Natural Language Processing technologies. Most contemporary methods are based on sequence labeling, while traditional methods use statistical measures. Our approach integrates concepts from both. In this paper, we present a novel weakly supervised method for extracting multiword expressions that focuses on their behaviour in various contexts. The method uses a lexicon of Polish multiword units as its reference knowledge base and leverages neural language modelling with deep learning architectures. It does not need a corpus annotated specifically for the task; the only required components are a lexicon of multiword units, a large corpus, and a general contextual embeddings model. Compared to the method based on non-contextual embeddings, we obtain gains of 15 percentage points in macro F1-score over both classes and 30 percentage points in F1-score for incorrect multiword expressions. The proposed method can be applied to other languages with relatively little effort.
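The abstract reports results as macro F1 over both classes and as F1 for the incorrect class alone. As a reminder of how those two numbers relate, here is a minimal sketch of the standard definitions in plain Python. The label names `correct` and `incorrect` mirror the two classes mentioned in the abstract; the toy labels and predictions are invented purely for illustration and are not the paper's data or results.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str], labels: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Toy example (invented data, not taken from the paper).
gold = ["correct", "correct", "correct", "incorrect", "incorrect", "correct"]
pred = ["correct", "correct", "incorrect", "incorrect", "correct", "correct"]

f1_correct = f1_for_class(gold, pred, "correct")
f1_incorrect = f1_for_class(gold, pred, "incorrect")
macro = macro_f1(gold, pred, ["correct", "incorrect"])
print(f1_correct, f1_incorrect, macro)
```

Because macro F1 weights each class equally regardless of its frequency, a large gain on the rarer class can move the macro score substantially, which is why the two figures are reported separately.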

Cite

Piasecki, M., & Kanclerz, K. (2022). Is Context All You Need? Non-contextual vs Contextual Multiword Expressions Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13350 LNCS, pp. 248–261). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-08751-6_18
