ALaSca: An Automated Approach for Large-Scale Lexical Substitution

Abstract

The lexical substitution task aims at finding suitable replacements for words in context. It has proved to be useful in several areas, such as word sense induction and text simplification, as well as in more practical applications such as writing-assistant tools. However, the paucity of annotated data has forced researchers to apply mainly unsupervised approaches, limiting the applicability of large pre-trained models and thus hampering the potential benefits of supervised approaches to the task. In this paper, we mitigate this issue by proposing ALaSca, a novel approach to automatically creating large-scale datasets for English lexical substitution. ALaSca allows examples to be produced for potentially any word in a language vocabulary and to cover most of the meanings listed for it. Thanks to this, we can unleash the full potential of neural architectures and fine-tune them on the lexical substitution task. Indeed, when using our data, a transformer-based model performs substantially better than when using manually annotated data only. We release ALaSca at https://sapienzanlp.github.io/alasca/.

Cite

Lacerra, C., Pasini, T., Tripodi, R., & Navigli, R. (2021). ALaSca: An Automated Approach for Large-Scale Lexical Substitution. In IJCAI International Joint Conference on Artificial Intelligence (pp. 3836–3842). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2021/528
