ALaSca: An Automated Approach for Large-Scale Lexical Substitution

Caterina Lacerra; Tommaso Pasini; Rocco Tripodi; Roberto Navigli

Conference Proceedings, Open Access

IJCAI International Joint Conference on Artificial Intelligence (2021) 3836-3842

DOI: 10.24963/ijcai.2021/528

Abstract

The lexical substitution task aims at finding suitable replacements for words in context. It has proved useful in several areas, such as word sense induction and text simplification, as well as in more practical applications such as writing-assistant tools. However, the paucity of annotated data has forced researchers to apply mainly unsupervised approaches, limiting the applicability of large pre-trained models and thus hampering the potential benefits of supervised approaches to the task. In this paper, we mitigate this issue by proposing ALaSca, a novel approach to automatically creating large-scale datasets for English lexical substitution. ALaSca allows examples to be produced for potentially any word in a language vocabulary and to cover most of the meanings it lists. Thanks to this, we can unleash the full potential of neural architectures and fine-tune them on the lexical substitution task. Indeed, when using our data, a transformer-based model performs substantially better than when using manually annotated data only. We release ALaSca at https://sapienzanlp.github.io/alasca/.

Citation (APA)

Lacerra, C., Pasini, T., Tripodi, R., & Navigli, R. (2021). ALaSca: An Automated Approach for Large-Scale Lexical Substitution. In IJCAI International Joint Conference on Artificial Intelligence (pp. 3836–3842). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2021/528