Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions

10 citations · 27 Mendeley readers

Abstract

Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have been a classical challenge to NLP, including the pre-trained language models that drive today's state-of-the-art. Prior work has identified deficiencies in their contextualized representation stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to build idiomaticity into BART using an adapter as a lightweight non-compositional language expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is seen via intrinsic and extrinsic methods, where idiom embeddings score 0.19 points higher in homogeneity score for embedding clustering, and up to 25% higher sequence accuracy on the idiom processing tasks of IE sense disambiguation and span detection.
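The homogeneity score mentioned in the abstract measures whether each cluster of embeddings contains sentences of only a single idiom. As a minimal sketch of how such an evaluation is typically computed (the labels below are hypothetical, not from the paper), scikit-learn's `homogeneity_score` can be applied to gold idiom labels and cluster assignments:

```python
from sklearn.metrics import homogeneity_score

# Hypothetical toy data: each sentence embedding is labeled by the idiom it
# contains (gold) and by the cluster it was assigned to (predicted).
gold_idiom_labels = [0, 0, 1, 1, 2, 2]
cluster_assignments = [0, 0, 1, 1, 1, 2]  # one class-2 item mis-clustered

# Homogeneity = 1 - H(gold | clusters) / H(gold); 1.0 means every cluster
# contains members of a single idiom class.
score = homogeneity_score(gold_idiom_labels, cluster_assignments)
print(round(score, 2))
```

A higher homogeneity score for idiom embeddings thus indicates that sentences sharing an idiom are grouped more purely, which is the intrinsic evaluation the abstract reports.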

Citation (APA)
Zeng, Z., & Bhat, S. (2022). Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions. Transactions of the Association for Computational Linguistics, 10, 1120–1137. https://doi.org/10.1162/tacl_a_00510
