Iterative paraphrastic augmentation with discriminative span alignment

Abstract

We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment. Our approach allows for the large-scale expansion of existing datasets or the rapid creation of new datasets using a small, manually produced seed corpus. We demonstrate our approach with experiments on the Berkeley FrameNet Project, a large-scale language understanding effort spanning more than two decades of human labor. With four days of training data collection for a span alignment model and one day of parallel compute, we automatically generate and release to the community 495,300 unique (Frame, Trigger) pairs in diverse sentential contexts, a roughly 50-fold expansion atop FrameNet v1.7. The resulting dataset is intrinsically and extrinsically evaluated in detail, showing positive results on a downstream task.

Cite

Culkin, R., Hu, J. E., Stengel-Eskin, E., Qin, G., & Van Durme, B. (2021). Iterative paraphrastic augmentation with discriminative span alignment. Transactions of the Association for Computational Linguistics, 9, 494–509. https://doi.org/10.1162/tacl_a_00380
