Iterative paraphrastic augmentation with discriminative span alignment

Ryan Culkin; J. Edward Hu; Elias Stengel-Eskin; Guanghui Qin; Benjamin Van Durme

Journal Article, Open Access

Transactions of the Association for Computational Linguistics (2021) 9 494-509

DOI: 10.1162/tacl_a_00380

Abstract

We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment. Our approach allows for the large-scale expansion of existing datasets or the rapid creation of new datasets using a small, manually produced seed corpus. We demonstrate our approach with experiments on the Berkeley FrameNet Project, a large-scale language understanding effort spanning more than two decades of human labor. With four days of training data collection for a span alignment model and one day of parallel compute, we automatically generate and release to the community 495,300 unique (Frame, Trigger) pairs in diverse sentential contexts, a roughly 50-fold expansion atop FrameNet v1.7. The resulting dataset is intrinsically and extrinsically evaluated in detail, showing positive results on a downstream task.

Cite

Culkin, R., Hu, J. E., Stengel-Eskin, E., Qin, G., & Van Durme, B. (2021). Iterative paraphrastic augmentation with discriminative span alignment. Transactions of the Association for Computational Linguistics, 9, 494–509. https://doi.org/10.1162/tacl_a_00380
