DeepBT and NLP Data Augmentation Techniques: A New Proposal and a Comprehensive Study

Abstract

Data augmentation methods, a family of techniques for synthetically generating training data, have shown remarkable results in various deep learning and machine learning tasks. Despite the widespread and successful adoption of data augmentation within the computer vision community, techniques designed for natural language processing (NLP) tasks have advanced much more slowly and have had limited success in achieving performance gains. As a consequence, with the exception of applications of back-translation to machine translation tasks, these techniques have not been as thoroughly explored by the wider NLP community. Recent research on the subject also still lacks a practical understanding of the relationship between data augmentation and several important aspects of model design, such as hyperparameters and regularization parameters. In this paper, we perform a comprehensive study of NLP data augmentation techniques, comparing their relative performance under different settings. We also propose Deep Back-Translation, a novel NLP data augmentation technique, and apply it to benchmark datasets. We analyze the quality of the synthetic data generated, evaluate its performance gains, and compare all of these aspects to existing data augmentation procedures.

Maier Ferreira, T., & Reali Costa, A. H. (2020). DeepBT and NLP Data Augmentation Techniques: A New Proposal and a Comprehensive Study. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12319 LNAI, pp. 435–449). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61377-8_30
