In computer vision, a common way to augment an image dataset is to create new images by applying geometric transformations that preserve similarity. This form of data augmentation was one of the most significant factors in winning the ImageNet competition in 2012 with large neural networks. In contrast to computer vision and speech, few techniques have been explored for augmenting data in natural language processing (NLP). The only technique explored for text data is lexical substitution, which focuses solely on replacing words with synonyms. In this paper, we investigate the use of different pointer networks with sequence-to-sequence models, which have shown excellent results in neural machine translation (NMT) and text simplification tasks, to generate similar sentences using the Paraphrase Database (PPDB). We evaluate the generated paraphrases by using them to augment the training set of the IMDb movie review dataset and comparing the resulting performance with that of the baseline model. To the best of our knowledge, this is the first study on generating paraphrases using these models with the help of the PPDB dataset.
Citation
Gupta, V., & Krzyżak, A. (2020). An empirical evaluation of attention and pointer networks for paraphrase generation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12139 LNCS, pp. 399–413). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-50420-5_29