Text data-augmentation using Text Similarity with Manhattan Siamese long short-term memory for Thai language

Thananya Phreeraphattanakarn; Boonserm Kijsirikul

Conference Proceedings, Open Access

Journal of Physics: Conference Series (2021) 1780(1)

DOI: 10.1088/1742-6596/1780/1/012018

8 Citations

16 Readers

Abstract

In this paper, we address the problem of training neural networks on small text datasets. Data augmentation is commonly used with image and sound datasets to improve model performance, and we adapt this technique to expand the training set of textual data. The main challenge in our dataset is that the amount of data is insufficient for training models. We therefore propose a text data-augmentation method designed specifically for the Thai language. The method is based on text similarity and uses a model to determine the semantic relationship between two sentences. The experimental results indicate that the proposed method improves the performance of text classification.
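The abstract does not describe the model in detail, so the following is only a minimal sketch of the similarity measure usually associated with a Manhattan Siamese LSTM: the exponential of the negative L1 (Manhattan) distance between two sentence encodings. The vectors stand in for LSTM outputs, and the `select_similar` helper and its `threshold` value are hypothetical names introduced for illustration, not the authors' implementation.

```python
import math


def manhattan_similarity(h_a, h_b):
    """Return exp(-L1 distance) between two encodings, a value in (0, 1]."""
    if len(h_a) != len(h_b):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1_distance)


def select_similar(anchor, candidates, threshold=0.5):
    """Return indices of candidates whose similarity to the anchor meets the threshold.

    The threshold is an assumed value for this sketch only.
    """
    return [
        index
        for index, candidate in enumerate(candidates)
        if manhattan_similarity(anchor, candidate) >= threshold
    ]


# Made-up encodings standing in for the outputs of a shared LSTM encoder.
anchor = [0.2, -0.1, 0.4]
candidates = [
    [0.2, -0.1, 0.4],   # identical encoding
    [0.3, -0.1, 0.3],   # close encoding
    [1.5, 0.9, -0.7],   # distant encoding
]

selected = select_similar(anchor, candidates)
```

In an augmentation setting, sentences whose encodings score above the chosen threshold could be treated as semantically related candidates for expanding the training set. How the paper actually selects or labels such candidates is not stated in the abstract.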

Cite

Phreeraphattanakarn, T., & Kijsirikul, B. (2021). Text data-augmentation using Text Similarity with Manhattan Siamese long short-term memory for Thai language. In Journal of Physics: Conference Series (Vol. 1780). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1780/1/012018
