Exploring Attentive Siamese LSTM for Low-Resource Text Plagiarism Detection

Wei Bao; Jian Dong; Yang Xu; Yuanyuan Yang; Xiaoke Qi

Journal Article, Open Access

Data Intelligence (2024) 6(2) 488-503

DOI: 10.1162/dint_a_00242

Abstract

Low-resource text plagiarism detection faces a significant challenge due to the limited availability of labeled data for training. This task requires the development of sophisticated algorithms capable of identifying similarities and differences in texts, particularly in the realm of semantic rewriting and translation-based plagiarism detection. In this paper, we present an enhanced attentive Siamese Long Short-Term Memory (LSTM) network designed for Tibetan-Chinese plagiarism detection. Our approach begins with the introduction of translation-based data augmentation, aimed at expanding the bilingual training dataset. Subsequently, we propose a pre-detection method leveraging abstract document vectors to enhance detection efficiency. Finally, we introduce an improved attentive Siamese LSTM network tailored for Tibetan-Chinese plagiarism detection. We conduct comprehensive experiments to showcase the effectiveness of our proposed plagiarism detection framework.
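The pre-detection step mentioned in the abstract can be pictured as a cheap filter that runs before the Siamese LSTM comparison. The abstract does not say how the document vectors are built or compared, so the following is only a minimal sketch under assumptions: each document is already represented by a fixed-length vector, vectors are compared with cosine similarity, and candidates are kept when they reach a threshold. The names `cosine_similarity` and `pre_detect` and the threshold value 0.8 are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pre_detect(suspicious_vec, source_vecs, threshold=0.8):
    """Return ids of source documents whose vector is at least `threshold` similar.

    Assumption: cosine similarity with a fixed threshold stands in for
    whatever comparison the paper actually uses.
    """
    return [
        doc_id
        for doc_id, vec in source_vecs.items()
        if cosine_similarity(suspicious_vec, vec) >= threshold
    ]


# Toy vectors for illustration only; these are not data from the paper.
sources = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
candidates = pre_detect([1.0, 0.0, 0.0], sources)
print(candidates)
```

Only the documents that survive this filter would then be passed to the more expensive pairwise model, which is one plausible way a pre-detection step could improve efficiency.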

Bao, W., Dong, J., Xu, Y., Yang, Y., & Qi, X. (2024). Exploring Attentive Siamese LSTM for Low-Resource Text Plagiarism Detection. Data Intelligence, 6(2), 488–503. https://doi.org/10.1162/dint_a_00242
