Abstract
Low-resource text plagiarism detection faces a significant challenge due to the limited availability of labeled data for training. This task requires the development of sophisticated algorithms capable of identifying similarities and differences in texts, particularly in the realm of semantic rewriting and translation-based plagiarism detection. In this paper, we present an enhanced attentive Siamese Long Short-Term Memory (LSTM) network designed for Tibetan-Chinese plagiarism detection. Our approach begins with the introduction of translation-based data augmentation, aimed at expanding the bilingual training dataset. Subsequently, we propose a pre-detection method leveraging abstract document vectors to enhance detection efficiency. Finally, we introduce an improved attentive Siamese LSTM network tailored for Tibetan-Chinese plagiarism detection. We conduct comprehensive experiments to showcase the effectiveness of our proposed plagiarism detection framework.
Author supplied keywords
Cite
CITATION STYLE
Bao, W., Dong, J., Xu, Y., Yang, Y., & Qi, X. (2024). Exploring Attentive Siamese LSTM for Low-Resource Text Plagiarism Detection. Data Intelligence, 6(2), 488–503. https://doi.org/10.1162/dint_a_00242
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.