Due to the availability of the technology stack for implementing state-of-the-art neural networks, fake news and fake information classification problems have attracted many researchers working on natural language processing, machine learning, and deep learning. Currently, most work on fake news detection is in English, which confines its usefulness outside the English-speaking population. As far as multilingual content is concerned, fake news classification in low-resource languages is challenging due to the lack of sufficiently large annotated corpora. In this work, we study and analyze the impact of different transformer-based models, namely multilingual BERT, XLM-RoBERTa, and MuRIL, on a dataset created (by translation) as part of this research on multilingual low-resource fake news classification. We conducted various experiments, including language-specific runs and comparisons across models, to assess each model's impact. We also release a multilingual dataset in Tamil and Malayalam, drawn from multiple domains, that could be useful for research in this direction. The datasets and code are available on GitHub (https://github.com/hariharanrl/Multilingual_Fake_News).
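As a minimal sketch of how the three named transformers could be loaded for such a classification task, the snippet below uses the standard Hugging Face checkpoint identifiers for multilingual BERT, XLM-RoBERTa, and MuRIL; the label count and all other details are illustrative assumptions, not taken from the paper.

```python
# Sketch: loading the three multilingual transformer checkpoints named in the
# abstract for sequence classification. Checkpoint strings are the standard
# Hugging Face identifiers; num_labels=2 (fake vs. real) is an assumption.

MODEL_CHECKPOINTS = {
    "mBERT": "bert-base-multilingual-cased",
    "XLM-RoBERTa": "xlm-roberta-base",
    "MuRIL": "google/muril-base-cased",
}


def build_classifier(model_key: str, num_labels: int = 2):
    """Return (tokenizer, model) for one checkpoint.

    Requires the `transformers` package and network access on first call,
    which is why the import is deferred into the function body.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    checkpoint = MODEL_CHECKPOINTS[model_key]
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model
```

Each checkpoint can then be fine-tuned on the Tamil or Malayalam split with an ordinary classification training loop; MuRIL in particular was pretrained on Indian languages, which is why it is a natural candidate here.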
Hariharan, R. I. L. A., & Anand Kumar, M. (2023). Impact of Transformers on Multilingual Fake News Detection for Tamil and Malayalam. In Communications in Computer and Information Science (Vol. 1802 CCIS, pp. 196–208). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-33231-9_13