Evaluation of Language Models on Romanian XQuAD and RoITD datasets

Abstract

Natural language processing (NLP) has become a vital component of a wide range of applications, including machine translation, information retrieval, and text classification. The development and evaluation of NLP models for various languages have received significant attention in recent years, but relatively little work has compared the performance of different language models on Romanian data. In particular, Romanian-specific language models have rarely been evaluated side by side with multilingual models. In this paper, we address this gap by evaluating eight NLP models on two Romanian datasets, XQuAD and RoITD. Our experiments show that bert-base-multilingual-cased and bert-base-multilingual-uncased perform best on both the XQuAD and RoITD tasks, while the RoBERT-small and DistilBERT models perform worst. We also discuss the implications of our findings and outline directions for future work in this area.
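
As a concrete illustration of the evaluation setup the abstract describes, the sketch below loads the Romanian split of XQuAD and runs a multilingual BERT checkpoint through an extractive question-answering pipeline using Hugging Face transformers and datasets. This is not the authors' published code: the model name is a stand-in, and in practice a checkpoint fine-tuned on a SQuAD-style dataset would be needed, since the plain bert-base-multilingual-cased model ships without a trained QA head.

    # Hedged sketch: extractive QA on Romanian XQuAD.
    # Assumes the Hugging Face `transformers` and `datasets` packages.
    from datasets import load_dataset
    from transformers import pipeline

    # Romanian split of XQuAD (1,190 question-context pairs).
    xquad_ro = load_dataset("xquad", "xquad.ro", split="validation")

    # Stand-in model name; the base checkpoint's QA head is untrained,
    # so a real evaluation would use a SQuAD-fine-tuned variant.
    qa = pipeline("question-answering", model="bert-base-multilingual-cased")

    sample = xquad_ro[0]
    prediction = qa(question=sample["question"], context=sample["context"])
    print(prediction["answer"], prediction["score"])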

Cite (APA)

Nicolae, D. C., Yadav, R. K., & Tufis, D. (2023). Evaluation of Language Models on Romanian XQuAD and RoITD datasets. International Journal of Computers, Communications and Control, 18(1). https://doi.org/10.15837/ijccc.2023.1.5111
