Detecting and Measuring Social Bias of Arabic Generative Models in the Context of Search and Recommendation


Abstract

Pre-training large language models on vast amounts of web-scraped text is a current trend in natural language processing. While the resulting models are capable of generating convincing text, they also reproduce harmful social biases. Several recent studies have demonstrated that societal biases have a significant impact on the outcome of information retrieval systems. Our study is among the recent studies aimed at developing methods for quantifying and mitigating bias in search results and applying them to retrieval and recommendation systems based on transformer-based language models. This paper explores expressions of bias in Arabic text generation. Analyses will be performed on samples produced by the generative model AraGPT2 (a GPT-2 fine-tuned for Arabic). An Arabic bias classifier (Regard Classifier) based on the transformer model AraBERT (a BERT fine-tuned for Arabic) will be used to capture the social bias of an Arabic-generated sentence or text. For the development of this classifier, a dataset will be crowd-sourced, cleaned, and independently annotated. AraGPT2 will be used to generate more biased descriptions from the standard prompts. Our Bias Detection Model (BDM) will be based on the combination of the two transformer models (AraGPT2-AraBERT). In addition to our quantitative evaluation study, we will also conduct a qualitative study to understand how our system compares with other approaches in which users try to find bias in Arabic texts generated by the AraGPT2 model. Our proposed model has achieved very encouraging results, reaching an accuracy of 81%.
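
The generate-then-classify pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names, the three-label regard scheme (negative, neutral, positive), and the gap metric are all assumptions introduced here. In practice `generate` would wrap a text generator such as AraGPT2 and `classify` would wrap an AraBERT-based regard classifier; both are passed in as plain callables so the sketch stays self-contained.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List

# Assumed label set for the regard classifier (hypothetical, not from the paper).
REGARD_LABELS = ("negative", "neutral", "positive")


def regard_distribution(
    texts: Iterable[str], classify: Callable[[str], str]
) -> Dict[str, float]:
    """Return the share of texts assigned to each regard label."""
    counts = Counter(classify(text) for text in texts)
    total = sum(counts.values())
    return {
        label: (counts[label] / total if total else 0.0)
        for label in REGARD_LABELS
    }


def measure_bias(
    prompts_by_group: Dict[str, List[str]],
    generate: Callable[[str], str],
    classify: Callable[[str], str],
    samples_per_prompt: int = 3,
) -> Dict[str, Dict[str, float]]:
    """Generate continuations per demographic group and score their regard."""
    report = {}
    for group, prompts in prompts_by_group.items():
        # Sample several continuations for every prompt of this group.
        texts = [generate(p) for p in prompts for _ in range(samples_per_prompt)]
        report[group] = regard_distribution(texts, classify)
    return report


def negative_regard_gap(report: Dict[str, Dict[str, float]]) -> float:
    """Illustrative bias score: spread of negative-regard rates across groups."""
    rates = [dist["negative"] for dist in report.values()]
    return max(rates) - min(rates)
```

Keeping the generator and classifier as injected callables means the same measurement loop can be exercised with stub functions before any real model is plugged in.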

Harrag, F., Mahdadi, C., & Ziad, A. N. (2023). Detecting and Measuring Social Bias of Arabic Generative Models in the Context of Search and Recommendation. In Communications in Computer and Information Science (Vol. 1840 CCIS, pp. 155–168). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-37249-0_13
