With the rapid increase of Arabic content on the web comes a growing need for short, accurate answers to queries. Machine question answering has emerged as an important field for progress in natural language processing. Machine learning now matches or surpasses human performance on some natural language processing and text-analysis tasks, especially when large amounts of data are available. This research makes two main contributions. First, we propose the Tawasul Arabic question similarity (TAQS) system, comprising four Arabic semantic question-similarity models built with deep learning techniques. Second, we curated and used an Arabic customer-service question-similarity dataset of 44,404 question-answer pairs, called 'Tawasul.' For TAQS, we first use transfer learning to combine contextualized bidirectional encoder representations from transformers (BERT) embeddings with a bidirectional long short-term memory (BiLSTM) network in two different ways. Specifically, we propose two architectures: the BERT contextual representation with BiLSTM (BERT-BiLSTM) and the hybrid transfer BERT contextual representation with BiLSTM (HT-BERT-BiLSTM), where the hybrid transfer representation combines two transfer learning techniques. Second, we fine-tuned two versions of bidirectional encoder representations from transformers for the Arabic language (AraBERT). The results show that HT-BERT-BiLSTM using Layer-12 features reaches an accuracy of 94.45%, while fine-tuning AraBERTv2 and AraBERTv0.2 achieves 93.10% and 93.90% accuracy, respectively, on the Tawasul dataset. Our proposed TAQS model surpassed the state-of-the-art BiLSTM with SkipGram by 43.19% in accuracy.
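The abstract's core architecture, feeding frozen BERT-layer token embeddings into a BiLSTM classifier, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, hidden size, and class count are hypothetical, and a random tensor stands in for real Layer-12 BERT outputs (which in practice would come from a pretrained Arabic BERT model).

```python
import torch
import torch.nn as nn

class BertBiLSTMClassifier(nn.Module):
    """Hypothetical sketch: a BiLSTM head over frozen BERT-style embeddings."""

    def __init__(self, embed_dim=768, hidden_dim=128, num_classes=2):
        super().__init__()
        # Bidirectional LSTM reads the token embeddings in both directions.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Forward and backward hidden states are concatenated (2 * hidden_dim).
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim), e.g. Layer-12 BERT features
        out, _ = self.bilstm(embeddings)
        # Classify from the final timestep's concatenated states.
        return self.classifier(out[:, -1, :])

# Stand-in for real BERT Layer-12 features: batch of 4, 32 tokens, 768 dims.
x = torch.randn(4, 32, 768)
logits = BertBiLSTMClassifier()(x)
print(logits.shape)  # torch.Size([4, 2])
```

In this framing, "transfer learning" means the BERT encoder is used only as a fixed feature extractor, and only the BiLSTM and linear head are trained on the question-similarity labels.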
Thuwaini, W. A., & Alhumoud, S. (2022). TAQS: An Arabic Question Similarity System Using Transfer Learning of BERT With BiLSTM. IEEE Access, 10, 91509–91523. https://doi.org/10.1109/ACCESS.2022.3198955