Detecting Duplicate Bi-Lingual Mash-up Question Pairs Using Siamese Multi-layer Perceptron Network

Abstract

Online question-and-answer (Q&A) platforms provide instant access to information, but as the volume of questions and answers grows, response times increase and answer quality suffers. Duplicate content further degrades the filtering mechanism. At the same time, bi-lingual mash-up, specifically the Anglicization of language (making a language English in sound, appearance, or character), is commonly observed on social media. This research puts forward a model for semantic matching of duplicate question pairs in which one question is in the source language (English) and the other is a mash-up (Hindi + English = Hinglish). In the proposed model, a language transformation step first translates the mash-up question into source-language text. A Siamese artificial neural network (multi-layer perceptron) then detects semantically similar question pairs, using the Manhattan distance as the similarity measure. The encoder vector representations and their distance are given as input to a logit (logistic regression) model for binary classification as duplicate or not duplicate. The model achieves an accuracy of 70.09%.
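The pipeline described in the abstract (shared encoder, Manhattan distance, logistic classifier over the encodings and their distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, tanh activation, random weight initialization, and all function names below are assumptions made for illustration, the inputs are assumed to be fixed-length numeric question vectors, and the translation and training steps are omitted.

```python
import math
import random


def make_mlp(sizes, seed=0):
    """Build random weights for an MLP encoder (hypothetical sizes and init)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def encode(layers, x):
    """Forward pass through the encoder; tanh is an assumed activation."""
    h = x
    for weights, biases in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h


def manhattan(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def duplicate_probability(layers, clf_w, clf_b, x1, x2):
    """Siamese step: the SAME encoder weights are applied to both questions.

    The two encodings and their Manhattan distance form the feature vector
    of a logistic (logit) classifier, as the abstract describes.
    """
    e1 = encode(layers, x1)
    e2 = encode(layers, x2)
    features = e1 + e2 + [manhattan(e1, e2)]
    z = sum(w * f for w, f in zip(clf_w, features)) + clf_b
    return 1.0 / (1.0 + math.exp(-z))


# Usage with made-up question vectors and untrained, hand-set classifier
# weights: only the distance feature is weighted (negatively), so a larger
# distance yields a lower duplicate probability.
encoder = make_mlp([4, 8, 3])
clf_w = [0.0] * 6 + [-1.0]
clf_b = 1.0
q_english = [0.2, 0.7, 0.1, 0.9]
q_translated = [0.8, 0.1, 0.6, 0.3]
p_same = duplicate_probability(encoder, clf_w, clf_b, q_english, q_english)
p_diff = duplicate_probability(encoder, clf_w, clf_b, q_english, q_translated)
is_duplicate = p_diff >= 0.5  # assumed decision threshold
```

Sharing one set of encoder weights across both inputs is what makes the network Siamese: identical questions map to identical encodings and therefore to a distance of zero. In a real system the encoder and classifier weights would be learned from labeled question pairs rather than set by hand.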

Cite

Rani, S., Kumar, A., & Kumar, N. (2021). Detecting Duplicate Bi-Lingual Mash-up Question Pairs Using Siamese Multi-layer Perceptron Network. In Advances in Intelligent Systems and Computing (Vol. 1175, pp. 329–338). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-5619-7_23
