Online question answering (Q&A) platforms provide quick access to information, but as the volume of questions and answers grows, response times increase and answer quality suffers. Duplicate content further degrades the filtering mechanism. At the same time, bilingual mash-up, specifically Anglicization of language (making a language English in sound, appearance, or character), is commonly observed on social media. This research puts forward a model for semantic matching of duplicate question pairs in which one question is in the source language (English) and the other is a mash-up (Hindi + English = Hinglish). In the proposed model, the mash-up question is first translated into source-language text; a Siamese artificial neural network (multi-layer perceptron) is then used to detect semantically similar question pairs, with the Manhattan distance as the similarity measure. The encoder vector representations and their distance are given as input to a logit (logistic regression) model for binary classification as duplicate or not duplicate. The model achieves an accuracy of 70.09%.
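The matching stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the tanh activation, the random untrained weights, and the helper names (make_mlp, encode, manhattan, duplicate_probability) are all assumptions made for illustration, and each question is assumed to be already translated to English and converted to a fixed-length numeric vector. Only the overall shape follows the abstract: one shared encoder applied to both questions, a Manhattan distance between the two encodings, and a logistic model over the encodings plus their distance.

```python
import math
import random


def make_mlp(sizes, seed=0):
    """Build a small MLP as a list of (weights, biases) layers.
    Weights are random and untrained (illustrative assumption)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def encode(layers, x):
    """Run one question vector through the shared encoder."""
    h = list(x)
    for weights, biases in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h


def manhattan(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def duplicate_probability(layers, x1, x2, clf_weights, clf_bias):
    """Siamese forward pass: the same encoder is applied to both inputs,
    then a logistic model scores [h1, h2, distance]."""
    h1 = encode(layers, x1)
    h2 = encode(layers, x2)
    features = h1 + h2 + [manhattan(h1, h2)]
    z = sum(w * f for w, f in zip(clf_weights, features)) + clf_bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical 4-dimensional question vectors and a 4 -> 3 -> 2 encoder.
encoder = make_mlp([4, 3, 2])
q_english = [0.2, 0.1, 0.9, 0.4]
q_translated = [0.2, 0.1, 0.8, 0.5]
# Classifier weights over [h1 (2 values), h2 (2 values), distance];
# hand-picked here so that a larger distance lowers the score.
clf_weights = [0.0, 0.0, 0.0, 0.0, -1.0]
p = duplicate_probability(encoder, q_english, q_translated, clf_weights, 0.0)
print(f"P(duplicate) = {p:.3f}")
```

In a trained system, the encoder weights and the classifier weights would be learned from labeled question pairs; the hand-picked values here only show how the pieces connect.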
Rani, S., Kumar, A., & Kumar, N. (2021). Detecting Duplicate Bi-Lingual Mash-up Question Pairs Using Siamese Multi-layer Perceptron Network. In Advances in Intelligent Systems and Computing (Vol. 1175, pp. 329–338). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-5619-7_23