Duplicate Question Identification (DQI) improves the processing efficiency and accuracy of large-scale community question answering and automatic QA systems. The purpose of the DQI task is to identify whether two paired questions are semantically equivalent. However, distinguishing synonyms and homonyms in paired questions remains challenging. Most previous work focuses on word-level or phrase-level semantic differences. We are the first to explore the asking emphasis of a question as a key factor in DQI. Asking emphasis bridges semantic equivalence between two questions. In this paper, we propose an attention model with multi-fusion asking emphasis (MFAE) for DQI. First, BERT is used to obtain dynamic pre-trained word embeddings. Then we obtain inter- and intra-asking emphasis by summing inter-attention and self-attention, respectively; the idea is that the more a word interacts with others, the more important the word is. Finally, we use eight-way combinations to generate multi-fusion asking emphasis and multi-fusion word representations. Experimental results demonstrate that our model achieves state-of-the-art performance on both the Quora Question Pairs and CQADupStack datasets. In addition, our model also improves results for the natural language inference task on the SNLI and MultiNLI datasets. The code is available at https://github.com/rzhangpku/MFAE.
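The idea of obtaining asking emphasis by summing attention can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it assumes plain dot-product attention with softmax normalization, and the function names `attention` and `emphasis` and the toy two-dimensional embeddings are hypothetical. A word's emphasis is taken here as the total attention weight it receives from the words attending to it.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys):
    """Dot-product attention weights: one softmax-normalized row per query word."""
    return [
        softmax([sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]

def emphasis(weights):
    """Sum each column: the total attention each key word receives."""
    return [sum(col) for col in zip(*weights)]

# Toy word embeddings for two questions (assumed values, not BERT outputs).
q1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q2 = [[1.0, 0.0], [0.5, 0.5]]

# Intra-asking emphasis: summed self-attention within question 1.
intra_q1 = emphasis(attention(q1, q1))
# Inter-asking emphasis: summed attention that words of question 2 pay to question 1.
inter_q1 = emphasis(attention(q2, q1))

print("intra:", intra_q1)
print("inter:", inter_q1)
```

In this toy input, the third word of the first question overlaps with both other words, so it collects the largest share of self-attention. That matches the stated intuition that a word interacting more with others is treated as more important.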
Zhang, R., Zhou, Q., Wu, B., Li, W., & Mo, T. (2020). What do questions exactly ask? MFAE: Duplicate question identification with multi-fusion asking emphasis. In Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020 (pp. 226–234). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611976236.26