A new question representation method is proposed for automated question matching over accumulated question-answer archives. The representation defines four kinds of question words for question analysis: question-type words, user-centered words, shareable-pattern words, and irrelevant words. These question words are further annotated with a semantic labeling ontology, which enriches the semantic representation and reduces word ambiguity. We tested the matching precision on 5,000 questions with respect to various generators, and the results demonstrated the stability of the method. We further compared the method against two baselines, cosine similarity and WordNet-based semantic similarity, on a standard TREC dataset containing 5,536 questions. The results showed that our method improved MRR by 8.6% and accuracy by 9.6% on average, indicating its effectiveness.
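For readers unfamiliar with the reported metrics, the following is a minimal sketch of how MRR (mean reciprocal rank) and accuracy could be computed for a question matcher. It is not the authors' evaluation code. The function names, the toy data, and the reading of "accuracy" as top-1 accuracy are assumptions made for illustration only.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], gold: Hashable) -> float:
    """Return 1/rank of the gold item in the ranked list, or 0.0 if absent."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], golds: Sequence[Hashable]
) -> float:
    """Average the reciprocal rank over all query questions."""
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds))
    return total / len(rankings)


def top1_accuracy(
    rankings: Sequence[Sequence[Hashable]], golds: Sequence[Hashable]
) -> float:
    """Fraction of queries whose top-ranked candidate is the gold match.

    Assumption: 'accuracy' is interpreted here as top-1 accuracy.
    """
    if not rankings:
        return 0.0
    hits = sum(1 for r, g in zip(rankings, golds) if r and r[0] == g)
    return hits / len(rankings)


# Hypothetical toy data: each inner list is a matcher's ranked archive
# question IDs for one query; golds holds the correct ID per query.
rankings = [["a", "b", "c"], ["b", "a"], ["c"]]
golds = ["a", "a", "x"]

mrr = mean_reciprocal_rank(rankings, golds)
acc = top1_accuracy(rankings, golds)
```

With this toy data the gold match sits at rank 1, rank 2, and nowhere, so the reciprocal ranks are 1, 0.5, and 0.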
CITATION STYLE
Hao, T., Qiu, X., & Jiang, S. (2015). Leveraging semantic labeling for question matching to facilitate question-answer archive reuse. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9225, pp. 65–75). Springer Verlag. https://doi.org/10.1007/978-3-319-22180-9_7