Automatic keyphrase extraction by bridging vocabulary gap

Abstract

Keyphrase extraction aims to select a set of terms from a document as a short summary of that document. Most methods extract keyphrases according to their statistical properties in the given document. Appropriate keyphrases, however, are not always statistically significant, and may not even appear in the given document. This creates a large vocabulary gap between a document and its keyphrases. In this paper, we consider that a document and its keyphrases both describe the same object but are written in two different languages. By regarding keyphrase extraction as a problem of translating from the language of documents to the language of keyphrases, we use word alignment models from statistical machine translation to learn translation probabilities between the words in documents and the words in keyphrases. Given a new document, we suggest keyphrases according to the translation model. The suggested keyphrases are not necessarily statistically frequent in the document, which indicates that our method is more flexible and reliable. Experiments on news articles demonstrate that our method outperforms existing unsupervised methods on precision, recall and F-measure. © 2011 Association for Computational Linguistics.
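The translation view described in the abstract can be illustrated with a small sketch. It assumes IBM Model 1 as the word alignment model and a term-frequency-weighted sum as the ranking score; the abstract names neither choice, so both are assumptions made here for illustration. The function names `train_ibm1` and `suggest` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def train_ibm1(pairs, iterations=10):
    """Estimate P(keyword | doc_word) with IBM Model 1 style EM.

    pairs: list of (doc_words, key_words) training examples.
    Returns a dict mapping (keyword, doc_word) to a probability.
    Assumption: IBM Model 1 stands in for "word alignment models".
    """
    key_vocab = {k for _, keys in pairs for k in keys}
    # Uniform initialisation over the keyword vocabulary.
    t = defaultdict(lambda: 1.0 / len(key_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected (keyword, doc_word) alignments
        total = defaultdict(float)  # expected alignments per doc_word
        for doc, keys in pairs:
            for k in keys:
                # E-step: distribute keyword k over the document words.
                z = sum(t[(k, d)] for d in doc)
                for d in doc:
                    c = t[(k, d)] / z
                    count[(k, d)] += c
                    total[d] += c
        # M-step: renormalise so that sum_k P(k | d) = 1 for each d.
        t = defaultdict(float,
                        {(k, d): c / total[d] for (k, d), c in count.items()})
    return t

def suggest(doc, t, top_n=3):
    """Rank keywords by sum_d P(k | d) * tf(d) / len(doc).

    Assumption: relative term frequency is used as the word weight.
    """
    tf = Counter(doc)
    candidates = {k for (k, _) in t}
    scores = {
        k: sum(t.get((k, d), 0.0) * c / len(doc) for d, c in tf.items())
        for k in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy training data (hypothetical): the keywords never occur in the documents.
pairs = [
    (["stock", "market", "fell"], ["finance"]),
    (["stock", "prices", "rose"], ["finance"]),
    (["team", "won", "match"], ["sports"]),
    (["team", "lost", "game"], ["sports"]),
]
model = train_ibm1(pairs)
print(suggest(["market", "prices"], model, top_n=1))  # prints ['finance']
```

In this toy run the suggested keyword "finance" does not occur in the new document at all, which is the vocabulary gap the abstract refers to: the suggestion comes from learned word-to-keyword translation probabilities, not from frequency counts in the document itself.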

Cite

Liu, Z., Chen, X., Zheng, Y., & Sun, M. (2011). Automatic keyphrase extraction by bridging vocabulary gap. In CoNLL 2011 - Fifteenth Conference on Computational Natural Language Learning, Proceedings of the Conference (pp. 135–143).
