A new domain independent keyphrase extraction system

Nirmala Pudota; Antonina Dattolo; Andrea Baruzzo; Carlo Tasso

Conference Proceedings

Communications in Computer and Information Science (2010), vol. 91 CCIS, pp. 67-78

DOI: 10.1007/978-3-642-15850-6_8

13 citations

27 readers

Abstract

In this paper we present a keyphrase extraction system that can extract potential phrases from a single document in an unsupervised, domain-independent way. We extract word n-grams from the input document. We incorporate linguistic knowledge (i.e., part-of-speech tags) and statistical information (i.e., frequency, position, lifespan) of each n-gram in defining candidate phrases and their respective feature sets. The proposed approach can be applied to any document; however, to assess the effectiveness of the system for digital libraries, we have carried out the evaluation on a set of scientific documents and compared our results with current keyphrase extraction systems. © 2010 Springer-Verlag.
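The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `candidate_features`, the small stopword list (used here as a stand-in for part-of-speech filtering), and the feature definitions are all assumptions made for illustration. In particular, the abstract does not define "lifespan", so the sketch takes it to be the distance between the first and last occurrence of an n-gram, normalized by document length.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; a stand-in for the part-of-speech
# filtering the abstract mentions (an assumption, not the authors' method).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to",
             "is", "are", "we", "that", "this", "with", "from", "can", "be"}


def candidate_features(text, max_n=3):
    """Return {ngram: {"frequency", "first_position", "lifespan"}}.

    Hypothetical helper: positions are token offsets of each occurrence,
    and both position-based features are normalized by the token count.
    """
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    total = len(tokens)
    positions = defaultdict(list)

    for n in range(1, max_n + 1):
        for i in range(total - n + 1):
            gram = tokens[i:i + n]
            # Skip n-grams that start or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            positions[" ".join(gram)].append(i)

    features = {}
    for phrase, pos in positions.items():
        features[phrase] = {
            "frequency": len(pos),                    # number of occurrences
            "first_position": pos[0] / total,         # 0.0 means document start
            "lifespan": (pos[-1] - pos[0]) / total,   # assumed definition
        }
    return features


if __name__ == "__main__":
    sample = ("Keyphrase extraction finds phrases. "
              "Keyphrase extraction is unsupervised.")
    for phrase, feats in sorted(candidate_features(sample).items()):
        print(phrase, feats)
```

A ranking step would then combine these features into a score per candidate; how the paper weights them is not stated in the abstract, so that step is left out here.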

Citation (APA)

Pudota, N., Dattolo, A., Baruzzo, A., & Tasso, C. (2010). A new domain independent keyphrase extraction system. In Communications in Computer and Information Science (Vol. 91 CCIS, pp. 67–78). https://doi.org/10.1007/978-3-642-15850-6_8
