Keyword extraction from a single document using centrality measures

Girish Keshav Palshikar

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4815 LNCS 503-510

DOI: 10.1007/978-3-540-77046-6_62

70 Citations

40 Readers

Abstract

Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two words, derived from the frequency of their co-occurrence in the document. We propose that central vertices in this graph are candidates as keywords. We model importance of a word in terms of its centrality in this graph. Using graph-theoretical notions of vertex centrality, we suggest several algorithms to extract keywords from the given document. We demonstrate the effectiveness of the proposed algorithms on real-life documents. © Springer-Verlag Berlin Heidelberg 2007.
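
The pipeline the abstract describes (word graph, co-occurrence-based dissimilarity on edges, central vertices as keyword candidates) can be illustrated with a minimal sketch. The abstract does not give the dissimilarity formula or name the centrality measures, so everything specific here is an assumption made for illustration: co-occurrence is counted per sentence, dissimilarity is taken as 1 / (co-occurrence count), centrality is closeness centrality over weighted shortest paths, and `STOPWORDS`, `build_graph`, `shortest_distances`, and `extract_keywords` are hypothetical names. It is not the paper's algorithm.

```python
import heapq
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "are", "to", "on"}


def build_graph(text):
    """Build an undirected word graph from a single document.

    Assumption: two words co-occur when they appear in the same sentence,
    and the edge label is the dissimilarity 1 / (co-occurrence count).
    """
    counts = defaultdict(int)
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted({w for w in re.findall(r"[a-z]+", sentence)
                        if w not in STOPWORDS})
        for u, v in combinations(words, 2):
            counts[(u, v)] += 1
    graph = defaultdict(dict)
    for (u, v), c in counts.items():
        graph[u][v] = graph[v][u] = 1.0 / c
    return dict(graph)


def shortest_distances(graph, source):
    """Dijkstra's algorithm: weighted distance from source to each reachable word."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def extract_keywords(text, k=3):
    """Rank words by closeness centrality and return the top k."""
    graph = build_graph(text)
    n = len(graph)
    scores = {}
    for word in graph:
        dist = shortest_distances(graph, word)
        reachable = len(dist) - 1
        total = sum(dist.values())
        # Closeness scaled by the fraction of reachable words, so that
        # small disconnected components are not favoured.
        scores[word] = (reachable * reachable) / ((n - 1) * total) if total > 0 else 0.0
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Calling `extract_keywords(document_text, k=5)` returns the five most central words under these assumptions. Swapping in a different dissimilarity or a different centrality notion (for example eccentricity) only changes `build_graph` or the scoring line.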

Cite

CITATION STYLE

APA

Palshikar, G. K. (2007). Keyword extraction from a single document using centrality measures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4815 LNCS, pp. 503–510). Springer Verlag. https://doi.org/10.1007/978-3-540-77046-6_62
