Exploiting extensible background knowledge for clustering-based automatic keyphrase extraction

Hassan Alrehamy; Coral Walker

Journal Article, Open Access

Soft Computing (2018) 22(21) 7041-7057

DOI: 10.1007/s00500-018-3414-4

Abstract

Keyphrases are single- or multi-word phrases that describe the essential content of a document. Keyphrase extraction methods often use an external knowledge source such as WordNet to obtain relation information about terms and thus improve their results, but a single knowledge source often has limited coverage. This is identified as the coverage limitation problem. In this paper, we introduce SemCluster, a clustering-based unsupervised keyphrase extraction method that addresses the coverage limitation problem with an extensible approach that integrates an internal ontology (i.e., WordNet) with other knowledge sources to gain wider background knowledge. SemCluster is evaluated against three unsupervised methods, TextRank, ExpandRank, and KeyCluster, using the F1-measure. The evaluation results demonstrate that SemCluster has better accuracy and computational efficiency and is more robust when dealing with documents from different domains.
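As an illustration of the F1-measure named in the abstract, the following is a minimal sketch of how extracted keyphrases might be scored against a gold-standard list. The abstract does not state the matching protocol, so exact matching after lowercasing and whitespace normalization is an assumption here, and the function name `keyphrase_f1` and the sample phrases are hypothetical.

```python
def keyphrase_f1(extracted, gold):
    """Return (precision, recall, F1) for one document.

    Assumption: a phrase counts as correct only on an exact match after
    lowercasing and collapsing whitespace. The paper may use a different
    matching rule (for example, stemming); the abstract does not say.
    """
    def normalize(phrase):
        return " ".join(phrase.lower().split())

    extracted_set = {normalize(p) for p in extracted}
    gold_set = {normalize(p) for p in gold}

    # Guard against empty inputs to avoid division by zero.
    if not extracted_set or not gold_set:
        return 0.0, 0.0, 0.0

    matched = len(extracted_set & gold_set)
    precision = matched / len(extracted_set)
    recall = matched / len(gold_set)
    if matched == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the paper's evaluation.
extracted = ["Keyphrase extraction", "WordNet", "clustering"]
gold = ["keyphrase extraction", "wordnet", "background knowledge", "ontology"]
precision, recall, f1 = keyphrase_f1(extracted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Using sets means duplicate phrases in either list are counted once, which is a design choice of this sketch and not something the abstract specifies.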

Citation (APA)

Alrehamy, H., & Walker, C. (2018). Exploiting extensible background knowledge for clustering-based automatic keyphrase extraction. Soft Computing, 22(21), 7041–7057. https://doi.org/10.1007/s00500-018-3414-4
