Exploring word embeddings in CRF-based keyphrase extraction from research papers

Abstract

Keyphrases associated with research papers provide an effective way to find useful information in large and growing scholarly digital collections. However, keyphrases are not always provided with the papers and must then be extracted from their content. In this paper, we explore keyphrase extraction formulated as sequence labeling and utilize the power of Conditional Random Fields (CRFs) in capturing label dependencies through a transition parameter matrix consisting of the transition probabilities from one label to the neighboring label. We aim to identify the features that, by themselves or in combination with others, perform well in extracting the descriptive keyphrases for a paper. Specifically, we explore word embeddings as features along with traditional, document-specific features for keyphrase extraction. Our results on five datasets of research papers show that word embeddings combined with document-specific features achieve high performance and outperform strong baselines for this task.
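
To make the role of the transition matrix concrete, here is a minimal sketch of Viterbi decoding over a two-label tagging scheme. It is an illustration only, not the authors' implementation: the label set (`KP`, `O`), the function name `viterbi`, and all scores are assumptions chosen for the example. In a trained CRF the per-token scores would come from learned weights over features such as word embeddings and document-specific features, and scores are treated here as additive (log-space) values rather than raw probabilities.

```python
from typing import Dict, List

# Hypothetical label set: "KP" = token inside a keyphrase, "O" = other.
LABELS = ["KP", "O"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score of label y at position t (from token features)
    transitions[a][b] : score of moving from label a to the next label b
    """
    if not emissions:
        return []
    # best[t][y] = best score of any label path ending in y at position t
    best = [{y: emissions[0][y] for y in LABELS}]
    back = []  # back[t-1][y] = best previous label when position t has label y
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Made-up scores for the tokens: conditional random fields are useful
emissions = [
    {"KP": 2.0, "O": 0.0},
    {"KP": 0.4, "O": 0.6},  # ambiguous on its own, leans slightly toward "O"
    {"KP": 2.0, "O": 0.0},
    {"KP": 0.0, "O": 2.0},
    {"KP": 0.0, "O": 2.0},
]
# Assumed transition scores that reward staying in the same label.
transitions = {
    "KP": {"KP": 1.0, "O": -0.5},
    "O": {"KP": -0.5, "O": 1.0},
}
# All-zero transitions reduce decoding to independent per-token choices.
no_transitions = {a: {b: 0.0 for b in LABELS} for a in LABELS}

with_deps = viterbi(emissions, transitions)
without_deps = viterbi(emissions, no_transitions)
print(with_deps)
print(without_deps)
```

With the assumed transition scores, the ambiguous middle token is pulled into the surrounding keyphrase; with all-zero transitions it is labeled on its own evidence alone, which splits the phrase. This is the kind of label dependency that the transition matrix lets a CRF capture.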

Cite

Patel, K., & Caragea, C. (2019). Exploring word embeddings in CRF-based keyphrase extraction from research papers. In K-CAP 2019 - Proceedings of the 10th International Conference on Knowledge Capture (pp. 37–44). Association for Computing Machinery, Inc. https://doi.org/10.1145/3360901.3364447
