Abstract
Keyphrases associated with research papers provide an effective way to find useful information in large and growing scholarly digital collections. However, keyphrases are not always provided with the papers and must instead be extracted from their content. In this paper, we explore keyphrase extraction formulated as sequence labeling and utilize the power of Conditional Random Fields to capture label dependencies through a transition parameter matrix consisting of the transition probabilities from one label to the neighboring label. We aim to identify the features that, by themselves or in combination with others, perform well in extracting the descriptive keyphrases for a paper. Specifically, we explore word embeddings as features along with traditional, document-specific features for keyphrase extraction. Our results on five datasets of research papers show that word embeddings combined with document-specific features achieve high performance and outperform strong baselines for this task.
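As a rough illustration of how a label transition matrix shapes sequence labeling, the sketch below runs Viterbi decoding over per-token label scores. It is not the authors' implementation: the two-label tag set, the function name `viterbi`, and all numeric scores are hypothetical, and the transition parameters are treated as additive (log-space) scores rather than raw probabilities.

```python
# Illustrative sketch only: labels and scores are invented, not from the paper.
LABELS = ["KP", "O"]  # hypothetical tag set: keyphrase token / other token


def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence.

    emissions[t][j]:   score of label j at token t (e.g. from token features).
    transitions[i][j]: score of moving from label i to the neighboring label j.
    """
    n_labels = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each label
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final label.
    best = max(range(n_labels), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backptrs):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path


# Three tokens; the middle token is ambiguous on its own.
emissions = [[2.0, 0.0], [0.4, 0.5], [2.0, 0.0]]
# Transitions that favor staying in the same label.
transitions = [[1.0, -1.0], [-1.0, 1.0]]
no_transitions = [[0.0, 0.0], [0.0, 0.0]]

with_deps = [LABELS[i] for i in viterbi(emissions, transitions)]
without_deps = [LABELS[i] for i in viterbi(emissions, no_transitions)]
print(with_deps, without_deps)
```

With the toy transition scores, the ambiguous middle token is pulled into the surrounding keyphrase, whereas decoding each token independently labels it as non-keyphrase. This is the kind of label dependency the transition matrix is meant to capture.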
Cite
Patel, K., & Caragea, C. (2019). Exploring word embeddings in CRF-based keyphrase extraction from research papers. In K-CAP 2019 - Proceedings of the 10th International Conference on Knowledge Capture (pp. 37–44). Association for Computing Machinery, Inc. https://doi.org/10.1145/3360901.3364447