Graph-based ranking algorithm such as TextRank shows a remarkable effect on keyword extraction. However, these algorithms build graphs only considering the lexical sequence of the documents. Hence, graphs generated by these algorithm can not reflect the semantic relationships between documents. In this paper, we demonstrate that there exists an information loss in the graph-building process from textual documents to graphs. These loss will lead to the misjudgment of the algorithm. In order to solve this problem, we propose a new approach called Topic-based TextRank. Different from the traditional algorithm, our approach takes the lexical meaning of the text unit (i.e. words and phrase) into account. The result of our experiments shows that our proposed algorithm can outperform the state-of-the-art algorithms.
CITATION STYLE
Yang, K., Chen, Z., Cai, Y., Huang, D. P., & Leung, H. fung. (2016). Improved automatic keyword extraction given more semantic knowledge. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9645, pp. 112–125). Springer Verlag. https://doi.org/10.1007/978-3-319-32055-7_10
Mendeley helps you to discover research relevant for your work.