Abstract
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods. © 2012 Springer-Verlag.
Cite
Joorabchi, A., & Mahdi, A. E. (2012). Automatic subject metadata generation for scientific documents using Wikipedia and genetic algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7603 LNAI, pp. 32–41). https://doi.org/10.1007/978-3-642-33876-2_6