Automatic subject metadata generation for scientific documents using Wikipedia and genetic algorithms

Abstract

Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods. © 2012 Springer-Verlag.
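
The two stages named in the abstract, thesaurus-based candidate selection and a genetically learned ranking model, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tiny `WIKI_TITLES` set stands in for a Wikipedia title index, the two features (occurrence count and position of first occurrence), the linear scoring function, and the genetic algorithm settings (truncation selection, uniform crossover, Gaussian mutation) are all hypothetical choices.

```python
import random

# Hypothetical stand-in for a Wikipedia title index (assumption, not the paper's data).
WIKI_TITLES = {"genetic algorithm", "machine learning", "wikipedia", "metadata", "thesaurus"}

def select_candidates(text, titles=WIKI_TITLES, max_len=3):
    """Return {phrase: (count, first_position)} for n-grams that match a known title."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    found = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in titles:
                # Keep the relative position of the first occurrence, bump the count.
                count, first = found.get(phrase, (0, i / len(words)))
                found[phrase] = (count + 1, first)
    return found

def top_k(features, weights, k):
    """Rank candidates by a linear score (illustrative) and keep the best k."""
    def score(f):
        return weights[0] * f[0] + weights[1] * (1.0 - f[1])
    ranked = sorted(features, key=lambda p: (-score(features[p]), p))
    return ranked[:k]

def fitness(weights, training, k=2):
    """Count how many top-k phrases agree with the human-assigned keyphrases."""
    return sum(len(set(top_k(feats, weights, k)) & gold) for feats, gold in training)

def evolve(training, generations=30, pop_size=20, seed=0):
    """Toy genetic algorithm that searches for scoring weights."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, training), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(2)] += rng.gauss(0, 0.1)      # Gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, training))
```

A training set for `evolve` is a list of `(features, gold)` pairs, where `features` is the dictionary returned by `select_candidates` and `gold` is a set of human-assigned keyphrases. Agreement with human annotators is used as the fitness signal here only because the abstract evaluates the method that way; the paper's actual objective and feature set may differ.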

Cite

Joorabchi, A., & Mahdi, A. E. (2012). Automatic subject metadata generation for scientific documents using Wikipedia and genetic algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7603 LNAI, pp. 32–41). https://doi.org/10.1007/978-3-642-33876-2_6