Automatic subject metadata generation for scientific documents using Wikipedia and genetic algorithms

Arash Joorabchi; Abdulhussain E. Mahdi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), Vol. 7603 LNAI, pp. 32–41

DOI: 10.1007/978-3-642-33876-2_6

Abstract

Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods. © 2012 Springer-Verlag.
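The two stages named in the abstract, thesaurus-based candidate selection and a genetic algorithm that learns a ranking model, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny `TITLES` set stands in for Wikipedia article titles, the two features (relative frequency and earliness of first occurrence), the linear scoring, and all genetic algorithm settings are assumptions chosen only for illustration.

```python
import random
import re

# Hypothetical stand-in for Wikipedia article titles; a real system
# would look candidates up in an index built from Wikipedia.
TITLES = {"genetic algorithm", "machine learning", "wikipedia",
          "metadata", "keyphrase"}

def candidates(text, max_len=3):
    """Return n-grams that match a title, mapped to two toy features."""
    words = re.findall(r"[a-z]+", text.lower())
    found = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = " ".join(words[i:i + n])
            if gram in TITLES:
                freq, first = found.get(gram, (0, i))
                found[gram] = (freq + 1, first)
    total = max(len(words), 1)
    # Features: relative frequency, earliness of first occurrence.
    return {g: (f / total, 1.0 - first / total)
            for g, (f, first) in found.items()}

def rank(feats, weights, k):
    """Top-k candidates by weighted feature sum (ties broken by name)."""
    def score(g):
        return sum(w * x for w, x in zip(weights, feats[g]))
    return sorted(feats, key=lambda g: (-score(g), g))[:k]

def fitness(weights, docs, k=2):
    """Mean fraction of gold keyphrases recovered in the top k."""
    total = 0.0
    for text, gold in docs:
        top = set(rank(candidates(text), weights, k))
        total += len(top & gold) / max(len(gold), 1)
    return total / len(docs)

def evolve(docs, pop_size=20, generations=30, seed=0):
    """Evolve a weight vector with elitism, crossover and mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1), rng.uniform(-1, 1)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, docs), reverse=True)
        parents = pop[:pop_size // 2]          # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            children.append([w + rng.gauss(0, 0.1) for w in child])  # mutation
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, docs))
```

Calling `evolve(docs)` on a list of `(text, gold_keyphrase_set)` pairs returns the learned weights, and `rank(candidates(text), weights, k)` then annotates a new document. The fitness function here rewards overlap with human-assigned keyphrases, which loosely mirrors the consistency-based evaluation the abstract mentions; the measure actually used in the paper may differ.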

Citation (APA)

Joorabchi, A., & Mahdi, A. E. (2012). Automatic subject metadata generation for scientific documents using Wikipedia and genetic algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7603 LNAI, pp. 32–41). https://doi.org/10.1007/978-3-642-33876-2_6
