Constructing a focused taxonomy from a document collection

Abstract

We describe a new method for constructing custom taxonomies from document collections. It involves identifying relevant concepts and entities in text; linking them to knowledge sources like Wikipedia, DBpedia, Freebase, and any supplied taxonomies from related domains; disambiguating conflicting concept mappings; and selecting semantic relations that best group them hierarchically. An RDF model supports interoperability of these steps, and also provides a flexible way of including existing NLP tools and further knowledge sources. From 2000 news articles we construct a custom taxonomy with 10,000 concepts and 12,700 relations, similar in structure to manually created counterparts. Evaluation by 15 human judges shows the precision to be 89% and 90% for concepts and relations respectively; recall was 75% with respect to a manually generated taxonomy for the same domain. © 2013 Springer-Verlag Berlin Heidelberg.
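
The steps named in the abstract (link terms to candidate concepts, disambiguate conflicting mappings, then keep the semantic relations that group the surviving concepts) can be illustrated with a toy sketch. This is not the authors' implementation: the concept identifiers, the scores, the "highest score wins" disambiguation rule, and the use of a `skos:broader` label for hierarchical relations are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical candidate mappings: term -> [(concept_id, score)].
# Identifiers and scores are invented for illustration.
CANDIDATES: Dict[str, List[Tuple[str, float]]] = {
    "apple": [("wiki:Apple_Inc.", 0.8), ("wiki:Apple_(fruit)", 0.3)],
    "iphone": [("wiki:IPhone", 0.9)],
    "smartphone": [("wiki:Smartphone", 0.7)],
}

# Hypothetical "broader" links as they might come from a knowledge source.
BROADER: Dict[str, List[str]] = {
    "wiki:IPhone": ["wiki:Smartphone", "wiki:Apple_Inc."],
    "wiki:Smartphone": ["wiki:Mobile_phone"],
    "wiki:Apple_(fruit)": ["wiki:Fruit"],
}

Triple = Tuple[str, str, str]


def disambiguate(candidates: Dict[str, List[Tuple[str, float]]]) -> Dict[str, str]:
    """Keep the highest-scoring concept for each term (assumed rule)."""
    return {
        term: max(options, key=lambda option: option[1])[0]
        for term, options in candidates.items()
    }


def select_relations(concepts: Iterable[str],
                     broader: Dict[str, List[str]]) -> List[Triple]:
    """Keep only relations whose two ends both occur in the collection."""
    kept = set(concepts)
    triples: List[Triple] = []
    for child in sorted(kept):
        for parent in broader.get(child, []):
            if parent in kept:
                triples.append((child, "skos:broader", parent))
    return triples


def build_taxonomy(candidates: Dict[str, List[Tuple[str, float]]],
                   broader: Dict[str, List[str]]) -> List[Triple]:
    """Run disambiguation, then relation selection, and return triples."""
    chosen = disambiguate(candidates)
    return select_relations(chosen.values(), broader)


if __name__ == "__main__":
    for triple in build_taxonomy(CANDIDATES, BROADER):
        print(triple)
```

Representing the result as subject-predicate-object triples mirrors the abstract's point that an RDF model lets the steps interoperate. Restricting relations to concepts found in the collection is what keeps the toy taxonomy focused.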

Cite

Medelyan, O., Manion, S., Broekstra, J., Divoli, A., Huang, A. L., & Witten, I. H. (2013). Constructing a focused taxonomy from a document collection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7882 LNCS, pp. 367–381). Springer Verlag. https://doi.org/10.1007/978-3-642-38288-8_25
