Building the “Plant glossary”—a controlled botanical vocabulary using terms extracted from the floras of North America and China

This article is free to access.

Abstract

Taxonomic descriptions contain valuable phenotypic data that are often not directly accessible for modern evolutionary, ecological, or biodiversity analyses. We describe a process for building a consensus-based controlled vocabulary from taxonomic descriptions of plants; the same process can be applied to build controlled vocabularies for other taxonomic groups. Controlled vocabularies are useful as lexicons for text mining algorithms, as a source of candidate terms for ontologies, and as guides to help future authors use domain vocabulary more appropriately and consistently. We extracted phenotype-describing phrases and terms from descriptions in 30 volumes of the Flora of North America and Flora of China and merged these with terms from the Categorical Glossary of the Flora of North America. Seven contributors assigned the terms to a set of categories until at least two categorizations of each term agreed. Term categorization makes the meaning of a term more explicit for subsequent users of the glossary. The resulting “Plant Glossary” (terms and categorization of terms) contains 9228 terms grouped in 53 categories. Differences in term categorization represented 49% of the categorization effort, and the many differences among individual classifications can be attributed to individual interpretation of terms and to the fluid nature of the descriptive language used in Floras. The difficulties experienced while classifying the terms allowed us to explore cases where the use of language can hinder the accurate and detailed annotation of taxonomic descriptions. The Plant Glossary represents a significant step towards creating and enriching formal ontologies for plant phenotypes, because the semantic phenomena found through this exercise are useful background information for building ontologies. The glossary has been used by new software to parse and annotate plant taxonomic descriptions, and over 6000 new terms are available for creating ontologies.
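
The agreement rule described in the abstract (a term is settled once two or more categorizations of it agree) can be illustrated with a minimal Python sketch. This is not the authors' software: the function names, the example terms and categories, and the decision to leave tied terms unresolved are all assumptions made for illustration.

```python
from collections import Counter

def consensus_category(votes, min_agreement=2):
    """Return the category that at least `min_agreement` votes agree on, else None.

    Assumption: if two categories tie for the top count, the term is
    treated as unresolved and returned as None.
    """
    if not votes:
        return None
    counts = Counter(votes)
    top_count = max(counts.values())
    leaders = [cat for cat, n in counts.items() if n == top_count]
    if top_count >= min_agreement and len(leaders) == 1:
        return leaders[0]
    return None

def build_glossary(votes_by_term, min_agreement=2):
    """Split terms into an agreed glossary and a list still needing review."""
    glossary, unresolved = {}, []
    for term, votes in votes_by_term.items():
        category = consensus_category(votes, min_agreement)
        if category is None:
            unresolved.append(term)
        else:
            glossary[term] = category
    return glossary, unresolved

# Hypothetical categorizations from individual contributors.
votes_by_term = {
    "ovate": ["shape", "shape"],
    "glabrous": ["pubescence", "texture"],
    "reddish": ["coloration", "coloration", "shape"],
}
glossary, unresolved = build_glossary(votes_by_term)
print(glossary)
print(unresolved)
```

In this sketch, terms left in `unresolved` would go back to contributors for another round of categorization, which mirrors the "until there was agreement" wording only loosely.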

Cite

Endara, L., Cole, H. A., Burleigh, J. G., Nagalingum, N. S., Macklin, J. A., Liu, J., … Cui, H. (2017). Building the “Plant glossary”—a controlled botanical vocabulary using terms extracted from the floras of North America and China. Taxon, 66(4), 953–966. https://doi.org/10.12705/664.9
