Unsupervised extraction of text segments from heterogeneous document collections

Hong Cui

Conference Proceedings (Open Access)

Proceedings of the ASIST Annual Meeting (2010) 47

DOI: 10.1002/meet.14504701355

Abstract

This paper describes a simple, unsupervised bootstrapping procedure that identifies morphological description segments in heterogeneous biodiversity document collections. In our project, the procedure is used to preprocess biodiversity literature for semantic annotation of morphological descriptions; it can also be used to crawl the Web for morphological descriptions for a biodiversity niche search engine.

Author supplied keywords

Biodiversity document collections
Morphological description
Segment information retrieval
Semantic annotation
Unsupervised machine learning

Cite

Cui, H. (2010). Unsupervised extraction of text segments from heterogeneous document collections. In Proceedings of the ASIST Annual Meeting (Vol. 47). https://doi.org/10.1002/meet.14504701355
