Dictionary-based biological concept extraction is still the state-of-the-art approach to large-scale biomedical literature annotation and indexing. Exact dictionary lookup is a very simple approach, but it always achieves low extraction recall because a biological term often has many variants and no dictionary can collect all of them. We propose a generic extraction approach, referred to as approximate dictionary lookup, to cope with term variations, and implement it as an extraction system called MaxMatcher. The basic idea of this approach is to capture the significant words of a particular concept rather than all of its words. The new approach dramatically improves extraction recall while maintaining precision. In a comparative study on the GENIA corpus, the new approach reaches 57% recall, while exact dictionary lookup achieves only 26%.
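The contrast between exact and approximate lookup can be illustrated with a minimal Python sketch. It is not the MaxMatcher implementation: the toy dictionary, the IDF-style word weights, the 0.7 threshold, and all function names are assumptions made for illustration, and the sketch treats the input as a bag of words, ignoring word order and proximity.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy dictionary: concept ID -> known term variants.
DICTIONARY = {
    "C1": ["nuclear factor kappa B", "NF kappa B"],
    "C2": ["interleukin 2 receptor alpha chain"],
    "C3": ["T cell"],
    "C4": ["T cell receptor"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_weights(dictionary):
    """Assumed significance measure: words that occur in fewer concepts weigh more."""
    concepts_per_word = defaultdict(set)
    for cid, variants in dictionary.items():
        for variant in variants:
            for word in tokenize(variant):
                concepts_per_word[word].add(cid)
    n = len(dictionary)
    return {w: math.log(1 + n / len(cids)) for w, cids in concepts_per_word.items()}


def exact_lookup(text, dictionary):
    """Return concepts with a variant that appears as a contiguous token sequence."""
    padded = " " + " ".join(tokenize(text)) + " "
    found = set()
    for cid, variants in dictionary.items():
        for variant in variants:
            if " " + " ".join(tokenize(variant)) + " " in padded:
                found.add(cid)
    return found


def approximate_lookup(text, dictionary, threshold=0.7):
    """Return concepts for which enough of a variant's word weight occurs in the text."""
    weights = word_weights(dictionary)
    text_words = set(tokenize(text))
    found = set()
    for cid, variants in dictionary.items():
        for variant in variants:
            words = set(tokenize(variant))
            total = sum(weights[w] for w in words)
            matched = sum(weights[w] for w in words if w in text_words)
            if total > 0 and matched / total >= threshold:
                found.add(cid)
    return found


if __name__ == "__main__":
    phrase = "the alpha chain of the interleukin 2 receptor"
    print(exact_lookup(phrase, DICTIONARY))
    print(approximate_lookup(phrase, DICTIONARY))
```

For the reordered phrase in the example, exact lookup finds nothing because no dictionary variant appears verbatim, while the weighted lookup returns the concept `C2`. A lower threshold would raise recall further at the cost of precision, which is the trade-off the abstract describes.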
CITATION STYLE
Zhou, X., Zhang, X., & Hu, X. (2006). MaxMatcher: Biological Concept Extraction Using Approximate Dictionary Lookup (pp. 1145–1149). https://doi.org/10.1007/978-3-540-36668-3_150