We ran both Brill's rule-based tagger and TNT, a statistical tagger, on a medical text corpus, using a default German newspaper-language model. Supplied with limited lexicon resources, TNT outperforms the Brill tagger and reaches state-of-the-art performance (close to 97% accuracy). We then trained TNT on a large annotated medical text corpus, using a slightly extended tagset that captures certain particularities of medical language, and achieved 98% tagging accuracy. Hence, statistical off-the-shelf POS taggers can not only be reused immediately for medical NLP but also achieve, when trained on medical corpora, a higher level of performance than for the newspaper genre. © Springer-Verlag Berlin Heidelberg 2004.
CITATION STYLE
Hahn, U., & Wermter, J. (2004). Tagging medical documents with high accuracy. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3157, pp. 852–861). Springer Verlag. https://doi.org/10.1007/978-3-540-28633-2_90