MorphoSaurus in ImageCLEF 2006: The effect of subwords on biomedical IR

Abstract

In the 2006 ImageCLEF Medical Image Retrieval task, we evaluate the effects of deep morphological analysis for mono- and crosslingual document retrieval in the biomedical domain. The morphological analysis is based on the MORPHOSAURUS system, in which subwords are introduced as morphologically meaningful word units. Subwords are organized in language-specific lexica that were generated partly manually and partly automatically and currently cover six European languages. They are linked together in a multilingual thesaurus. The use of subwords instead of full words significantly reduces the number of lexical entries that are needed to sufficiently cover a specific language and domain. A further benefit of the approach is its independence from the underlying retrieval system. We combined MORPHOSAURUS with the open-source search engine Lucene and achieved precision gains of up to 25% over the baseline for a monolingual setting and promising results in a multilingual scenario. © Springer-Verlag Berlin Heidelberg 2007.
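The idea of subword lexica linked through a multilingual thesaurus can be illustrated with a minimal sketch. This is not the MORPHOSAURUS implementation: the lexicon entries, the class identifiers such as `#stomach`, the function names, and the greedy longest-match segmentation are all assumptions made up for illustration. It only shows how words from two languages might be reduced to the same language-independent identifiers before indexing.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexica (invented for this sketch): each subword maps to a
# language-independent class identifier, or to None when it carries no
# meaning of its own (linking elements, suffixes).
LEXICA: Dict[str, Dict[str, Optional[str]]] = {
    "en": {"gastr": "#stomach", "o": None, "enter": "#intestine", "itis": "#inflamm"},
    "de": {"magen": "#stomach", "darm": "#intestine", "entzuend": "#inflamm", "ung": None},
}


def segment(word: str, lexicon: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Split a word into lexicon subwords by greedy longest match.

    Returns None if the word cannot be fully covered. Greedy matching is an
    assumption of this sketch and can miss segmentations that backtracking
    would find.
    """
    word = word.lower()
    parts: List[str] = []
    pos = 0
    while pos < len(word):
        # Try the longest candidate starting at pos first.
        for end in range(len(word), pos, -1):
            if word[pos:end] in lexicon:
                parts.append(word[pos:end])
                pos = end
                break
        else:
            return None  # no subword matches at this position
    return parts


def to_class_ids(word: str, lang: str) -> List[str]:
    """Map a word to interlingual identifiers; keep unknown words unchanged."""
    lexicon = LEXICA[lang]
    parts = segment(word, lexicon)
    if parts is None:
        return [word.lower()]
    # Drop subwords that have no identifier of their own.
    return [lexicon[p] for p in parts if lexicon[p] is not None]


if __name__ == "__main__":
    print(to_class_ids("gastroenteritis", "en"))
    print(to_class_ids("Magendarmentzuendung", "de"))
```

Because both words reduce to the same identifier sequence in this toy setup, an index built over identifiers rather than surface words could match a query in one language against documents in another. The abstract notes that the approach does not depend on the underlying retrieval system.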

Cite

Daumke, P., Paetzold, J., & Mark, K. (2007). MorphoSaurus in ImageCLEF 2006: The effect of subwords on biomedical IR. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4730 LNCS, pp. 652–659). Springer Verlag. https://doi.org/10.1007/978-3-540-74999-8_80
