Hybrid text segmentation for Hungarian clinical records

György Orosz; Attila Novák; Gábor Prószéky

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013), Vol. 8265 LNAI (Part 1), pp. 306–317

DOI: 10.1007/978-3-642-45114-0_25

Abstract

Clinical documents are becoming widely available to researchers who aim to develop resources and tools that may help clinicians in their work. While several approaches exist for processing English medical text, there are only a few for other languages. Moreover, word and sentence segmentation are commonly treated as simple engineering issues. In this study, we describe the difficulties that arise when segmenting Hungarian clinical records and present a complex method that produces normalized and segmented text. Our approach is a hybrid combination of a rule-based and an unsupervised statistical solution. The presented system is compared with other available and commonly used algorithms. These fail to segment clinical text (all of them reach F-scores below 75%), while our method scores above 90%. This means that, of the tools compared, only the hybrid tool described in this study can be used for the segmentation of Hungarian clinical texts in practical applications. © Springer-Verlag 2013.
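The F-scores quoted in the abstract can be read as the harmonic mean of precision and recall over segment boundaries. The abstract does not say how boundaries were matched, so the following is only a minimal sketch that assumes exact matching of boundary character offsets; the function name `boundary_f_score` and the sample offsets are hypothetical and do not come from the paper.

```python
def boundary_f_score(gold, predicted):
    """Return (precision, recall, F1) for two collections of boundary offsets.

    Assumption: a predicted boundary counts as correct only if its
    character offset exactly matches a gold-standard boundary.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative offsets only, not data from the paper.
gold_boundaries = [10, 25, 40, 58]
predicted_boundaries = [10, 25, 47]
precision, recall, f1 = boundary_f_score(gold_boundaries, predicted_boundaries)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this reading, a tool that misses many boundaries or inserts many spurious ones is penalized on recall or precision respectively, and the F-score reflects both.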

Cite

Orosz, G., Novák, A., & Prószéky, G. (2013). Hybrid text segmentation for Hungarian clinical records. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8265 LNAI, pp. 306–317). https://doi.org/10.1007/978-3-642-45114-0_25
