The role of feature selection in text mining in the process of discovering missing clinical annotations – Case study

Abstract

The vocabulary doctors use to describe the results of medical procedures changes along with new standards. Text data, which a medical professional understands immediately, is difficult to use in large-scale analysis. Extracting the data relevant to a given case, e.g. the Bethesda class, means taking on the challenge of normalizing free-form text and all the grammatical forms associated with it. This is particularly difficult in Polish, where words change their form significantly according to their function in the sentence. We found common black-box text mining methods inaccurate for this purpose. Here we describe a word-frequency-based method for annotating text data for Bethesda class extraction. We compare it with an algorithm based on the C4.5 decision tree. We show how important the choice of method and range of features is for avoiding conflicting classifications. The proposed algorithms made it possible to avoid the limitations of a rule base.

Cite

Płaczek, A., Płuciennik, A., Pach, M., Jarząb, M., & Mrozek, D. (2019). The role of feature selection in text mining in the process of discovering missing clinical annotations – Case study. In Communications in Computer and Information Science (Vol. 1018, pp. 248–262). Springer Verlag. https://doi.org/10.1007/978-3-030-19093-4_19