Extraction of structured data from unstructured medical records using text data mining technologies: Process automation

Abstract

The paper discusses technologies for processing text-based medical data stored in Microsoft Word format. The goal of such processing is to mine the text for new, potentially useful knowledge that can later be used to study various diseases and to develop a personalized approach to diagnosis and treatment. During the study, 3244 depersonalized medical records of children and adolescents in Altai Krai suffering from diabetes mellitus were processed. The records store information in both structured and unstructured forms. Most of the valuable data, such as the dynamics of the disease course, patient complaints, and the patient's life history, is recorded in natural language. Text medical records are difficult to process because they contain a great number of abbreviations, synonyms and misprints, which makes it impossible to use a unified template. The study therefore aims to minimize information loss during knowledge extraction by applying various text data mining methods. The practical outcome of the study is a database containing a large amount of valuable information on diabetes mellitus, the various types of its clinical course, and its complications. The obtained data will be used further to build data mining models for diagnosing and predicting the disease and its complications. To reach the goal of the research, we used the PostgreSQL DBMS and modern linguistically oriented software for the Python programming language, namely the libraries python-docx, natasha, and Natural Language Toolkit (NLTK).
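To illustrate why abbreviations and misprints rule out a single fixed template, the following is a minimal sketch of token normalization before field extraction. It is not the authors' pipeline: it uses only the Python standard library rather than python-docx, natasha or NLTK, and the abbreviation table, the vocabulary, the `Field: value.` record layout, the English sample text and the similarity cutoff of 0.8 are all assumptions made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical abbreviation table; a real one would be built from the records.
ABBREVIATIONS = {
    "dm": "diabetes mellitus",
    "t1dm": "type 1 diabetes mellitus",
}

# Hypothetical controlled vocabulary of field names and symptom terms.
VOCABULARY = ["complaints", "diagnosis", "thirst", "weakness", "polyuria"]

# Assumed record layout: "Field: free text up to the next period or newline".
FIELD_PATTERN = re.compile(r"(?P<name>[^\W\d_]+)\s*:\s*(?P<value>[^.\n]*)")


def normalize_token(token, cutoff=0.8):
    """Expand a known abbreviation or map a misprint to the closest vocabulary term."""
    lowered = token.lower()
    if lowered in ABBREVIATIONS:
        return ABBREVIATIONS[lowered]
    if lowered in VOCABULARY:
        return lowered
    # Fuzzy match by similarity ratio; the cutoff value is an assumption.
    matches = get_close_matches(lowered, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else lowered


def normalize_text(text):
    """Split text into alphanumeric tokens and normalize each one."""
    tokens = re.findall(r"[^\W_]+", text)
    return " ".join(normalize_token(token) for token in tokens)


def extract_record(text):
    """Turn 'Field: value.' fragments into a dict of normalized fields."""
    record = {}
    for match in FIELD_PATTERN.finditer(text):
        name = normalize_token(match.group("name"))
        record[name] = normalize_text(match.group("value"))
    return record


if __name__ == "__main__":
    sample = "Complaints: thrist, weaknes. Diagnosis: T1DM."
    print(extract_record(sample))
```

In this sketch the misprints "thrist" and "weaknes" are mapped to vocabulary terms and "T1DM" is expanded from the abbreviation table, so the resulting dictionary could be written to a database row. A fuzzy cutoff that is too low would merge unrelated terms, so the threshold would need tuning against real records.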

Cite

Moskalev, I. V., Krotova, O. S., Khvorova, L. A., & Bobkova, D. G. (2020). Extraction of structured data from unstructured medical records using text data mining technologies: Process automation. In Journal of Physics: Conference Series (Vol. 1615). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1615/1/012031