CASCADENET: An LSTM based deep learning model for automated ICD-10 coding

Abstract

This paper proposes a cascading hierarchical architecture using LSTM for automatically mapping clinical documents to ICD-10 codes. Training a robust classifier becomes increasingly difficult as the number of classes grows (there are over 93k ICD-10 codes). Together with the variance in length, structure and context of the text data and the lack of training data, this makes the task one of the hardest in Machine Learning (ML) and Natural Language Processing (NLP). This work evaluates several methods on the task: basic techniques such as TF-IDF and inverted indexing using concept aggregation based on exhaustive Unified Medical Language System (UMLS) knowledge sources, and more advanced methods such as an SVM trained on a bag-of-words model and a CNN and an LSTM trained on distributed word embeddings. The effect of breaking the problem down into a hierarchy is also explored. The data is an aggregate of ICD-10 long descriptions and anonymised, annotated training data provided by a few private hospitals in India. The study finds that the hierarchical LSTM network outperforms the other methods on the held-out test data in terms of accuracy as well as micro- and macro-averaged precision and recall.
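
The abstract reports micro- and macro-averaged precision and recall, and the difference between the two matters when there are many rare classes, as with ICD-10 codes. The following is a minimal sketch of how these averages are commonly computed for single-label predictions. It is not the authors' evaluation code: the function name `precision_recall` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Micro- and macro-averaged precision/recall for single-label predictions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))

    def ratio(num, den):
        # Treat an undefined ratio (0/0) as 0.0.
        return num / den if den else 0.0

    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    return {
        # Micro: pool the counts over all classes, then divide once.
        "micro_precision": ratio(tp_sum, tp_sum + fp_sum),
        "micro_recall": ratio(tp_sum, tp_sum + fn_sum),
        # Macro: compute the score per class, then take the unweighted mean.
        "macro_precision": ratio(
            sum(ratio(tp[c], tp[c] + fp[c]) for c in labels), len(labels)
        ),
        "macro_recall": ratio(
            sum(ratio(tp[c], tp[c] + fn[c]) for c in labels), len(labels)
        ),
    }


# Toy data (illustrative codes only, not results from the paper).
y_true = ["E11.9", "E11.9", "E11.9", "I10", "J45.909"]
y_pred = ["E11.9", "E11.9", "I10", "I10", "E11.9"]
scores = precision_recall(y_true, y_pred)
print(scores)
```

In this toy example the micro averages both equal the accuracy (3 of 5 correct), while the macro averages are pulled down by the class that is never predicted correctly. Macro averaging weights every class equally, so it is the more sensitive of the two to poor performance on rare codes.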

Cite

Azam, S. S., Raju, M., Pagidimarri, V., & Kasivajjala, V. C. (2020). CASCADENET: An LSTM based deep learning model for automated ICD-10 coding. In Lecture Notes in Networks and Systems (Vol. 70, pp. 55–74). Springer. https://doi.org/10.1007/978-3-030-12385-7_6
