Rare Codes Count: Mining Inter-code Relations for Long-tail Clinical Text Classification

Abstract

Multi-label clinical text classification, such as automatic ICD coding, has long been a challenging task in Natural Language Processing because of its long, domain-specific documents and the long-tail distribution over a large label set. Existing methods adopt different model architectures to encode the clinical notes. However, without exploiting the useful connections between labels, these models show a large performance gap between rare and frequent codes. In this work, we propose a novel method that mines the helpful relations between different codes via a relation-enhanced code encoder to improve rare-code performance. Starting from simple code descriptions, the model achieves comparable or even better performance than models that rely on heavy external knowledge. We evaluate the proposed method on MIMIC-III, a common dataset in the medical domain, where it outperforms previous state-of-the-art models on both overall metrics and rare-code performance. Moreover, the interpretation results further support the effectiveness of our method. Our code is publicly available.

Cite

Chen, J., Li, X., Xi, J., Yu, L., & Xiong, H. (2023). Rare Codes Count: Mining Inter-code Relations for Long-tail Clinical Text Classification. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 403–413). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.clinicalnlp-1.43
