Abstract
Radiology is an important assessment tool for patients with pulmonary disease. Radiology images are usually accompanied by a written report from a radiologist, which is passed on to the referring physicians. These reports are written in natural language, and their structure can vary with the language used. In our study, the radiology reports were collected from an Indonesian hospital and written in Bahasa Indonesia. We performed automatic text classification to sort the information in the radiology reports into two classes, COVID-19 and non-COVID-19. To find the best model, we evaluated several embedding techniques available for Bahasa Indonesia and five Machine Learning (ML) models, namely (1) XGBoost, (2) fastText, (3) LSTM, (4) Bi-LSTM and (5) IndoBERT. The results show that IndoBERT outperformed the others with an accuracy of 98%. In terms of training speed, the shallow neural network architecture implemented with the fastText library trained the model in under one second and still achieved a reasonably good accuracy of 86%.
Cite
Qomariyah, N. N., Araminta, A. S., Reynaldi, R., Senjaya, M., Asri, S. D. A., & Kazakov, D. (2022). NLP Text Classification for COVID-19 Automatic Detection from Radiology Report in Indonesian Language. In 2022 5th International Seminar on Research of Information Technology and Intelligent Systems, ISRITI 2022 (pp. 565–569). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISRITI56927.2022.10053077