Classification and Explanation of Iron Deficiency Anemia from Complete Blood Count Data Using Machine Learning

Siddartha Pullakhandam; Susan McRoy

Journal Article, Open Access

BioMedInformatics (2024) 4(1) 661-672

DOI: 10.3390/biomedinformatics4010036

Abstract

Background: Currently, discriminating Iron Deficiency Anemia (IDA) from other anemias requires an expensive test (serum ferritin). Complete Blood Count (CBC) tests are less costly and more widely available. Machine learning models have not yet been applied to discriminating IDA, although they perform well on similar tasks. Methods: We built multiple machine learning models to classify IDA from CBC data using a US NHANES dataset of over 19,000 instances, and calculated accuracy, precision, recall, and the area under the precision-recall curve (PR AUC). We validated the results by applying the same model to an unseen dataset from Kenya. We calculated ranked feature importance to explain the global behavior of the model. Results: Our model classifies IDA with a PR AUC of 0.87 and recall/sensitivity of 0.98 and 0.89 for the original dataset and an unseen Kenya dataset, respectively. The explanations indicate that low blood hemoglobin level, higher age, and higher Red Blood Cell distribution width were the most critical features. We also found that optimization made only minor changes to the explanations and that the features used remained consistent with professional practice. Conclusions: The overall high performance and consistency of the results suggest that the approach would be acceptable to health professionals and would support enhancements to current automated CBC analyzers.
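The evaluation described in the Methods (accuracy, precision, recall, PR AUC, and a ranked global feature importance) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the data are synthetic rather than NHANES, the three feature names and the labeling rule are hypothetical, and a random forest with impurity-based importance stands in for whichever models and explanation method the authors actually used. PR AUC is approximated with scikit-learn's average precision.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for CBC data (NOT the NHANES dataset from the paper).
rng = np.random.default_rng(0)
n = 2000
feature_names = ["hemoglobin", "rdw", "age"]  # hypothetical feature subset
hemoglobin = rng.normal(13.5, 1.5, n)  # g/dL
rdw = rng.normal(13.0, 1.2, n)         # percent
age = rng.uniform(18, 80, n)           # years
X = np.column_stack([hemoglobin, rdw, age])

# Hypothetical labeling rule: lower hemoglobin and higher RDW raise the
# chance of a positive (IDA) label. This is made up for the sketch.
score = -1.2 * (hemoglobin - 13.5) + 0.8 * (rdw - 13.0) + rng.normal(0, 0.5, n)
y = (score > 1.5).astype(int)

# Hold out a test split, keeping the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed classifier; the abstract does not name the model family.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
pr_auc = average_precision_score(y_test, y_prob)  # summarizes the PR curve

# Global explanation: features ranked by impurity-based importance.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} PR AUC={pr_auc:.3f}")
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

External validation, as with the Kenya dataset, would correspond to calling `model.predict` on a separately collected feature matrix without refitting. PR AUC is a reasonable headline metric when positives are the minority class, since it is not inflated by the large number of true negatives.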

Citation (APA)

Pullakhandam, S., & McRoy, S. (2024). Classification and Explanation of Iron Deficiency Anemia from Complete Blood Count Data Using Machine Learning. BioMedInformatics, 4(1), 661–672. https://doi.org/10.3390/biomedinformatics4010036
