Performance of GPT-4 and GPT-3.5 in generating accurate and comprehensive diagnoses across medical subspecialties

Dik Wai Anderson Luk; Whitney Chin Tung Ip; Yat Fung Shea

Journal Article, Open Access

Journal of the Chinese Medical Association (2024) 87(3) 259-260

DOI: 10.1097/JCMA.0000000000001064

Abstract

Artificial intelligence has shown promising potential for diagnosing complex medical cases, with Generative Pre-Trained Transformer 4 (GPT-4) being the most recent advancement in this field. This study evaluated the diagnostic performance of GPT-4 in comparison with that of its predecessor, GPT-3.5, using 81 complex medical case records from the New England Journal of Medicine. The cases were categorized as cognitive impairment, infectious disease, rheumatology, or drug reactions. GPT-4 achieved a primary diagnostic accuracy of 38.3%, which rose to 71.6% when its differential diagnoses were included. In 84.0% of cases, the primary diagnosis could be reached by conducting the investigations suggested by GPT-4. GPT-4 outperformed GPT-3.5 in all subspecialties except drug reactions. GPT-4 performed best in infectious diseases and drug reactions and worst in cases of cognitive impairment. These findings indicate that GPT-4 can provide reasonably accurate diagnoses, comprehensive differential diagnoses, and appropriate investigations, although its performance varies across subspecialties.

