Abstract
Background and Objectives: The treatment of central nervous system (CNS) tumors is complex and resource-intensive, with higher mortality in underserved regions. Large language models (LLMs) show promise in medical support, but their real-world performance in CNS tumor outpatient care remains unclear. This study assesses the diagnostic and treatment capabilities of LLMs in bilingual clinical settings. Methods: This retrospective study evaluated three LLMs (ChatGPT-4o, DeepSeek-R1, and Doubao) in assisting neuro-oncology outpatient decision-making within bilingual (Chinese/English) clinical environments. A total of 338 outpatient cases were included, and each model was assigned three clinical tasks: differential diagnosis, main diagnosis, and treatment advice. Model outputs were compared against assessments by experienced neurosurgeons using McNemar tests, with significance set at P < 0.05. Results: ChatGPT-4o and DeepSeek-R1 achieved over 90% accuracy in differential diagnosis, with no significant difference from doctors (P > 0.05), while Doubao performed significantly worse (Chinese: P = 0.02; English: P = 0.01). In main diagnosis, ChatGPT-4o and DeepSeek-R1 again showed no significant deviation from doctors' performance (P > 0.05), whereas Doubao underperformed (Chinese: P = 0.019; English: P = 0.011). For treatment recommendations, all models showed reduced accuracy (ChatGPT-4o: 80.5%; DeepSeek-R1: 79%; Doubao: 71.3%), significantly lower than that of doctors in both Chinese and English (P < 0.05). No performance difference was observed between Chinese and English cases. Conclusion: LLMs show strong potential in preliminary diagnosis and decision support for CNS tumors, and their cross-lingual adaptability underscores their clinical feasibility.
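For readers unfamiliar with the McNemar test named in the Methods, the following is a minimal sketch of how a paired model-versus-doctor comparison could be computed. It is an illustration only: the abstract does not say whether the exact (binomial) or chi-squared variant was used, so the exact two-sided form is assumed here, and the function names `discordant_counts` and `mcnemar_exact` as well as the example counts are hypothetical, not taken from the study.

```python
from math import comb


def discordant_counts(model_correct, doctor_correct):
    """Count discordant pairs from two equal-length lists of booleans.

    b = cases the doctor got right and the model got wrong
    c = cases the model got right and the doctor got wrong
    Concordant pairs (both right or both wrong) do not enter the test.
    """
    pairs = list(zip(model_correct, doctor_correct))
    b = sum(1 for m, d in pairs if d and not m)
    c = sum(1 for m, d in pairs if m and not d)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value (assumed variant, see lead-in).

    Under the null hypothesis, each discordant pair is equally likely
    to fall in b or c, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts for illustration; not data from the study.
    p = mcnemar_exact(b=9, c=1)
    print(f"p = {p:.4f}", "significant" if p < 0.05 else "not significant")
```

Because only discordant pairs contribute, two raters with similar overall accuracy can still differ significantly if their errors fall on different cases.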
Citation
Pan, Y., Tian, S., Guo, J., Cai, H., Wan, J., & Fang, C. (2025). Clinical feasibility of AI Doctors: Evaluating the replacement potential of large language models in outpatient settings for central nervous system tumors. International Journal of Medical Informatics, 203. https://doi.org/10.1016/j.ijmedinf.2025.106013