Real-world application of large language models for automated TNM staging using unstructured gynecologic oncology reports

This article is free to access.

Abstract

Manual data entry in cancer registries is both time-consuming and prone to error. Although large language models (LLMs) offer promising solutions, prior studies have frequently relied on preprocessed datasets or required complex fine-tuning, limiting their applicability in clinical settings. Here, we assessed the performance of out-of-the-box LLMs on TNM classification tasks using only prompt engineering, without data anonymization or model fine-tuning. We identified manual registry error rates of 5.5–17.0% in a real-world gynecologic cancer registry. Both a cloud-based LLM (Gemini 1.5; T- and N-stage accuracy: 0.994 and 0.993, respectively) and the top-performing local model (Qwen2.5 72B; T- and N-stage accuracy: 0.971 and 0.923, respectively) outperformed existing manual entries in extracting pathological T and N classifications. These models also achieved accuracies of 0.909 and 0.895 in clinical M classification, respectively. Our approach reflects real-world clinical workflows and offers a practical solution for enhancing data integrity in clinical registries using LLMs.

Cite

Ishida, K., Murakami, R., Yamanoi, K., Hamada, K., Hasebe, K., Sakurai, A., … Mandai, M. (2025). Real-world application of large language models for automated TNM staging using unstructured gynecologic oncology reports. npj Precision Oncology, 9(1). https://doi.org/10.1038/s41698-025-01157-4
