Improving Large Language Model Applications in the Medical and Nursing Domains With Retrieval-Augmented Generation: Scoping Review

Yiqun Miao; Yuhan Zhao; Yuan Luo; Huiying Wang; Ying Wu

Journal of Medical Internet Research

DOI: 10.2196/80557

Abstract

Background: Retrieval-augmented generation (RAG) is increasingly used to improve large language models in the medical and nursing domains. However, a comprehensive understanding of its specific architecture and applications in medical and nursing reasoning remains limited.

Objective: We aimed to summarize the current state, existing limitations, and future development directions of RAG in the medical and nursing domains.

Methods: The PubMed, Web of Science, IEEE Xplore, and arXiv databases were searched for relevant articles using queries that combined terms related to RAG, medical, and nursing domains, covering the period from November 1, 2022, to May 31, 2025. This review was conducted following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines.

Results: A total of 917 articles were retrieved, of which 67 met the inclusion criteria. Most studies focused on the medical domain (63/67, 94%), while only a few addressed nursing applications (4/67, 6%). The RAG frameworks included in this review were categorized into 5 functional types: text-based RAG (36/67, 54%), knowledge graph–enhanced RAG (17/67, 25%), agentic RAG (6/67, 9%), multimodal RAG (2/67, 3%), and plug-and-play RAG (6/67, 9%). On the basis of the Simon decision-making process theory, we divided the RAG workflow into 4 stages: intent recognition, knowledge retrieval, knowledge integration, and generation. Only 26 studies included explicit reasoning support, and few were aligned with real-world clinical workflows. Only 12 studies attempted to address ethical considerations related to RAG.

Conclusions: We identified 4 key shifts in recent RAG development: shifting from surface-level matching toward contextualized intent recognition, from vague semantics toward logic-driven dynamic retrieval, from passive toward active knowledge retrieval, and from simple aggregation toward coherent context construction. However, most RAG systems in the medical and nursing domains have not yet introduced reasoning methods, and those that have are still predominantly reliant on data-driven associations without causal modeling. This highlights the need to integrate causal mechanisms for more effective and domain-relevant reasoning in health care.
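
Illustrative note: the 4 workflow stages named in the Results (intent recognition, knowledge retrieval, knowledge integration, and generation) can be pictured as a toy pipeline. The sketch below is only an assumption-laden illustration and is not taken from any study in the review: the corpus, the stop-word list, and every function name are hypothetical, keyword overlap stands in for a real retriever, and the generation stage is a stub that echoes its prompt instead of calling a language model.

```python
# Toy illustration of a 4-stage RAG workflow. All names and data are hypothetical.

# Hypothetical mini knowledge base (not drawn from the review).
CORPUS = {
    "doc1": "Metformin is a first-line drug for type 2 diabetes.",
    "doc2": "Pressure ulcer prevention includes repositioning every two hours.",
    "doc3": "Hand hygiene reduces hospital-acquired infections.",
}

STOPWORDS = {"a", "an", "the", "is", "for", "of", "what", "how", "to", "in"}


def tokenize(text):
    """Lowercase the text, strip simple punctuation, and drop stop words."""
    words = {w.strip(".,?!").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def recognize_intent(query):
    """Stage 1 (intent recognition): reduce the query to its content terms."""
    return tokenize(query)


def retrieve(terms, k=1):
    """Stage 2 (knowledge retrieval): rank documents by keyword overlap."""
    ranked = sorted(
        CORPUS.items(),
        key=lambda item: (-len(terms & tokenize(item[1])), item[0]),
    )
    # Keep at most k documents, and only those sharing at least one term.
    return [(doc_id, text) for doc_id, text in ranked[:k] if terms & tokenize(text)]


def integrate(query, passages):
    """Stage 3 (knowledge integration): assemble passages into one context."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


def generate(prompt):
    """Stage 4 (generation): stub; a real system would call a language model."""
    return "DRAFT ANSWER GROUNDED IN:\n" + prompt


def answer(query, k=1):
    """Run the four stages in order."""
    terms = recognize_intent(query)
    passages = retrieve(terms, k=k)
    return generate(integrate(query, passages))


if __name__ == "__main__":
    print(answer("How to prevent pressure ulcer?"))
```

Keeping each stage as a separate function mirrors the staged view in the abstract, so any one stage (for example, the keyword-overlap retriever) could be swapped out without touching the others.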

Citation (APA)

Miao, Y., Zhao, Y., Luo, Y., Wang, H., & Wu, Y. (2025). Improving Large Language Model Applications in the Medical and Nursing Domains With Retrieval-Augmented Generation: Scoping Review. Journal of Medical Internet Research. JMIR Publications Inc. https://doi.org/10.2196/80557
