Evaluating base and retrieval augmented LLMs with document or online support for evidence based neurology

This article is free to access.

Abstract

Effectively managing evidence-based information is increasingly challenging. This study tested large language models (LLMs), including document- and online-enabled retrieval-augmented generation (RAG) systems, using 13 recent neurology guidelines across 130 questions. Results showed substantial variability. RAG improved accuracy compared to base models but still produced potentially harmful answers. RAG-based systems performed worse on case-based than on knowledge-based questions. Further refinement and improved regulation are needed for safe clinical integration of RAG-enhanced LLMs.

Cite

Masanneck, L., Meuth, S. G., & Pawlitzki, M. (2025). Evaluating base and retrieval augmented LLMs with document or online support for evidence based neurology. npj Digital Medicine, 8(1). https://doi.org/10.1038/s41746-025-01536-y
