Abstract
Background: Generative artificial intelligence (GAI) systems by Google have recently been updated from Bard to Gemini and Gemini Advanced as of December 2023. Gemini is a basic, free-to-use model after a user's login, while Gemini Advanced operates on a more advanced model requiring a fee-based subscription. These systems have the potential to enhance medical diagnostics. However, the impact of these updates on comprehensive diagnostic accuracy remains unknown. Objective: This study aimed to compare the accuracy of the differential diagnosis lists generated by Gemini Advanced, Gemini, and Bard across comprehensive medical fields using case report series. Methods: We identified a case report series with relevant final diagnoses published in the American Journal Case Reports from January 2022 to March 2023. After excluding nondiagnostic cases and patients aged 10 years and younger, we included the remaining case reports. After refining the case parts as case descriptions, we input the same case descriptions into Gemini Advanced, Gemini, and Bard to generate the top 10 differential diagnosis lists. In total, 2 expert physicians independently evaluated whether the final diagnosis was included in the lists and its ranking. Any discrepancies were resolved by another expert physician. Bonferroni correction was applied to adjust the P values for the number of comparisons among 3 GAI systems, setting the corrected significance level at P value
Author supplied keywords
Cite
CITATION STYLE
Hirosawa, T., Harada, Y., Tokumasu, K., Ito, T., Suzuki, T., & Shimizu, T. (2024). Comparative Study to Evaluate the Accuracy of Differential Diagnosis Lists Generated by Gemini Advanced, Gemini, and Bard for a Case Report Series Analysis: Cross-Sectional Study. JMIR Medical Informatics, 12. https://doi.org/10.2196/63010
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.