Large Language Models Are Poor Medical Coders — Benchmarking of Medical Code Querying

  • Soroush A
  • Glicksberg B
  • Zimlichman E
  • et al.

Abstract

Tokenization algorithms may be to blame when generative large language models inconsistently match medical billing codes to their preferred code descriptions.

Cite

Soroush, A., Glicksberg, B. S., Zimlichman, E., Barash, Y., Freeman, R., Charney, A. W., … Klang, E. (2024). Large Language Models Are Poor Medical Coders — Benchmarking of Medical Code Querying. NEJM AI, 1(5). https://doi.org/10.1056/aidbp2300040
