HuatuoGPT, Towards Taming Language Models To Be a Doctor

Abstract

In this paper, we present HuatuoGPT, a Large Language Model (LLM) for medical consultation. The core recipe of HuatuoGPT is to leverage both distilled data from ChatGPT and real-world data from doctors in the supervised fine-tuning stage. This is not only because using purely ChatGPT-distilled data might cause 'model collapse', but also because real-world data from doctors is complementary to ChatGPT-distilled data. ChatGPT's responses are usually detailed, well-presented, fluent, and instruction-following, but ChatGPT cannot perform like a doctor in many respects, e.g., in interactive diagnosis. Therefore, the additional doctors' data can tame a distilled language model to perform like a doctor. To synergize the strengths of both data sources, we introduce RLMF (Reinforcement Learning from Mixed Feedback), in which a reward model is trained to align the language model with the merits that both sources (ChatGPT and doctors) bring. Experimental results (in GPT-4 evaluation, human evaluation, and medical benchmark datasets) demonstrate that HuatuoGPT achieves state-of-the-art results in medical consultation among open-source LLMs. Notably, by using additional real-world data and RLMF, the distilled language model (i.e., HuatuoGPT) outperforms its teacher model (i.e., ChatGPT) in most cases.
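
The idea of combining the two data sources for supervised fine-tuning can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `mix_sft_data`, the `prompt`/`response` record layout, and the `source` tag are all hypothetical, and the abstract does not state how the two sources are actually proportioned or ordered.

```python
import random
from typing import Dict, List


def mix_sft_data(
    distilled: List[Dict[str, str]],
    real_world: List[Dict[str, str]],
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Tag each example with its origin and shuffle both sets together.

    Hypothetical helper: the record layout and the simple uniform shuffle
    are assumptions made for illustration only.
    """
    # Copy each record so the caller's lists are left untouched.
    tagged = [{**ex, "source": "distilled"} for ex in distilled]
    tagged += [{**ex, "source": "doctor"} for ex in real_world]
    # A seeded generator keeps the mixed ordering reproducible.
    rng = random.Random(seed)
    rng.shuffle(tagged)
    return tagged
```

Keeping a `source` tag on every record is one way to let a later stage, such as a reward model trained on mixed feedback, distinguish where each example came from; whether HuatuoGPT does this is not stated in the abstract.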

Cite

Zhang, H., Chen, J., Jiang, F., Yu, F., Chen, Z., Li, J., … Li, H. (2023). HuatuoGPT, Towards Taming Language Models To Be a Doctor. In Findings of the Association for Computational Linguistics: EMNLP 2023 (pp. 10859–10885). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-emnlp.725
