Adversarial Cooperative Imitation Learning for Dynamic Treatment Regimes

Abstract

Recent developments in discovering dynamic treatment regimes (DTRs) have heightened the importance of deep reinforcement learning (DRL), which is used to recover doctors' treatment policies. However, existing DRL-based methods have the following limitations: 1) supervised methods based on behavior cloning suffer from compounding errors; 2) the self-defined reward signals in reinforcement learning models are either too sparse or need clinical guidance; 3) current imitation learning models consider only positive trajectories (e.g., survived patients) and largely ignore negative trajectories (e.g., deceased patients), which are examples of what not to do and could help the learned policy avoid repeating mistakes. To address these limitations, we propose the adversarial cooperative imitation learning model, ACIL, to deduce optimal dynamic treatment regimes that mimic the positive trajectories while differing from the negative trajectories. Specifically, two discriminators help achieve this goal: an adversarial discriminator is designed to minimize the discrepancies between the trajectories generated by the policy and the positive trajectories, and a cooperative discriminator is used to distinguish the negative trajectories from the positive and generated trajectories. The reward signals from the discriminators are used to refine the policy for dynamic treatment regimes. Experiments on publicly available real-world medical data demonstrate that ACIL improves the likelihood of patient survival and provides better dynamic treatment regimes by exploiting information from both positive and negative trajectories.
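
To illustrate how reward signals from two discriminators might be combined, here is a minimal Python sketch. It is not the formulation from the paper: the abstract does not give the reward equation, so the log-based form, the function name `combined_reward`, and the parameters `weight` and `eps` are all assumptions made for illustration. The sketch assumes the adversarial discriminator outputs the probability that a state-action pair comes from a positive trajectory, and the cooperative discriminator outputs the probability that it comes from a negative trajectory.

```python
import math


def combined_reward(p_positive, p_negative, weight=1.0, eps=1e-8):
    """Hypothetical reward for one state-action pair.

    p_positive: assumed output of the adversarial discriminator, the
        probability that the pair looks like a positive trajectory.
    p_negative: assumed output of the cooperative discriminator, the
        probability that the pair looks like a negative trajectory.
    weight: assumed trade-off between the two terms (not from the paper).
    eps: small constant that keeps the logarithms finite.
    """
    if not (0.0 <= p_positive <= 1.0 and 0.0 <= p_negative <= 1.0):
        raise ValueError("discriminator outputs must be probabilities in [0, 1]")
    # Higher when the pair resembles positive (e.g. survived) trajectories.
    adversarial_term = math.log(p_positive + eps)
    # Higher when the pair does not resemble negative (e.g. deceased) trajectories.
    cooperative_term = math.log(1.0 - p_negative + eps)
    return adversarial_term + weight * cooperative_term


if __name__ == "__main__":
    # A pair that looks positive and unlike the negatives scores higher
    # than a pair that looks positive but also resembles the negatives.
    print(combined_reward(0.9, 0.1) > combined_reward(0.9, 0.9))
```

Under these assumptions, the policy is rewarded both for resembling positive trajectories and for staying away from negative ones; setting `weight` to 0 reduces the sketch to a reward driven by the adversarial discriminator alone.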

Cite

Wang, L., Yu, W., He, X., Cheng, W., Ren, M. R., Wang, W., … Zha, H. (2020). Adversarial Cooperative Imitation Learning for Dynamic Treatment Regimes. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 1785–1795). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380248
