Adversarial Cooperative Imitation Learning for Dynamic Treatment Regimes

Lu Wang; Wenchao Yu; Xiaofeng He; Wei Cheng; Martin Renqiang Ren; Wei Wang; Bo Zong; Haifeng Chen; Hongyuan Zha

Conference Proceedings, Open Access

The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (2020) 1785-1795

DOI: 10.1145/3366423.3380248

Abstract

Recent developments in discovering dynamic treatment regimes (DTRs) have heightened the importance of deep reinforcement learning (DRL), which is used to recover doctors' treatment policies. However, existing DRL-based methods have the following limitations: 1) supervised methods based on behavior cloning suffer from compounding errors; 2) the self-defined reward signals in reinforcement learning models are either too sparse or need clinical guidance; 3) current imitation learning models consider only positive trajectories (e.g. survived patients) and largely ignore negative trajectories (e.g. deceased patients), even though negative trajectories are examples of what not to do and could help the learned policy avoid repeating mistakes. To address these limitations, we propose the adversarial cooperative imitation learning model, ACIL, to deduce optimal dynamic treatment regimes that mimic the positive trajectories while differing from the negative trajectories. Specifically, two discriminators are used to achieve this goal: an adversarial discriminator is designed to minimize the discrepancies between the trajectories generated by the policy and the positive trajectories, and a cooperative discriminator is used to distinguish the negative trajectories from the positive and generated trajectories. The reward signals from the discriminators are used to refine the policy for dynamic treatment regimes. Experiments on publicly available real-world medical data demonstrate that ACIL improves the likelihood of patient survival and provides better dynamic treatment regimes by exploiting information from both positive and negative trajectories.
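To make the two-discriminator idea concrete, here is a minimal Python sketch of how reward signals from both discriminators might be combined for a single state-action pair. The abstract does not give the paper's reward formula, so the additive log form, the function names `sigmoid` and `combined_reward`, and the logit inputs are all assumptions made for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def combined_reward(adv_logit: float, coop_logit: float, eps: float = 1e-8) -> float:
    """Hypothetical reward for one (state, action) pair.

    adv_logit:  assumed output of the adversarial discriminator; a higher
                value means the pair looks more like a positive trajectory.
    coop_logit: assumed output of the cooperative discriminator; a higher
                value means the pair looks more like a negative trajectory.

    The additive log form below is an assumption, not the paper's formula.
    """
    p_positive = sigmoid(adv_logit)
    p_negative = sigmoid(coop_logit)
    # Reward resemblance to positive trajectories and penalize
    # resemblance to negative trajectories.
    return math.log(p_positive + eps) + math.log(1.0 - p_negative + eps)


# A pair that looks positive and not negative earns a higher reward
# than a pair that looks negative and not positive.
good = combined_reward(adv_logit=2.0, coop_logit=-2.0)
bad = combined_reward(adv_logit=-2.0, coop_logit=2.0)
print(good > bad)
```

Under this assumed form, a policy trained to maximize the reward is pushed toward behavior resembling the positive trajectories and away from behavior resembling the negative ones, which is the qualitative effect the abstract describes.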

Cite

Wang, L., Yu, W., He, X., Cheng, W., Ren, M. R., Wang, W., … Zha, H. (2020). Adversarial Cooperative Imitation Learning for Dynamic Treatment Regimes. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 1785–1795). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380248
