With remarkable performance and extensive applications, reinforcement learning has become one of the most popular learning techniques. The policy released by a reinforcement learning model may contain sensitive information, and an adversary can infer demographic information by observing the outputs of the environment. In this paper, we formulate differential privacy in the reinforcement learning context and design mechanisms for ε-greedy and Softmax in the K-armed bandit problem that achieve differentially private guarantees. Our implementation and experiments show that the output policies enjoy good privacy guarantees at a tolerable utility cost.
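The abstract does not reproduce the paper's concrete mechanisms, but the general idea of a differentially private ε-greedy bandit can be sketched as follows. This is an illustrative assumption, not the authors' construction: the class name `DPEpsilonGreedyBandit` and the noise scale `1/(n_i · ε)` (valid for rewards bounded in [0, 1], where n_i is the pull count of arm i) are choices made for this sketch, in which Laplace noise is added to each arm's empirical mean before the greedy comparison.

```python
import random


def laplace_noise(scale):
    # Laplace(0, b) sampled as the difference of two Exp(1/b) draws.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


class DPEpsilonGreedyBandit:
    """Hypothetical ε-greedy K-armed bandit that perturbs empirical
    mean rewards with Laplace noise before the greedy arm selection,
    so the released choice is less revealing of any single reward."""

    def __init__(self, k, explore_eps=0.1, privacy_eps=1.0):
        self.k = k                      # number of arms
        self.explore_eps = explore_eps  # exploration rate of ε-greedy
        self.privacy_eps = privacy_eps  # privacy budget per comparison
        self.counts = [0] * k           # pulls per arm
        self.sums = [0.0] * k           # cumulative reward per arm

    def select_arm(self):
        # Explore uniformly with probability explore_eps.
        if random.random() < self.explore_eps:
            return random.randrange(self.k)
        # Exploit: compare noisy means. Assumed sensitivity of a mean
        # over n_i rewards in [0, 1] is 1/n_i, so Laplace scale is
        # 1 / (n_i * privacy_eps).
        noisy_means = []
        for i in range(self.k):
            n = max(self.counts[i], 1)
            mean = self.sums[i] / n
            noisy_means.append(mean + laplace_noise(1.0 / (n * self.privacy_eps)))
        return max(range(self.k), key=noisy_means.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward
```

A usage sketch: run the bandit against Bernoulli arms and feed each observed reward back with `update`; tightening `privacy_eps` injects more noise and hence costs utility, matching the privacy/utility trade-off the abstract describes.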
Ma, P., Wang, Z., Zhang, L., Wang, R., Zou, X., & Yang, T. (2020). Differentially Private Reinforcement Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11999 LNCS, pp. 668–683). Springer. https://doi.org/10.1007/978-3-030-41579-2_39