Abstract
Deep learning sits at the forefront of many ongoing advances across a variety of learning tasks. Despite its superior accuracy in benign environments, deep learning suffers from adversarial vulnerability and privacy leakage (e.g., sensitive attribute inference) in adversarial environments. In addition, many deep learning systems exhibit discriminatory behavior against certain groups of subjects (e.g., demographic disparity). In this paper, we propose a unified information-theoretic framework that defends against sensitive attribute inference and mitigates demographic disparity in deep learning for the model partitioning scenario by minimizing two mutual information terms. We prove that as one mutual information term decreases, an upper bound on the chance that any adversary infers the sensitive attribute from model representations also decreases, and that the extent of demographic disparity is bounded by the other mutual information term. Since direct optimization of the mutual information is intractable, we also propose a tractable Gaussian-mixture-based method and a Gumbel-softmax-trick-based method for estimating the two mutual information terms. Extensive evaluations in a variety of application domains, including computer vision and natural language processing, demonstrate that our framework performs better overall than existing baselines.
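For readers unfamiliar with the Gumbel-softmax trick named in the abstract, the following is a minimal, generic sketch of the relaxation itself, written in plain Python. It is not the authors' estimator: the function name `gumbel_softmax`, the example logits, and the temperature value are all hypothetical choices made for illustration, and the abstract does not say how the trick is wired into the mutual information estimate.

```python
import math
import random


def gumbel_softmax(logits, temperature=0.5, rng=random):
    """Return one relaxed sample from a categorical distribution.

    `logits` are unnormalized log-probabilities. The result is a
    probability vector that approaches a one-hot sample as
    `temperature` goes to 0. (Hypothetical helper, for illustration.)
    """
    noisy = []
    for logit in logits:
        u = rng.random()
        # Clamp away from 0 and 1 so both logarithms stay finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed example values: three classes and a fixed seed.
rng = random.Random(0)
sample = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
print(sample)
```

Lower temperatures push the output closer to a one-hot vector, while higher temperatures make it smoother. In a deep learning framework the same computation would be expressed with differentiable tensor operations so that gradients can flow through the sample.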
Cite
Zheng, T., & Li, B. (2022). InfoCensor: An Information-Theoretic Framework against Sensitive Attribute Inference and Demographic Disparity. In ASIA CCS 2022 - Proceedings of the 2022 ACM Asia Conference on Computer and Communications Security (pp. 437–451). Association for Computing Machinery, Inc. https://doi.org/10.1145/3488932.3517402