Nebula: A Scalable Privacy-Preserving Machine Learning System in Ant Financial

Abstract

With the rapid growth of data volume, data-driven machine learning models have become a necessary part of many industrial applications. Intuitively, training on more high-quality data leads to better model performance. In reality, however, data are usually scattered across, and isolated within, different organizations or companies. This "data isolation" problem has prompted both academia and industry to explore collaborative learning, in which multiple data sources jointly build better models. Despite the potential performance gains, this learning paradigm inevitably faces privacy issues, especially in the Fintech domain, where data are sensitive by nature. In this paper, we present Nebula, a privacy-preserving collaborative learning system at Ant Financial. The system aims to facilitate privacy-preserving collaborative model training for industrial-scale applications. It is built on a ring-allreduce, MPI-based distributed framework. On top of that, with several optimization strategies and a novel sharing scheme, the system scales to tens of millions of data samples with hundreds of thousands of features and achieves more than 100x speedup over existing state-of-the-art implementations.
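
The abstract names ring-allreduce as the communication pattern underneath the system but does not describe it. As a rough illustration only, the sketch below simulates the generic ring-allreduce algorithm in a single Python process: each worker's vector is split into one chunk per worker, partial sums travel around the ring (scatter-reduce), and the finished chunks are then circulated (allgather). The function name `ring_allreduce` and the list-based "workers" are hypothetical; this is not Nebula's code, uses no MPI, and includes none of the privacy-preserving machinery.

```python
def ring_allreduce(worker_vectors):
    """Simulate a sum ring-allreduce over equally sized vectors.

    Illustrative assumption: workers are plain lists in one process,
    and worker i only ever sends to worker (i + 1) % n.
    """
    n = len(worker_vectors)
    length = len(worker_vectors[0])
    # Split index range [0, length) into n contiguous chunks.
    bounds = [(k * length) // n for k in range(n + 1)]
    data = [list(v) for v in worker_vectors]  # copy, leave inputs untouched

    def chunk(k):
        return range(bounds[k], bounds[k + 1])

    # Phase 1, scatter-reduce: after n - 1 steps, worker i holds the
    # fully summed chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot payloads first so all sends in a step are simultaneous.
        outgoing = []
        for i in range(n):
            k = (i - step) % n
            outgoing.append((k, [data[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(outgoing):
            dst = (i + 1) % n
            for j, value in zip(chunk(k), payload):
                data[dst][j] += value

    # Phase 2, allgather: pass the finished chunks around the ring,
    # overwriting stale partial sums.
    for step in range(n - 1):
        outgoing = []
        for i in range(n):
            k = (i + 1 - step) % n
            outgoing.append((k, [data[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(outgoing):
            dst = (i + 1) % n
            for j, value in zip(chunk(k), payload):
                data[dst][j] = value

    return data


if __name__ == "__main__":
    gradients = [
        [1, 2, 3, 4],
        [10, 20, 30, 40],
        [100, 200, 300, 400],
    ]
    for worker_id, vec in enumerate(ring_allreduce(gradients)):
        print(worker_id, vec)
```

In this pattern each worker sends 2(n - 1) chunks of roughly 1/n of the vector, so per-worker traffic stays close to twice the vector size regardless of how many workers join the ring. That property is the usual reason the pattern is chosen for large-scale distributed training.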

Chen, C., Wu, B., Wang, L., Chen, C., Tan, J., Wang, L., … Zhang, B. (2020). Nebula: A Scalable Privacy-Preserving Machine Learning System in Ant Financial. In International Conference on Information and Knowledge Management, Proceedings (pp. 3369–3372). Association for Computing Machinery. https://doi.org/10.1145/3340531.3417418
