The recent increase in online privacy concerns prompts the following question: can a recommendation engine be accurate if end-users do not entrust it with their private data? To answer this, we study the problem of predicting user ratings under local or 'user-end' differential privacy, a powerful, formal notion of data privacy. We develop a systematic approach, based on mutual information, for deriving lower bounds on the sample complexity of learning item structure from privatized user inputs. Our results reveal a sample-complexity separation between the information-scarce and information-rich regimes, highlighting the role of the number of ratings (and hence the amount of information) available to each user. In the information-rich regime (where each user rates a constant fraction of items), a spectral clustering approach is shown to achieve optimal sample complexity. However, the information-scarce regime (where each user rates only a vanishing fraction of the total item set) is found to require a fundamentally different approach. We propose a new algorithm, MaxSense, and show that it achieves optimal sample complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to (i) learning based on 1-bit sketches (in contrast to differentially private sketches), and (ii) adaptive learning, where queries can be adapted based on answers to past queries. © 2012 IEEE.
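To make the local ('user-end') privacy model concrete, the following is a minimal illustrative sketch of a standard randomized-response mechanism: each user privatizes a single binary rating on their own device before it ever reaches the recommender, and the aggregator debiases the noisy reports to recover the population-level like-rate. This is not the paper's MaxSense algorithm or its spectral clustering procedure; the function names (`randomized_response`, `debiased_mean`), the privacy parameter `epsilon`, and the simulated data are assumptions made here purely for illustration.

```python
import numpy as np

# Illustrative sketch of epsilon-local differential privacy via randomized
# response (not the paper's method): the true bit is reported with
# probability e^eps / (1 + e^eps) and flipped otherwise.

def randomized_response(bit: int, epsilon: float, rng: np.random.Generator) -> int:
    """Privatize one binary rating (like/dislike) on the user's side."""
    p_truth = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def debiased_mean(reports: np.ndarray, epsilon: float) -> float:
    """Unbiased estimate of the true mean rating from privatized reports."""
    p = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return (reports.mean() - (1.0 - p)) / (2.0 * p - 1.0)

rng = np.random.default_rng(0)
true_bits = (rng.random(10_000) < 0.7).astype(int)   # hypothetical: 70% of users like the item
eps = 1.0
reports = np.array([randomized_response(b, eps, rng) for b in true_bits])
print(f"estimated like-rate: {debiased_mean(reports, eps):.3f}")  # close to 0.7 despite the noise
```

The accuracy of such debiased estimates degrades as `epsilon` shrinks, which is the intuition behind the paper's question of how many privatized users are needed to learn item structure at all.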
Citation
Banerjee, S., Hegde, N., & Massoulie, L. (2012). The price of privacy in untrusted recommendation engines. In 2012 50th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2012 (pp. 920–927). https://doi.org/10.1109/Allerton.2012.6483317