Differentially private recommender systems

作者: Frank McSherry , Ilya Mironov

DOI: 10.1145/1557019.1557090

关键词: Artificial intelligenceComputer scienceNoise (video)Recommender systemBaseline (configuration management)Differential (infinitesimal)Differential privacyPrivacy softwareInferenceMachine learning

摘要: We consider the problem of producing recommendations from collective user behavior while simultaneously providing guarantees privacy for these users. Specifically, we Netflix Prize data set, and its leading algorithms, adapted to framework differential privacy.Unlike prior work concerned with cryptographically securing computation recommendations, constrains a in way that precludes any inference about underlying records output. Such algorithms necessarily introduce uncertainty--i.e., noise--to computations, trading accuracy privacy.We find several approaches competition can be provide privacy, without significantly degrading their accuracy. To adapt explicitly factor them into two parts, an aggregation/learning phase performed guarantees, individual recommendation uses learned correlations individual's personalized recommendations. The adaptations are non-trivial, involve both careful analysis per-record sensitivity calibrate noise, as well new post-processing steps mitigate impact this noise.We measure empirical trade-off between adaptations, non-trivial formal still outperforming Cinematch baseline provides.

参考文章(20)
Chris Volinsky, Yehuda Koren, Robert M. Bell, The BellKor 2008 Solution to the Netflix Prize ,(2008)
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, Moni Naor, Our Data, Ourselves: Privacy Via Distributed Noise Generation Advances in Cryptology - EUROCRYPT 2006. ,vol. 4004, pp. 486- 503 ,(2006) , 10.1007/11761679_29
Charu C. Aggarwal, On k -anonymity and the curse of dimensionality very large data bases. pp. 901- 909 ,(2005)
Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith, Calibrating Noise to Sensitivity in Private Data Analysis Theory of Cryptography. ,vol. 3876, pp. 265- 284 ,(2006) , 10.1007/11681878_14
Justin Brickell, Vitaly Shmatikov, The cost of privacy Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD 08. pp. 70- 78 ,(2008) , 10.1145/1401890.1401904
Yehuda Koren, Factorization meets the neighborhood Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD 08. pp. 426- 434 ,(2008) , 10.1145/1401890.1401944
Avrim Blum, Cynthia Dwork, Frank McSherry, Kobbi Nissim, Practical privacy: the SuLQ framework symposium on principles of database systems. pp. 128- 138 ,(2005) , 10.1145/1065167.1065184
Esma Aïmeur, Gilles Brassard, José M. Fernandez, Flavien Serge Mani Onana, A lambic : a privacy-preserving recommender system for electronic commerce International Journal of Information Security. ,vol. 7, pp. 307- 334 ,(2008) , 10.1007/S10207-007-0049-3
John Canny, Collaborative filtering with privacy via factor analysis Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '02. pp. 238- 245 ,(2002) , 10.1145/564376.564419
Cynthia Dwork, Differential privacy: a survey of results theory and applications of models of computation. ,vol. 4978, pp. 1- 19 ,(2008) , 10.1007/978-3-540-79228-4_1