Cynthia Dwork (Microsoft); Differential Privacy: A Survey of Results; Manindra Agrawal, Angsheng Li (editors); In *Theory and Applications of Models of Computations* (TAMC); 2008; 19 pages.

## Abstract

Over the past five years a new approach to privacy-preserving data analysis has born fruit [13, 18, 7, 19, 5, 37, 35, 8, 32]. This approach differs from much (but not all!) of the related literature in the statistics, databases, theory, and cryptography communities, in that a formal and ad omnia privacy guarantee is defined, and the data analysis techniques presented are rigorously proved to satisfy the guarantee. The key privacy guarantee that has emerged is differential privacy. Roughly speaking, this ensures that (almost, and quantifiably) no risk is incurred by joining a statistical database. In this survey, we recall the definition of differential privacy and two basic techniques for achieving it. We then show some interesting applications of these techniques, presenting algorithms for three specific tasks and three general results on differentially private learning.

## Highlighted

- B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, K. Talwar. Privacy, Accuracy, and Consistency Too. A Holistic Solution to Contingency Table Release. In
*Proceedings of the 26th Symposium on Principles of Database Systems*, pages 273–282 (2007)
- A. Blum, C. Dwork, F. McSherry, K. Nissim. Practical Privacy. The SuLQ framework. In
*Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems* (2005-06)
- A. Blum, K. Ligett, A. Roth. A Learning Theory Approach to Non-Interactive Database Privacy. In
*Proceedings of the 40th ACM SIGACT Symposium on Thoery of Computing* (2008)
- I. Dinur, K. Nissim. Revealing Information While Preserving Privacy. In
*Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems*, pp. 202–210 (2003)
- C. Dwork, K. Nissim. Privacy-Preserving Datamining on Vertically Partitioned Databases. In M. Franklin (editor),
*Proceedings of CRYPTO*, 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)
- C. Dwork, F. McSherry, K. Nissim, A. Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In
*Proceedings of the 3rd Theory of Cryptography Conference*, pp. 265–284 (2006)
- S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, S. Smith. What Can We Learn Privately? (manuscript, 2007)
- F. McSherry, K. Talwar. Mechanism Design via Differential Privacy. In
*Proceedings of the 48th Annual Symposium on Foundations of Computer Science* (FOCS) (2007)
- K. Nissim, S. Raskhodnikova, A. Smith. Smooth Sensitivity and Sampling in Private Data Analysis. In
*Proceedings of the 39th ACM Symposium on Theory of Computing*, pages 75–84 (2007)

## References

- J.O. Achugbue, F.Y. Chin. The Effectiveness of Output Modification by Rounding for Protection of Statistical Databases. In
*INFOR* 17(3), 209–218 (1979)
- N.R. Adam, J.C. Wortmann: Security-Control Methods for Statistical Databases. A Comparative Study. In
*omputing Surveys*, ACM, 21(4), 515–556 (1989)
- D. Agrawal, C. Aggarwal: On the Design and Quantification of Privacy Preserving Data Mining Algorithms. In
*Proceedings of the 20th Symposium on Principles of Database Systems* (2001)
- R. Agrawal, R. Srikant. Privacy-Preserving Data Mining. In
*Proceedings of the ACM SIGMOD Conference on Management of Data*, pages 439–450 (2000)
- B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, K. Talwar. Privacy, Accuracy, and Consistency Too. A Holistic Solution to Contingency Table Release. In
*Proceedings of the 26th Symposium on Principles of Database Systems*, pages 273–282 (2007)
- L.L. Beck. A Security Mechanism for Statistical Databases. In
*Transactions on Database Systems*, ACM, 5(3), 316–338 (1980)
- A. Blum, C. Dwork, F. McSherry, K. Nissim. Practical Privacy. The SuLQ framework. In
*Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems* (2005-06)
- A. Blum, K. Ligett, A. Roth. A Learning Theory Approach to Non-Interactive Database Privacy. In
*Proceedings of the 40th ACM SIGACT Symposium on Thoery of Computing* (2008)
- S. Chawla, C. Dwork, F. McSherry, A. Smith, H. Wee. Toward Privacy in Public Databases. In
*Proceedings of the 2nd Theory of Cryptography Conference* (2005)
- T. Dalenius. Towards a methodology for statistical disclosure control. In
*Statistik Tidskrift*, 15, 222–429 (1977)
- D.E. Denning. Secure Statistical Databases with Random Sample Queries. In
*Transactions on Database Systems*, ACM, 5(3), 291–315 (1980)
- D. Denning, P. Denning, M. Schwartz. The Tracker. A Threat to Statistical Database Security. In
*Transactions on Database Systems*, ACM, 4(1), 76–96 (1979)
- I. Dinur, K. Nissim. Revealing Information While Preserving Privacy. In
*Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems*, pp. 202–210 (2003)
- G. Duncan. Confidentiality and statistical disclosure limitation. In N. Smelser, P. Baltes (editors)
*International Encyclopedia of the Social and Behavioral Sciences*, Elsevier, New York (2001)
- C. Dwork. Differential Privacy. In, M. Bugliesi, B. Preneel, V. Sassone, I. Wegener (editors).
*Proceedings of the International Colloquium on Languages, Automata and Programming* (ICALP). 2006. LNCS Vol. 4052, pp. 1–12. Springer, Heidelberg (2006)
- C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, M. Naor. Our Data, Ourselves. Privacy Via Distributed Noise Generation. In S. Vaudenay (editor),
*Proceedings of EUROCRYPT*, 2006. LNCS, vol. 4004, pp. 486–503. Springer, Heidelberg (2006)
- C. Dwork, F. McSherry, K. Talwar. The Price of Privacy and the Limits of LP Decoding. In
*Proceedings of the 39th ACM Symposium on Theory of Computing*, pp. 85–94 (2007)
- C. Dwork, K. Nissim. Privacy-Preserving Datamining on Vertically Partitioned Databases. In M. Franklin (editor),
*Proceedings of CRYPTO*, 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)
- C. Dwork, F. McSherry, K. Nissim, A. Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In
*Proceedings of the 3rd Theory of Cryptography Conference*, pp. 265–284 (2006)
- C. Dwork, S. Yekhanin. New Efficient Attacks on Statistical Disclosure Control Mechanisms (manuscript, 2008)
- A.V. Evfimievski, J. Gehrke, R. Srikant. Limiting Privacy Breaches in Privacy Preserving Data Mining. In
*Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems*, pp. 211–222 (2003)
- D. Agrawal, C.C. Aggarwal. On the design and Quantification of Privacy Pre- serving Data Mining Algorithms. In
*Proceedings of the 20th Symposium on Principles of Database Systems*, pages 247–255 (2001)
- R. Agrawal, R. Srikant. Privacy-Preserving Data Mining. In
*Proceedings of the ACM SIGMOD International Conference on Management of Data*, pages 439–450 (2000)
- F.Y. Chin, G. Ozsoyoglu. Auditing and infrence control in statistical databases. In
*Transactions on Software Engineering*, IEEE, SE-8(6), 113–139 (1982)
- D. Dobkin, A. Jones, R. Lipton. Secure Databases. Protection Against User Influence. In
*Transactions on Databases*, ACM, S 4(1), 97–106 (1979)
- I. Fellegi. On the question of statistical confidentiality. In
*Journal of the American Statistical Association* 67, 7–18 (1972)
- S. Fienberg. Confidentiality and Data Protection Through Disclosure Limitation. Evolving Principles and Technical Advances. In
*IAOS Conference on Statistics, Development and Human Rights* (2000-09),
- S. Fienberg, U. Makov, R. Steele. Disclosure Limitation and Related Methods for Categorical Data. In
*Journal of Official Statistics*, 14, 485–502 (1998)
- L. Franconi, G. Merola. Implementing Statistical Disclosure Control for Aggregated Data Released Via Remote Access. In. United Nations Statistical Commission and European Commission, joint ECE/EUROSTAT work session on statistical data confidentiality, Working Paper No.30 (2003-04),
- S. Goldwasser, S. Micali. Probabilistic Encryption. J. Comput. Syst. Sci. 28(2), 270–299 (1984)
- D. Gusfield. A Graph Theoretic Approach to Statistical Data Security. In
*Journal of Computing*, SIAM, 17(3), 552–571 (1988)
- S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, S. Smith. What Can We Learn Privately? (manuscript, 2007)
- E. Lefons, A. Silvestri, F. Tangorra. An analytic approach to statistical databases. In
*Proceedings of the 9th International Conference on Very Large Data Bases* (VLDB), 1983-10/1983-11, pages 260–274. Morgan Kaufmann, San Francisco (1983)
- A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam. l-Diversity. Privacy Beyond k-Anonymity. In
*Proceedings of the 22nd International Conference on Data Engineering* (ICDE 2006), page 24 (2006)
- F. McSherry, K. Talwar. Mechanism Design via Differential Privacy. In
*Proceedings of the 48th Annual Symposium on Foundations of Computer Science* (FOCS) (2007)
- A. Narayanan, V. Shmatikov. How to Break Anonymity of the Netflix Prize Dataset,
- K. Nissim, S. Raskhodnikova, A. Smith. Smooth Sensitivity and Sampling in Private Data Analysis. In
*Proceedings of the 39th ACM Symposium on Theory of Computing*, pages 75–84 (2007)
- T.E. Raghunathan, J.P. Reiter, D.B. Rubin. Multiple Imputation for Statistical Disclosure Limitation.
*Journal of Official Statistics*, 19(1), 1–16 (2003)
- S. Reiss. Practical Data Swapping. The First Steps. In
*Transactions on Database Systems*, ACM, 9(1), 20–37 (1984)
- D.B. Rubin. Discussion. Statistical Disclosure Limitation. In
*Journal of Official Statistics*, 9(2), 461–469 (1993)
- A. Shoshani. Statistical databases. Characteristics, problems and some solutions. In
*Proceedings of the 8th International Conference on Very Large Data Bases* (VLDB 1982), pages 208–222 (1982)
- P. Samarati, L. Sweeney. Protecting Privacy when Disclosing Information. k-Anonymity and its Enforcement Through Generalization and Specialization, Technical Report SRI-CSL-98-04, SRI International. (1998)
- P. Samarati, L. Sweeney. Generalizing Data to Provide Anonymity when Disclosing Information (Abstract). In
*Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems*, pages 188 (1998)
- L. Sweeney. Weaving Technology and Policy Together to Maintain Confidentiality. in
*Journal of Law Medicine & Ethics*, 25(2-3), 98–110 (1997)
- L. Sweeney. k-anonymity. A Model for Protecting Privacy. International Journal on Uncertainty, In
*Fuzziness and Knowledge-based Systems* 10(5), 557–570 (2002)
- L. Sweeney. Achieving k-Anonymity Privacy Protection Using Generalization and Suppression. In
*International Journal on Uncertainty, Fuzziness and Knowledge-based Systems*, 10(5), pages 571–588 (2002)
- L.G. Valiant. A Theory of the Learnable. In
*Proceedings of the 16th Annual ACM SIGACT Symposium on Theory of Computing*, pages 436–445 (1984)
- X. Xiao, Y. Tao. M-invariance. towards privacy preserving re-publication of dynamic datasets. In
*Proceedings of ACM Special Interest Group on the Management of Data*(SIGMOD) 2007, pages 689–700 (2007)