Subtopic Deep Dive

Differential Privacy
Research Guide

What is Differential Privacy?

Differential privacy is a mathematical framework that quantifies privacy loss by ensuring that the output of a data analysis algorithm changes by at most a small amount when any single individual's data is added or removed.

Introduced by Cynthia Dwork and collaborators in 2006, differential privacy adds calibrated noise to query outputs to provide provable privacy guarantees. Key mechanisms include the Laplace mechanism for numeric queries and the exponential mechanism for non-numeric selections. The foundational monograph by Dwork and Roth (2014) has over 3,800 citations and covers composition theorems and advanced applications.
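The Laplace mechanism described above can be sketched in a few lines of NumPy. This is an illustrative toy assuming a counting query with sensitivity 1, not code from any of the cited papers:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace noise with scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(0)
# A counting query ("how many records satisfy P?") has sensitivity 1:
# adding or removing one person changes the count by at most 1.
true_count = 1000
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=1.0, rng=rng)
print(noisy_count)
```

Smaller epsilon means a larger noise scale, which is the privacy-utility tradeoff discussed below.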

15 Curated Papers · 3 Key Challenges

Why It Matters

Differential privacy enables safe data sharing in healthcare, as in federated learning for medical imaging (Kaissis et al., 2020; 1148 citations) and digital health (Rieke et al., 2020; 2068 citations). In machine learning, it protects against model inversion attacks (Fredrikson et al., 2015; 2650 citations) and inference attacks (Nasr et al., 2019; 1457 citations). Real-world systems like Google's RAPPOR (Erlingsson et al., 2014; 1454 citations) apply it to collect crowdsourced statistics from billions of client devices.

Key Research Challenges

Privacy-Utility Tradeoff

Adding noise for privacy degrades query accuracy, especially for complex statistics or high-dimensional data. Dwork and Roth (2014) analyze how the epsilon parameter governs this tradeoff. Achieving strong utility can require advanced mechanisms such as smooth sensitivity.
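As a rough illustration of this tradeoff, the snippet below (an illustrative toy, not an analysis from Dwork and Roth) measures how the average Laplace-noise error grows as epsilon shrinks, for a query of sensitivity 1:

```python
import numpy as np

rng = np.random.default_rng(42)
sensitivity = 1.0

# Smaller epsilon => stronger privacy => larger Laplace scale => larger error.
for epsilon in [0.1, 0.5, 1.0, 5.0]:
    noise = rng.laplace(scale=sensitivity / epsilon, size=10_000)
    mean_abs_error = np.abs(noise).mean()
    # E|Lap(b)| = b, so the empirical error should track sensitivity/epsilon.
    print(f"epsilon={epsilon:>4}: mean |error| ~ {mean_abs_error:.2f} "
          f"(theory: {sensitivity / epsilon:.2f})")
```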

Composition Across Queries

Sequential queries accumulate privacy loss under basic composition theorems. Advanced composition provides tighter bounds but remains complex for adaptive queries (Dwork and Roth, 2014). Researchers continue to seek optimal composition results for machine learning pipelines.
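The gap between basic and advanced composition can be sketched numerically. The formula below follows the advanced-composition bound as stated in Dwork and Roth (2014); the parameter values are chosen arbitrarily for illustration:

```python
import math

def basic_composition(epsilon, k):
    """Basic composition: k epsilon-DP queries compose to k*epsilon-DP."""
    return k * epsilon

def advanced_composition(epsilon, k, delta_prime):
    """Advanced composition (Dwork and Roth, 2014): k epsilon-DP queries are
    (eps', k*delta + delta')-DP with the tighter eps' computed below."""
    return (math.sqrt(2 * k * math.log(1 / delta_prime)) * epsilon
            + k * epsilon * (math.e**epsilon - 1))

eps, k, delta_prime = 0.1, 100, 1e-5
print(basic_composition(eps, k))                 # linear growth in k
print(advanced_composition(eps, k, delta_prime)) # ~sqrt(k) growth for small eps
```

For many small queries the advanced bound is substantially below the linear one, which is why it matters for long-running ML pipelines.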

Integration with ML Models

Applying differential privacy to deep learning requires adding noise to gradients, which slows convergence. Wei et al. (2020; 1991 citations) propose algorithms for federated learning with DP. Attacks like model inversion (Fredrikson et al., 2015) highlight vulnerabilities in non-private models.
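One common pattern here is per-example gradient clipping plus Gaussian noise (the DP-SGD recipe). The sketch below is a simplified illustration with made-up parameter values, not the algorithm from Wei et al. (2020):

```python
import numpy as np

def dp_gradient_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient to clip_norm, average, add Gaussian noise.
    This is the clip-and-noise pattern used in DP-SGD-style training."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Noise scale is proportional to clip_norm (the sensitivity of the sum).
    noise = rng.normal(0.0,
                       noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return mean_grad + noise

rng = np.random.default_rng(1)
grads = [rng.normal(size=4) for _ in range(32)]  # toy per-example gradients
noisy_grad = dp_gradient_step(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds each example's influence, which is what makes the Gaussian-noise calibration (and hence the convergence slowdown) unavoidable.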

Essential Papers

1. The Algorithmic Foundations of Differential Privacy

Cynthia Dwork, Aaron Roth · 2014 · Now Publishers · 3.9K citations

The problem of privacy-preserving data analysis has a long history spanning multiple disciplines. As electronic data about individuals becomes increasingly detailed, and as technology enables ever ...

2. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures

Matt Fredrikson, Somesh Jha, Thomas Ristenpart · 2015 · 2.6K citations

Machine-learning (ML) algorithms are increasingly utilized in privacy-sensitive applications such as predicting lifestyle choices, making medical diagnoses, and facial recognition. In a model inver...

3. The future of digital health with federated learning

Nicola Rieke, Jonny Hancox, Wenqi Li et al. · 2020 · npj Digital Medicine · 2.1K citations

4. Edge Intelligence: Paving the Last Mile of Artificial Intelligence With Edge Computing

Zhi Zhou, Xu Chen, En Li et al. · 2019 · Proceedings of the IEEE · 2.0K citations

With the breakthroughs in deep learning, the recent years have witnessed a booming of artificial intelligence (AI) applications and services, spanning from personal assistant to recommendation syst...

5. Federated Learning With Differential Privacy: Algorithms and Performance Analysis

Kang Wei, Jun Li, Ming Ding et al. · 2020 · IEEE Transactions on Information Forensics and Security · 2.0K citations

Federated learning (FL), as a type of distributed machine learning, is capable of significantly preserving clients’ private data from being exposed to adversaries. Nevertheless, private ...

6. RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response

Úlfar Erlingsson, Vasyl Pihur, Aleksandra Korolova · 2014 · 1.5K citations

Randomized Aggregatable Privacy-Preserving Ordinal Response, or RAPPOR, is a technology for crowdsourcing statistics from end-user client software, anonymously, with strong privacy guarantees. In...

Reading Guide

Foundational Papers

Start with Dwork and Roth (2014; 3838 citations) for the complete theory, including mechanisms and composition. Follow with Erlingsson et al. (2014; RAPPOR) for a practical end-to-end system. Nikolaenko et al. (2013; 446 citations) shows a scalable ridge regression application.

Recent Advances

Read Wei et al. (2020; 1991 citations) for DP federated learning algorithms, Rieke et al. (2020; 2068 citations) for healthcare applications, and Nasr et al. (2019; 1457 citations) for white-box attack analysis.

Core Methods

Laplace and Gaussian noise addition for numeric queries. Advanced composition theorems. Privacy amplification by subsampling. The exponential mechanism for selection and optimization. Shuffle-model and zero-concentrated DP variants.
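For the Gaussian mechanism in particular, the classical noise calibration (valid for ε < 1) can be sketched as follows; this is an illustrative helper with arbitrary parameter values, not library code:

```python
import math
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    """Classical Gaussian-mechanism calibration (for epsilon < 1):
    sigma >= sqrt(2 ln(1.25/delta)) * sensitivity / epsilon
    yields (epsilon, delta)-DP."""
    assert 0 < epsilon < 1, "this classical bound assumes epsilon < 1"
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

# Toy release: a sensitivity-1 numeric query answered with Gaussian noise.
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
rng = np.random.default_rng(7)
noisy_answer = 42.0 + rng.normal(0.0, sigma)
```

Tighter calibrations exist (e.g. via zero-concentrated DP, mentioned above), but this classical bound is the standard starting point.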

How PapersFlow Helps You Research Differential Privacy

Discover & Search

Research Agent uses searchPapers and citationGraph to map the foundational Dwork and Roth (2014) monograph, revealing connections to the federated learning paper by Wei et al. (2020). exaSearch uncovers practical implementations, while findSimilarPapers expands from RAPPOR (Erlingsson et al., 2014) to related mechanisms.

Analyze & Verify

Analysis Agent employs readPaperContent on Dwork and Roth (2014) to extract Laplace mechanism equations, then runPythonAnalysis simulates noise addition with NumPy for epsilon=1.0 on synthetic census data. verifyResponse (CoVe) with GRADE grading confirms composition theorem bounds against claims in Wei et al. (2020), providing statistical verification of privacy loss.

Synthesize & Write

Synthesis Agent detects gaps in utility optimization between RAPPOR (Erlingsson et al., 2014) and modern ML attacks (Nasr et al., 2019), flagging contradictions in composition bounds. Writing Agent uses latexEditText to draft proofs, latexSyncCitations for 10+ references, and latexCompile for camera-ready sections with exportMermaid diagrams of mechanism architectures.

Use Cases

"Simulate Laplace mechanism privacy cost on Gaussian data with epsilon=0.5"

Research Agent → searchPapers → Analysis Agent → runPythonAnalysis (NumPy/pandas simulation of noise addition and utility metrics) → matplotlib plot of privacy-accuracy tradeoff.
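A script of the kind this workflow might produce could look like the following sketch (synthetic Gaussian data, a clipped-mean query, and ε = 0.5; all names and values are illustrative assumptions, and the plotting step is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.5

# Synthetic "Gaussian data": a column of N(50, 10) measurements.
data = rng.normal(loc=50.0, scale=10.0, size=10_000)

# Query: the mean of values clipped to [0, 100]. One record can then move
# the mean by at most 100/len(data) -- the sensitivity of this query.
clipped = np.clip(data, 0.0, 100.0)
true_mean = clipped.mean()
sensitivity = 100.0 / len(clipped)

noisy_mean = true_mean + rng.laplace(scale=sensitivity / epsilon)
utility_loss = abs(noisy_mean - true_mean)
print(f"true mean {true_mean:.3f}, noisy mean {noisy_mean:.3f}, "
      f"absolute error {utility_loss:.4f}")
```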

"Write LaTeX section comparing RAPPOR and exponential mechanism for my DP survey"

Research Agent → citationGraph on Erlingsson et al. (2014) → Synthesis Agent → gap detection → Writing Agent → latexEditText + latexSyncCitations + latexCompile → PDF with equations and citations.

"Find GitHub repos implementing DP-SGD from recent federated learning papers"

Research Agent → searchPapers 'DP-SGD federated' → Code Discovery → paperExtractUrls → paperFindGithubRepo → githubRepoInspect → verified code snippets from Wei et al. (2020) implementations.

Automated Workflows

Deep Research workflow conducts systematic review of 50+ DP papers, chaining citationGraph from Dwork and Roth (2014) to recent FL applications, outputting structured report with GRADE-scored evidence. DeepScan applies 7-step analysis with CoVe checkpoints to verify RAPPOR mechanism claims (Erlingsson et al., 2014) against attacks. Theorizer generates new composition theorems by synthesizing gaps in Wei et al. (2020).

Frequently Asked Questions

What is the definition of differential privacy?

Differential privacy ensures that adding or removing one individual's data changes the probability of any output by at most a factor of e^ε. It was formally defined by Dwork (2006) and is measured by the privacy parameters (ε, δ).
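A concrete instance of this definition is binary randomized response, where the e^ε bound holds with equality; a minimal numeric check:

```python
import math

def randomized_response_probs(epsilon):
    """Binary randomized response: report the true bit with probability
    p = e^eps / (1 + e^eps), flip it otherwise. This satisfies eps-DP."""
    p = math.e**epsilon / (1 + math.e**epsilon)
    return p, 1 - p

eps = 1.0
p_true, p_flip = randomized_response_probs(eps)
# DP check: for any fixed output, the probability ratio between the two
# possible inputs (bit = 0 vs bit = 1) is exactly e^eps.
ratio = p_true / p_flip
print(ratio, math.e**eps)
```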

What are core mechanisms in differential privacy?

The Laplace mechanism adds Lap(Δf/ε) noise to numeric queries. The exponential mechanism samples outputs with probability proportional to exp(ε·u / (2Δu)). The Gaussian mechanism uses normal noise to achieve (ε, δ)-DP (Dwork and Roth, 2014).
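A minimal sketch of the exponential mechanism, with illustrative candidates and utilities and the utility sensitivity assumed to be 1:

```python
import numpy as np

def exponential_mechanism(candidates, utilities, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to
    exp(epsilon * u / (2 * sensitivity)) -- the exponential mechanism."""
    utilities = np.asarray(utilities, dtype=float)
    scores = epsilon * utilities / (2 * sensitivity)
    scores -= scores.max()            # stabilize before exponentiating
    probs = np.exp(scores)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

rng = np.random.default_rng(3)
candidates = ["a", "b", "c"]
utilities = [1.0, 5.0, 2.0]           # "b" has the highest utility
picks = [exponential_mechanism(candidates, utilities, 2.0, 1.0, rng)
         for _ in range(1000)]
print(picks.count("b"))               # "b" should win most of the time
```

Higher ε concentrates the distribution on the best candidate; ε → 0 approaches uniform sampling.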

What are key papers on differential privacy?

Dwork and Roth (2014; 3838 citations) provides the algorithmic foundations. Erlingsson et al. (2014; RAPPOR, 1454 citations) demonstrates production deployment. Wei et al. (2020; 1991 citations) covers federated learning integration.

What are open problems in differential privacy?

Open problems include tighter composition bounds for adaptive queries, high-utility mechanisms for deep learning, and shuffle-model privacy amplification without trusted shufflers (post-2020 advances beyond the listed papers).

Research Privacy-Preserving Technologies in Data with AI

PapersFlow provides specialized AI tools for Computer Science researchers. Here are the most relevant for this topic:

See how researchers in Computer Science & AI use PapersFlow

Field-specific workflows, example queries, and use cases.

Computer Science & AI Guide

Start Researching Differential Privacy with AI

Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.

See how PapersFlow works for Computer Science researchers