Subtopic Deep Dive
Black-Box Adversarial Attacks
Research Guide
What Are Black-Box Adversarial Attacks?
Black-box adversarial attacks generate adversarial examples by querying target machine learning models without access to their internal parameters or gradients.
These attacks simulate realistic threat models in which adversaries lack white-box access, relying on query efficiency and transferability from surrogate models. Papernot et al. (2017) introduced practical black-box attacks using transferability, achieving high success rates with limited queries (3366 citations). Research spans vision, NLP, and autonomous systems; the key papers are surveyed below.
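The transfer-based recipe can be sketched in a few lines of NumPy. This is a toy illustration, not Papernot et al.'s actual experiment: two logistic-regression models trained on disjoint halves of synthetic data stand in for the attacker's surrogate and the black-box target, and an FGSM step is crafted against the surrogate alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the transfer setting (illustrative, not the setup from
# Papernot et al., 2017): surrogate and target are logistic regressions
# trained on disjoint halves of the same synthetic 2-class data.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

w_surrogate = train_logreg(X[:100], y[:100])  # attacker's local model
w_target = train_logreg(X[100:], y[100:])     # deployed model: no gradient access

# FGSM on the surrogate: step along the sign of the input gradient of the
# surrogate's loss, then transfer the perturbed input to the target.
x, label = X[0], y[0]
grad = (sigmoid(x @ w_surrogate) - label) * w_surrogate
x_adv = x + 0.5 * np.sign(grad)

fooled_target = float(x_adv @ w_target > 0) != label
```

Because both models approximate the same decision boundary, a perturbation that raises the surrogate's loss often transfers to the target, which is the core intuition behind transfer-based black-box attacks.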
Why It Matters
Black-box attacks expose vulnerabilities in deployed ML systems such as autonomous vehicles and facial recognition, as shown by Sharif et al. (2016), where adversaries bypassed face authentication using printed eyeglass frames (1531 citations). They guide defense strategies for production models, with Papernot et al. (2017) demonstrating attacks on real-world APIs. Tian et al. (2018) applied them to DNN-driven cars via DeepTest, revealing safety risks in sensor processing (1187 citations).
Key Research Challenges
Query Efficiency Limits
Black-box attacks require many model queries, making them impractical against rate-limited APIs. Papernot et al. (2017) reduced query counts via transferability, but success drops on robust models; optimizing query budgets remains an open problem (3366 citations).
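The query bottleneck is easy to see in a gradient-estimation sketch (a hypothetical scalar-score API, not any specific system or paper): coordinate-wise finite differences need 2·d queries per gradient estimate, while an NES-style random-probe estimator trades accuracy for a much smaller budget.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box: we only observe a scalar score f(x), standing in
# for a rate-limited API that returns a confidence value.
d = 100
w_hidden = rng.normal(size=d)

def query_model(x):
    return float(x @ w_hidden)  # each call costs one API query

x = np.zeros(d)

# Coordinate-wise finite differences: accurate gradient, but 2*d queries.
def fd_gradient(x, eps=1e-3):
    g = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        g[i] = (query_model(x + e) - query_model(x - e)) / (2 * eps)
    return g  # cost: 2*d = 200 queries

# NES-style estimate: k random probes, k << 2*d queries, noisier gradient.
def nes_gradient(x, k=20, sigma=1e-2):
    g = np.zeros(d)
    for _ in range(k):
        u = rng.normal(size=d)
        g += query_model(x + sigma * u) * u / sigma
    return g / k  # cost: k = 20 queries

g_exact = fd_gradient(x)
g_est = nes_gradient(x)
cos = g_exact @ g_est / (np.linalg.norm(g_exact) * np.linalg.norm(g_est))
```

The trade-off is explicit: the finite-difference loop spends 200 queries for a near-exact gradient, while the 20-query NES estimate points in roughly the right direction, which is why query-limited attacks lean on stochastic estimators.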
Transferability Variability
Adversarial examples transfer inconsistently from surrogate to target models. Jia and Liang (2017) showed variable transfer in NLP reading comprehension attacks (1271 citations). Achieving reliable transfer across architectures remains an open challenge for both attacks and defenses.
Real-World Realism
Lab attacks often fail under physical constraints like lighting or viewing angle. Sharif et al. (2016) succeeded with printed eyeglass frames, but scaling to diverse environments remains difficult (1531 citations). Tian et al. (2018) highlighted the impact of sensor noise in driving scenarios (1187 citations).
Essential Papers
Practical Black-Box Attacks against Machine Learning
Nicolas Papernot, Patrick McDaniel, Ian Goodfellow et al. · 2017 · 3.4K citations
Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to hu...
Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures
Matt Fredrikson, Somesh Jha, Thomas Ristenpart · 2015 · 2.6K citations
Machine-learning (ML) algorithms are increasingly utilized in privacy-sensitive applications such as predicting lifestyle choices, making medical diagnoses, and facial recognition. In a model inver...
Accessorize to a Crime
Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer et al. · 2016 · 1.5K citations
Machine learning is enabling a myriad innovations, including new algorithms for cancer diagnosis and self-driving cars. The broad use of machine learning makes it important to understand the extent...
Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Milad Nasr, Reza Shokri, Amir Houmansadr · 2019 · 1.5K citations
DOI: 10.1109/SP.2019.00065
Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods
Eyke Hüllermeier, Willem Waegeman · 2021 · Machine Learning · 1.3K citations
Adversarial Examples for Evaluating Reading Comprehension Systems
Robin Jia, Percy Liang · 2017 · 1.3K citations
Standard accuracy metrics indicate that reading comprehension systems are making rapid progress, but the extent to which these systems truly understand language remains unclear. To reward systems w...
DeepTest
Yuchi Tian, Kexin Pei, Suman Jana et al. · 2018 · 1.2K citations
Recent advances in Deep Neural Networks (DNNs) have led to the development of DNN-driven autonomous cars that, using sensors like camera, LiDAR, etc., can drive without any human intervention. Most...
Reading Guide
Foundational Papers
Start with Papernot et al. (2017) for the core transferability concept (3366 citations), then Fredrikson et al. (2015) for model inversion basics (2650 citations); no pre-2015 papers are included here.
Recent Advances
Tian et al. (2018) on DeepTest for autonomous driving (1187 citations); Jin et al. (2020) on BERT text attacks (815 citations); Nasr et al. (2019) on privacy inference (1457 citations).
Core Methods
Transfer attacks (Papernot et al., 2017), physical perturbations (Sharif et al., 2016), NLP adversaries (Jia and Liang, 2017), testing suites (Tian et al., 2018).
How PapersFlow Helps You Research Black-Box Adversarial Attacks
Discover & Search
Research Agent uses searchPapers and citationGraph to map Papernot et al. (2017) as the central hub, revealing 3366 citations and downstream works like Tian et al. (2018); exaSearch uncovers query-efficient variants, while findSimilarPapers links to Sharif et al. (2016) for physical attacks.
Analyze & Verify
Analysis Agent applies readPaperContent to extract query counts from Papernot et al. (2017), verifies transferability claims via verifyResponse (CoVe), and runs PythonAnalysis to replicate attack success rates on MNIST with NumPy; GRADE grading scores evidence strength for efficiency metrics.
Synthesize & Write
Synthesis Agent detects gaps in query-efficient physical attacks post-Sharif et al. (2016), flags contradictions in transferability; Writing Agent uses latexEditText, latexSyncCitations for Papernot et al. (2017), and latexCompile to produce reports with exportMermaid diagrams of attack pipelines.
Use Cases
"Reproduce query-efficient black-box attack from Papernot 2017 on surrogate models"
Research Agent → searchPapers('Papernot black-box') → Analysis Agent → readPaperContent + runPythonAnalysis (NumPy replication of transfer attack) → matplotlib plot of success rates vs. queries.
"Write LaTeX section comparing transferability in Jia 2017 and Papernot 2017"
Synthesis Agent → gap detection → Writing Agent → latexEditText (draft comparison) → latexSyncCitations (add Jia Liang 2017) → latexCompile → PDF with formatted tables.
"Find GitHub repos implementing DeepTest black-box attacks from Tian 2018"
Research Agent → searchPapers('DeepTest Tian') → Code Discovery → paperExtractUrls → paperFindGithubRepo → githubRepoInspect → list of verified attack codebases with README summaries.
Automated Workflows
Deep Research workflow conducts systematic review: searchPapers on 'black-box attacks' → citationGraph on Papernot et al. (2017) → structured report with 50+ papers ranked by citations. DeepScan applies 7-step analysis with CoVe checkpoints to verify Sharif et al. (2016) physical attack claims. Theorizer generates hypotheses on query reduction from Tian et al. (2018) and Jia and Liang (2017).
Frequently Asked Questions
What defines black-box adversarial attacks?
Attacks that query target models without gradient or parameter access, using transferability or direct optimization, as in Papernot et al. (2017).
What are key methods in black-box attacks?
Transfer-based attacks from surrogate models (Papernot et al., 2017) and query-efficient optimization; physical variants use printed perturbations (Sharif et al., 2016).
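As a concrete illustration of the query-efficient side, here is a SimBA-style greedy search in NumPy (after Guo et al., 2019, which is not among the papers listed above; the target model here is hypothetical): it probes one random coordinate per query and keeps only steps that lower the score the target reports.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical black-box target: it returns only a confidence score for the
# true class (no gradients, no logits for other classes).
w_hidden = rng.normal(size=50)

def true_class_score(x):
    return float(1 / (1 + np.exp(-(x @ w_hidden))))  # one API query per call

x = rng.normal(size=50)
s0 = true_class_score(x)
score, queries, eps = s0, 0, 0.3

# SimBA-style greedy search: try +/- eps along one random coordinate at a
# time; keep a step only if the reported score drops.
for i in rng.permutation(50):
    for sign in (1.0, -1.0):
        step = np.zeros(50)
        step[i] = sign * eps
        cand = true_class_score(x + step)
        queries += 1
        if cand < score:
            x, score = x + step, cand
            break
```

Since each coordinate gets at most two probes, the query budget is bounded by 2·d, and the accepted score can only decrease, which makes this family of attacks attractive against rate-limited APIs.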
What are the most cited papers?
Papernot et al. (2017, 3366 citations) on practical attacks; Fredrikson et al. (2015, 2650 citations) on model inversion; Sharif et al. (2016, 1531 citations) on physical attacks.
What open problems exist?
Improving query efficiency under rate limits, reliable transferability across defenses, and physical robustness beyond lab settings (Tian et al., 2018).
Research Adversarial Robustness in Machine Learning with AI
PapersFlow provides specialized AI tools for Computer Science researchers. Here are the most relevant for this topic:
AI Literature Review
Automate paper discovery and synthesis across 474M+ papers
Code & Data Discovery
Find datasets, code repositories, and computational tools
Deep Research Reports
Multi-source evidence synthesis with counter-evidence
AI Academic Writing
Write research papers with AI assistance and LaTeX support
See how researchers in Computer Science & AI use PapersFlow
Field-specific workflows, example queries, and use cases.
Start Researching Black-Box Adversarial Attacks with AI
Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.
See how PapersFlow works for Computer Science researchers