Subtopic Deep Dive

Feature Importance and Attribution Methods
Research Guide

What Are Feature Importance and Attribution Methods?

Feature Importance and Attribution Methods quantify individual feature contributions to predictions in black-box machine learning models using techniques like SHAP, LIME, and Integrated Gradients.

These methods provide local and global explanations for model decisions across tabular, image, and text data. Key approaches include SHAP (SHapley Additive exPlanations) for game-theoretic attributions and LIME (Local Interpretable Model-agnostic Explanations) for surrogate model approximations. Surveys by Carvalho et al. (2019, 1644 citations) and Samek et al. (2021, 1177 citations) cover over 50 methods and evaluation metrics.

10 Curated Papers · 3 Key Challenges

Why It Matters

Attribution methods enable debugging of biased models in healthcare, as Tjoa and Guan (2020, 1908 citations) show for medical XAI, and support regulatory compliance under the EU AI Act's requirements for high-risk systems. Holzinger et al. (2019, 1530 citations) highlight causability for clinical trust, while Markus et al. (2020, 678 citations) demonstrate improved adoption in health informatics. Slack et al. (2020, 680 citations) reveal vulnerabilities such as adversarial fooling of LIME and SHAP, a critical concern for robust deployment in finance and criminal justice.

Key Research Challenges

Attribution Stability

Explanations vary across perturbations, as Slack et al. (2020) demonstrate by fooling LIME and SHAP with minimal input changes. This undermines reliability in sensitive applications. Zhou et al. (2021, 537 citations) survey metrics showing low consistency across methods.
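The stability concern above can be made concrete with a simple metric: perturb the input slightly and measure how far the attributions move. Below is a minimal NumPy sketch of such a metric, illustrative rather than any specific published estimator, using input-times-gradient attributions on a toy linear model.

```python
import numpy as np

def input_x_gradient(w, x):
    """Input-times-gradient attribution for a linear model f(x) = w @ x."""
    return w * x

def stability(attr_fn, x, eps=0.01, n_samples=100, seed=0):
    """Worst-case L2 change in attributions under random perturbations
    of radius eps. Lower values mean more stable explanations."""
    rng = np.random.default_rng(seed)
    base = attr_fn(x)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=x.shape)
        worst = max(worst, np.linalg.norm(attr_fn(x + delta) - base))
    return worst

# toy linear model and input (illustrative values)
w = np.array([2.0, -1.0, 0.5])
x = np.array([1.0, 3.0, -2.0])
score = stability(lambda z: input_x_gradient(w, z), x)
```

For the linear model the attributions barely move, so the score is tiny; on real neural networks the same metric can be large, which is exactly the instability Slack et al. exploit.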

Local Fidelity Evaluation

Measuring how well attributions match model behavior locally remains inconsistent. Carvalho et al. (2019) review metrics like deletion AUC but note lack of standardization. Samek et al. (2021) emphasize need for application-specific validation.
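One way to quantify local fidelity is to fit a LIME-style linear surrogate around an input and report its R² against the black-box model on nearby samples. The sketch below is bare-bones, omitting LIME's proximity weighting and feature selection; the model and sampling parameters are illustrative assumptions.

```python
import numpy as np

def lime_style_surrogate(f, x, n_samples=500, sigma=0.5, seed=0):
    """Fit a local linear surrogate around x by least squares and return
    its coefficients plus the local R^2 fidelity score."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, sigma, size=(n_samples, len(x)))
    y = np.array([f(z) for z in Z])
    A = np.column_stack([Z, np.ones(n_samples)])  # design matrix with bias
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return coef[:-1], r2

# mildly nonlinear toy model; its gradient at x = (1, 2) is (2, 1)
f = lambda z: z[0] ** 2 + z[1]
x = np.array([1.0, 2.0])
weights, r2 = lime_style_surrogate(f, x)
```

The recovered weights approximate the local gradient, and the R² gap from 1.0 measures exactly the local-fidelity shortfall that the surveyed metrics try to standardize.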

Computational Scalability

SHAP and Integrated Gradients scale poorly to high-dimensional data such as images. Burkart and Huber (2021, 900 citations) identify this as a barrier to real-time use. Hassija et al. (2023, 1280 citations) call for efficient approximations.

Essential Papers

1.

A Survey on Explainable Artificial Intelligence (XAI): Toward Medical XAI

Erico Tjoa, Cuntai Guan · 2020 · IEEE Transactions on Neural Networks and Learning Systems · 1.9K citations

Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances in many tasks, from image processing to natural language processing, especially with the ...

2.

Machine Learning Interpretability: A Survey on Methods and Metrics

Diogo V. Carvalho, Eduardo M. Pereira, Jaime S. Cardoso · 2019 · Electronics · 1.6K citations

Machine learning systems are becoming increasingly ubiquitous. These systems' adoption has been expanding, accelerating the shift towards a more algorithmic society, meaning that algorithmically i...

3.

Causability and explainability of artificial intelligence in medicine

Andreas Holzinger, Georg Langs, Helmut Denk et al. · 2019 · Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery · 1.5K citations

Explainable artificial intelligence (AI) is attracting much interest in medicine. Technically, the problem of explainability is as old as AI itself and classic AI represented comprehensible retrace...

4.

Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence

Vikas Hassija, Vinay Chamola, A. Mahapatra et al. · 2023 · Cognitive Computation · 1.3K citations

Abstract Recent years have seen a tremendous growth in Artificial Intelligence (AI)-based methodological development in a broad range of domains. In this rapidly evolving field, large number of met...

5.

Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications

Wojciech Samek, Grégoire Montavon, Sebastian Lapuschkin et al. · 2021 · Proceedings of the IEEE · 1.2K citations

With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for Explainable AI. Interpretability and explanation methods for gai...

6.

A Survey on the Explainability of Supervised Machine Learning

Nadia Burkart, Marco F. Huber · 2021 · Journal of Artificial Intelligence Research · 900 citations

Predictions obtained by, e.g., artificial neural networks have a high accuracy but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. ...

7.

Fooling LIME and SHAP

Dylan Slack, Sophie Hilgard, Emily Jia et al. · 2020 · Proceedings of the AAAI/ACM Conference on AI Ethics and Society · 680 citations

As machine learning black boxes are increasingly being deployed in domains such as healthcare and criminal justice, there is growing emphasis on building tools and techniques for explaining these b...

Reading Guide

Foundational Papers

No pre-2015 papers are included; as pseudo-foundational overviews, start with Carvalho et al. (2019) for a broad survey of methods and metrics, then Samek et al. (2021) for DNN-specific attributions.

Recent Advances

Hassija et al. (2023) review black-box interpretation methods; Burkart and Huber (2021) survey supervised ML explainability; Zhou et al. (2021) focus on explanation quality metrics.

Core Methods

SHAP (game theory), LIME (local surrogates), Integrated Gradients (gradient paths), Anchors (high-precision rules), permutation importance (model-agnostic), as classified in Carvalho et al. (2019) and Samek et al. (2021).
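For intuition about the first of these methods: the Shapley values that SHAP approximates can be computed exactly on tiny inputs by enumerating every feature coalition. The exponential cost of this enumeration is precisely what SHAP's approximations avoid; the linear model and values below are illustrative.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by brute-force coalition enumeration.
    Features outside a coalition are set to the baseline value."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                z = baseline.copy()
                z[list(S)] = x[list(S)]
                without_i = f(z)        # value of coalition S
                z[i] = x[i]
                with_i = f(z)           # value of coalition S ∪ {i}
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (with_i - without_i)
    return phi

w = np.array([2.0, -1.0, 0.5])
f = lambda z: float(w @ z)
x = np.array([1.0, 3.0, -2.0])
baseline = np.zeros(3)
phi = shapley_values(f, x, baseline)
# for a linear model, phi_i = w_i * (x_i - baseline_i)
```

The efficiency axiom is easy to check here: the attributions sum to f(x) − f(baseline), the "additive" property that gives SHAP its name.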

How PapersFlow Helps You Research Feature Importance and Attribution Methods

Discover & Search

Research Agent uses searchPapers('SHAP LIME feature attribution stability') to find 50+ papers including Slack et al. (2020), then citationGraph reveals Tjoa and Guan (2020) as highly cited hub, while findSimilarPapers on Carvalho et al. (2019) uncovers metrics-focused works.

Analyze & Verify

Analysis Agent applies readPaperContent on Slack et al. (2020) to extract fooling experiments, verifyResponse with CoVe checks attribution claims against raw data, and runPythonAnalysis recreates SHAP values using NumPy/pandas for tabular models with GRADE scoring for evidence strength.

Synthesize & Write

Synthesis Agent detects gaps in stability metrics via contradiction flagging across Zhou et al. (2021) and Samek et al. (2021), while Writing Agent uses latexEditText for explanation equations, latexSyncCitations for 20+ refs, latexCompile for PDF, and exportMermaid for attribution workflow diagrams.

Use Cases

"Reproduce SHAP fooling attack from Slack 2020 on tabular data"

Research Agent → searchPapers('Fooling LIME SHAP') → Analysis Agent → readPaperContent(Slack et al.) → runPythonAnalysis(SHAP computation + adversarial perturbation sandbox) → matplotlib plot of stability metrics.

"Write LaTeX section comparing SHAP vs LIME fidelity metrics"

Research Agent → citationGraph(Carvalho 2019) → Synthesis → gap detection → Writing Agent → latexEditText(draft) → latexSyncCitations(10 papers) → latexCompile(PDF with tables).

"Find GitHub repos implementing Integrated Gradients from XAI surveys"

Research Agent → exaSearch('Integrated Gradients code') → Code Discovery → paperExtractUrls(Samek 2021) → paperFindGithubRepo → githubRepoInspect(analysis scripts + runPythonAnalysis verification).

Automated Workflows

Deep Research workflow scans 50+ papers via searchPapers on 'feature attribution metrics', structures report with Carvalho et al. (2019) as anchor, and GRADE-scores claims. DeepScan applies 7-step CoVe chain to verify Slack et al. (2020) experiments with runPythonAnalysis checkpoints. Theorizer generates hypotheses on scalable attributions from Burkart and Huber (2021) gaps.

Frequently Asked Questions

What defines feature importance methods?

Techniques like SHAP compute Shapley values for fair feature contributions, LIME fits local surrogate models, and Integrated Gradients accumulates gradients along a path from a baseline to the input, as surveyed in Carvalho et al. (2019).
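The path-integral idea behind Integrated Gradients can be sketched in a few lines of NumPy with a Riemann-sum approximation; the toy model and its analytic gradient below are illustrative assumptions (real uses compute gradients via autodiff).

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=200):
    """Integrated Gradients: (x - baseline) times the average gradient
    along the straight path from baseline to x (midpoint Riemann sum)."""
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([grad_f(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

# toy model f(x) = x0^2 + x0*x1 with its analytic gradient
f = lambda z: z[0] ** 2 + z[0] * z[1]
grad_f = lambda z: np.array([2 * z[0] + z[1], z[0]])
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
ig = integrated_gradients(grad_f, x, baseline)
```

The completeness property holds by construction: the attributions sum to f(x) − f(baseline), which is what distinguishes Integrated Gradients from a plain gradient-at-the-input saliency map.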

What are main evaluation methods?

Metrics include faithfulness (deletion AUC), stability (consistency under noise), and plausibility (human alignment), detailed in Zhou et al. (2021) with benchmarks across 20+ methods.
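The deletion metric mentioned above can be sketched directly: replace features with a baseline value in order of attribution magnitude and track how fast the model output falls. A faithful ranking drops the output faster, giving a smaller area under the curve; the linear model below is an illustrative assumption.

```python
import numpy as np

def deletion_curve(f, x, attributions, baseline):
    """Record the model output as features are replaced by their
    baseline values, most-important first."""
    order = np.argsort(-np.abs(attributions))
    z = x.copy()
    outputs = [f(z)]
    for i in order:
        z[i] = baseline[i]
        outputs.append(f(z))
    return np.array(outputs)

def deletion_auc(curve):
    """Normalized area under the deletion curve (trapezoidal rule)."""
    return float((curve[:-1] + curve[1:]).sum() / 2) / (len(curve) - 1)

w = np.array([3.0, 2.0, 1.0])
f = lambda z: float(w @ z)
x = np.ones(3)
baseline = np.zeros(3)

faithful = w * x                 # exact attributions for this model
reversed_attr = faithful[::-1]   # deliberately wrong ranking
auc_good = deletion_auc(deletion_curve(f, x, faithful, baseline))
auc_bad = deletion_auc(deletion_curve(f, x, reversed_attr, baseline))
```

Here the faithful ranking yields the lower (better) deletion AUC; benchmark suites apply the same comparison across many methods and inputs.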

What are key papers?

Carvalho et al. (2019, 1644 citations) surveys methods/metrics; Samek et al. (2021, 1177 citations) covers DNN explanations; Slack et al. (2020, 680 citations) exposes LIME/SHAP vulnerabilities.

What are open problems?

Challenges include causal attributions beyond correlational importance (Holzinger et al., 2019), scalability to billion-parameter models (Burkart and Huber, 2021), and standardized benchmarks (Zhou et al., 2021).

Research Explainable Artificial Intelligence (XAI) with AI

PapersFlow provides specialized AI tools for Computer Science researchers. Here are the most relevant for this topic:

See how researchers in Computer Science & AI use PapersFlow

Field-specific workflows, example queries, and use cases.

Computer Science & AI Guide

Start Researching Feature Importance and Attribution Methods with AI

Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.

See how PapersFlow works for Computer Science researchers