Feature Attribution/Importance

Surveys

Gradient based Feature Attribution in Explainable AI: A Technical Review, arXiv preprint

Papers

An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records, EMNLP 2024, Blog

LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack, AAAI 2024

Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision, AAAI 2024

Using stratified sampling to improve LIME Image explanations, AAAI 2024

Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention, AAAI 2024

Empowering CAM-Based Methods with Capability to Generate Fine-Grained and High-Faithfulness Explanations, AAAI 2024

Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles, AAAI 2024

SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification of Top-K Features, AAAI 2024

Approximating the Shapley Value without Marginal Contributions, AAAI 2024

GLIME: General, Stable and Local LIME Explanation, NeurIPS 2023

Deeply Explain CNN via Hierarchical Decomposition, IJCV 2023

Negative Flux Aggregation to Estimate Feature Attributions, IJCAI 2023, code

On Minimizing the Impact of Dataset Shifts on Actionable Explanations, UAI 2023

Counterfactual-based Saliency Map: Towards Visual Contrastive Explanations for Neural Networks, CVPR 2023

A Practical Upper Bound for the Worst-Case Attribution Deviations, CVPR 2023

IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients, CVPR 2023

Explaining Image Classifiers with Multiscale Directional Image Representation, CVPR 2023

SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries, CVPR 2023

Extending class activation mapping using Gaussian receptive field, CVIU Journal 2023

TSGB: Target-selective gradient backprop for probing CNN visual saliency, TIP 2022

Transferable Adversarial Attack Based on Integrated Gradients, ICLR 2022

OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks, CVPR 2022

Consistent Explanations by Contrastive Learning, CVPR 2022

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers, CVPR 2022

REX: Reasoning-aware and Grounded Explanation, CVPR 2022

FAM: Visual Explanations for the Feature Representations from Deep Convolutional Networks, CVPR 2022

NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022

Do Explanations Explain? Model Knows Best, CVPR 2022

On Computing Probabilistic Explanations for Decision Trees, NeurIPS 2022

Exploiting the Relationship Between Kendall’s Rank Correlation and Cosine Similarity for Attribution Protection, NeurIPS 2022

Linear TreeShap, NeurIPS 2022

CS-SHAPLEY: Class-wise Shapley Values for Data Valuation in Classification, NeurIPS 2022

Consistent Sufficient Explanations and Minimal Local Rules for explaining any classifier or regressor, NeurIPS 2022

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound, NeurIPS 2022

Accurate Shapley Values for explaining tree-based models, AISTATS 2022

Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations, NeurIPS 2022

What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods, NeurIPS 2022

Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF, NeurIPS 2022

What You See is What You Classify: Black Box Attributions, NeurIPS 2022

Explaining Preferences with Shapley Values, NeurIPS 2022

Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability, NeurIPS 2022

Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability, NeurIPS 2022

Is this the Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations, NeurIPS 2022

Bayesian subset selection and variable importance for interpretable prediction and classification, NeurIPS 2022

Robust Models Are More Interpretable Because Attributions Look Normal, ICML 2022

Accelerating Shapley Explanation via Contributive Cooperator Selection, ICML 2022

Framework for Evaluating Faithfulness of Local Explanations, ICML 2022

XAI for Transformers: Better Explanations through Conservative Propagation, ICML 2022

A Functional Information Perspective on Model Interpretation, ICML 2022

A Psychological Theory of Explainability, ICML 2022

A Consistent and Efficient Evaluation Strategy for Attribution Methods, ICML 2022

A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions, ICML 2022

Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings, ICML 2022

Rational Shapley Values, FAccT 2022

Human Interpretation of Saliency-based Explanation Over Text, FAccT 2022

Higher-Order Explanations of Graph Neural Networks via Relevant Walks, TPAMI 2022

Explaining Explanations: Axiomatic Feature Interactions for Deep Networks, JMLR 2022

Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN, IJCAI 2022

Deriving Explainable Discriminative Attributes Using Confusion About Counterfactual Class, ICASSP 2022

FastSHAP: Real-Time Shapley Value Estimation, ICLR 2022

Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations, AAAI 2022

Backdoor Attacks on the DNN Interpretation System, AAAI 2022

Feature Importance Explanations for Temporal Black-Box Models, AAAI 2022

Evaluating Explainable AI on a Multi-Modal Medical Imaging Task: Can Existing Algorithms Fulfill Clinical Requirements?, AAAI 2022

Do Feature Attribution Methods Correctly Attribute Features?, AAAI 2022

Improving performance of deep learning models with axiomatic attribution priors and expected gradients, Nature Machine Intelligence 2021

One Explanation is Not Enough: Structured Attention Graphs for Image Classification, NeurIPS 2021

On Locality of Local Explanation Models, NeurIPS 2021

Shapley Residuals: Quantifying the limits of the Shapley value for explanations, NeurIPS 2021

The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations, NeurIPS 2021

Reliable Post hoc Explanations: Modeling Uncertainty in Explainability, NeurIPS 2021

Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis, NeurIPS 2021

Do Input Gradients Highlight Discriminative Features?, NeurIPS 2021

The effectiveness of feature attribution methods and its correlation with automatic evaluation scores, NeurIPS 2021

From global to local MDI variable importances for random forests and when they are Shapley values, NeurIPS 2021

Fast Axiomatic Attribution for Neural Networks, NeurIPS 2021

On Guaranteed Optimal Robust Explanations for NLP Models, IJCAI 2021

Explaining deep neural network models with adversarial gradient integration, IJCAI 2021

Integrated Directional Gradients: Feature Interaction Attribution for Neural NLP Models, ACL 2021

What does LIME really see in images?, ICML 2021

Explanations for Monotonic Classifiers, ICML 2021

Explaining Time Series Predictions with Dynamic Masks, ICML 2021

On Explainability of Graph Neural Networks via Subgraph Explorations, ICML 2021

Generative Causal Explanations for Graph Neural Networks, ICML 2021

How Interpretable and Trustworthy are GAMs?, KDD 2021

Leveraging Latent Features for Local Explanations, KDD 2021

S-LIME: Stabilized-LIME for Model Explanation, KDD 2021

An Experimental Study of Quantitative Evaluations on Saliency Methods, KDD 2021

TimeSHAP: Explaining Recurrent Models through Sequence Perturbations, KDD 2021

Black-box Explanation of Object Detectors via Saliency Maps, CVPR 2021

Interpreting Super-Resolution Networks with Local Attribution Maps, CVPR 2021

Transformer Interpretability Beyond Attention Visualization, CVPR 2021

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts, CVPR 2021

Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation, CVPR 2021

Relevance-CAM: Your Model Already Knows Where to Look, CVPR 2021, code

Guided integrated gradients: An adaptive path method for removing noise, CVPR 2021

An Analysis of LIME for Text Data, AISTATS 2021

Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression, AISTATS 2021

A Unified Taylor Framework for Revisiting Attribution Methods, AAAI 2021

If You Like Shapley Then You’ll Love the Core, AAAI 2021

Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations, AAAI 2021

Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation, AAAI 2021

Explainable Models with Consistent Interpretations, AAAI 2021

On the Tractability of SHAP Explanations, AAAI 2021

Interpreting Multivariate Shapley Interactions in DNNs, AAAI 2021

Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking, ICLR 2021

Scaling Symbolic Methods using Gradients for Neural Model Explanation, ICLR 2021

Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability, ICLR 2021

Shapley explainability on the data manifold, ICLR 2021

ICAM: Interpretable Classification via Disentangled Representations and Feature Attribution Mapping, NeurIPS 2020

What went wrong and when? Instance-wise Feature Importance for Time-series Models, NeurIPS 2020

How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods, NeurIPS 2020, code

Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability, NeurIPS 2020

Parameterized Explainer for Graph Neural Network, NeurIPS 2020

PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks, NeurIPS 2020

Visualizing the Impact of Feature Attribution Baselines, Distill 2020

There and Back Again: Revisiting Backpropagation Saliency Methods, CVPR 2020

Towards Visually Explaining Variational Autoencoders, CVPR 2020

Blur Integrated Gradients: Attribution in Scale and Space, CVPR 2020

Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution, arXiv preprint 2020

GCN-LRP explanation: exploring latent attention of graph convolutional networks, IJCNN 2020

Visualizing Deep Networks by Optimizing with Integrated Gradients, AAAI 2020

Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks, AAAI 2020

LS-Tree: Model Interpretation When the Data Are Linguistic, AAAI 2020, slides

Investigating Saturation Effects in Integrated Gradients, ICML Workshop on Human Interpretability in Machine Learning (WHI) 2020

Robust and Stable Black Box Explanations, ICML 2020

Concise Explanations of Neural Networks using Adversarial Training, ICML 2020

Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models, ICLR 2020

Feature relevance quantification in explainable AI: A causal problem, AISTATS 2020

You Shouldn’t Trust Me: Learning Models Which Conceal Unfairness From Multiple Explanation Methods, ECAI 2020

Bias also matters: Bias attribution for deep neural network explanation, ICML 2019

Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation, ICML 2019

On the Connection Between Adversarial Robustness and Saliency Map Interpretability, ICML 2019

Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Value Approximation, ICML 2019

Explainability Techniques for Graph Convolutional Networks, ICML Workshop 2019

FullGrad, Full-Gradient Representation for Neural Network Visualization, NeurIPS 2019

Towards Automatic Concept-based Explanations, NeurIPS 2019

GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019

On the (In)fidelity and Sensitivity of Explanations, NeurIPS 2019

Robust Attribution Regularization, NeurIPS 2019

Explanations can be manipulated and geometry is to blame, NeurIPS 2019

Interpretation of Neural Networks is Fragile, AAAI 2019

XRAI: Better Attributions Through Regions, ICCV 2019

Understanding Deep Networks via Extremal Perturbations and Smooth Masks, ICCV 2019

L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data, ICLR 2019

Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks, CVPR 2019

Explainability Methods for Graph Convolutional Neural Networks, CVPR 2019

This Looks Like That: Deep Learning for Interpretable Image Recognition, NeurIPS 2019

“Why Should You Trust My Explanation?” Understanding Uncertainty in LIME Explanations, ICML 2019

Gradient-Based vs. Propagation-Based Explanations: An Axiomatic Comparison, in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 253–265, Springer 2019

The Many Shapley Values for Model Explanation, arXiv preprint 2019

Explaining the Explainer: A First Theoretical Analysis of LIME, arXiv preprint 2020

VarGrad, Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values, ICLR 2018 Workshop

NoiseTunnel, Sanity checks for saliency maps, NeurIPS 2018

Towards Robust Interpretability with Self-Explaining Neural Networks, NeurIPS 2018

Model Agnostic Supervised Local Explanations, NeurIPS 2018

Integrated Gradients, Did the Model Understand the Question?, ACL 2018

Neuron Integrated Gradients: Computationally Efficient Measures of Internal Neuron Importance, arXiv preprint 2018

TCAV: Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018

A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations, ICML 2018

L2X: Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, ICML 2018, code

Noise-adding Methods of Saliency Map as Series of Higher Order Partial Derivative, ICML 2018 workshop

InternalInfluence, Influence-Directed Explanations for Deep Convolutional Networks, IEEE International Test Conference 2018

Interpretable Basis Decomposition for Visual Explanation, ECCV 2018

Grounding Visual Explanations, ECCV 2018

RuleMatrix: Visualizing and Understanding Classifiers with Rules, TVCG 2018

Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models, TVCG 2018

Top-down neural attention by excitation backprop, IJCV 2018 (ECCV 2016)

RISE: Randomized Input Sampling for Explanation of Black-box Models, BMVC 2018

SHAP: A Unified Approach to Interpreting Model Predictions, NeurIPS 2017
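
For orientation, a minimal usage sketch with the `shap` package's model-agnostic KernelExplainer (the kernel-based estimator from this paper); the dataset, classifier, and sample counts below are illustrative assumptions, not part of the paper:

```python
# pip install shap scikit-learn
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# A small background sample defines the distribution used to simulate
# "missing" features when estimating Shapley values.
background = shap.sample(X, 100)
explainer = shap.KernelExplainer(clf.predict_proba, background)
# Per-class attribution arrays for a few instances (shape depends on the shap version).
shap_values = explainer.shap_values(X[:5], nsamples=200)
```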

Real Time Image Saliency for Black Box Classifiers, NeurIPS 2017

Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition 2017

Interpretable Explanations of Black Boxes by Meaningful Perturbation, ICCV 2017

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization, ICCV 2017

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, ICCV 2017, IJCV 2019
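
A minimal Grad-CAM sketch in PyTorch, assuming a CNN `model`, an unbatched image tensor `x`, a `target` class index, and a handle to the last convolutional block (for example `model.layer4` on a torchvision ResNet); it illustrates the pooled-gradient weighting of the method, not the authors' released code:

```python
import torch

def grad_cam(model, x, target, conv_layer):
    """Weight the activations of `conv_layer` by their pooled gradients (Grad-CAM sketch)."""
    store = {}
    fwd = conv_layer.register_forward_hook(lambda m, i, o: store.update(act=o))
    bwd = conv_layer.register_full_backward_hook(lambda m, gi, go: store.update(grad=go[0]))
    try:
        score = model(x.unsqueeze(0))[0, target]   # logit of the target class
        model.zero_grad()
        score.backward()
    finally:
        fwd.remove()
        bwd.remove()
    act, grad = store["act"][0], store["grad"][0]  # (C, H, W)
    weights = grad.mean(dim=(1, 2))                # global-average-pool the gradients
    cam = torch.relu((weights[:, None, None] * act).sum(dim=0))
    return cam / (cam.max() + 1e-8)                # upsample to input size before overlaying
```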

Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017

DeepLIFT: Learning important features through propagating activation differences, ICML 2017

Integrated Gradients: Axiomatic attribution for deep networks, ICML 2017
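
A minimal sketch of the straight-line-path approximation described in this paper, assuming a PyTorch classifier `model` that maps a batch to logits and an input/baseline pair of equal shape; the step count and baseline choice are user decisions:

```python
import torch

def integrated_gradients(model, x, baseline, target, steps=50):
    """Approximate IG along the straight-line path from baseline to input."""
    # Interpolation path of shape (steps + 1, *x.shape).
    alphas = torch.linspace(0.0, 1.0, steps + 1).view(-1, *([1] * x.dim()))
    path = (baseline + alphas * (x - baseline)).requires_grad_(True)
    # Gradient of the target logit at every point on the path.
    score = model(path)[:, target].sum()
    grads, = torch.autograd.grad(score, path)
    # Trapezoidal rule over the path, then scale by the input-baseline difference.
    avg_grads = (grads[:-1] + grads[1:]).div(2).mean(dim=0)
    return (x - baseline) * avg_grads
```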

SmoothGrad: removing noise by adding noise, ICML 2017
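
A short sketch of the core idea (averaging input gradients over noisy copies of the input), assuming the same kind of PyTorch classifier as in the Integrated Gradients sketch above; hyperparameters are illustrative:

```python
import torch

def smoothgrad(model, x, target, n_samples=25, noise_level=0.10):
    """Average the input gradient over noisy copies of the input (SmoothGrad sketch)."""
    sigma = noise_level * (x.max() - x.min())
    total = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy.unsqueeze(0))[0, target]
        grad, = torch.autograd.grad(score, noisy)
        total += grad
    return total / n_samples
```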

Visualizing deep neural network decisions: Prediction difference analysis, ICLR 2017

Lime: "Why Should I Trust You?": Explaining the Predictions of Any Classifier, SIGKDD 2016

Visualizing deep convolutional neural networks using natural pre-images, IJCV 2016

Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models, arXiv preprint 2016

Salient deconvolutional networks, ECCV 2016

Top-down Neural Attention by Excitation Backprop, ECCV 2016

LRP: Layer-wise relevance propagation for neural networks with local renormalization layers, ICANN 2016

Gradient * input: Not Just a Black Box: Learning Important Features Through Propagating Activation Differences, arXiv preprint 2016

Investigating the influence of noise and distractors on the interpretation of neural networks, NeurIPS 2016

QII, Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems, IEEE Symposium on Security and Privacy (SP) 2016

epsilon-LRP, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PLoS ONE 2015

Perturbation-based method, Predicting effects of noncoding variants with deep learning–based sequence model, Nature Methods 2015

CAM: Learning Deep Features for Discriminative Localization, CVPR 2016
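
A minimal CAM sketch, assuming a GAP-then-linear architecture (for example a torchvision ResNet, with `features` taken before global average pooling and `fc_weight = model.fc.weight`); the names are hypothetical and this is not the authors' code:

```python
import torch

def class_activation_map(features, fc_weight, target):
    """CAM sketch: project the classifier weights of `target` onto the last conv feature maps."""
    # features: (C, H, W) activations of the last conv layer for one image
    # fc_weight: (num_classes, C) weights of the final linear layer after global average pooling
    cam = torch.einsum("c,chw->hw", fc_weight[target], features)
    cam = torch.relu(cam)
    return cam / (cam.max() + 1e-8)   # normalize, then upsample for overlay
```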

Guided Backpropagation, Striving for simplicity: The all convolutional net, ICLR 2015

Understanding neural networks through deep visualization, arXiv preprint 2015

Backpropagation: Deep inside convolutional networks: Visualising image classification models and saliency maps, ICLR 2014

Deconvnet: Visualizing and Understanding Convolutional Networks, ECCV 2014

Shapley sampling values: Explaining prediction models and individual predictions with feature contributions, Knowledge and Information Systems 2014
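
A sketch of the sampling approximation in this line of work: average marginal contributions over random feature orderings, replacing "absent" features with values from a background instance. `predict_fn`, `x`, and `background` are assumed to be a NumPy-compatible callable and 1-D arrays:

```python
import numpy as np

def shapley_sampling(predict_fn, x, background, n_permutations=200, rng=None):
    """Monte Carlo Shapley values via random feature orderings (sampling sketch)."""
    rng = rng or np.random.default_rng(0)
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_permutations):
        z = background.copy()
        prev = predict_fn(z[None, :])[0]
        for j in rng.permutation(d):
            z[j] = x[j]                      # add feature j to the growing coalition
            cur = predict_fn(z[None, :])[0]
            phi[j] += cur - prev
            prev = cur
    return phi / n_permutations              # sums (approximately) to f(x) - f(background)
```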

Bounding the Estimation Error of Sampling-based Shapley Value Approximation, arXiv preprint 2013

Permutation importance: a corrected feature importance measure, Bioinformatics 2010
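
For reference, a sketch of plain permutation importance (the mean score drop when one feature column is shuffled); the paper's PIMP correction additionally builds a null distribution by permuting the response, which is omitted here. `model`, `score_fn`, and the array shapes are assumed sklearn-style conventions:

```python
import numpy as np

def permutation_importance(model, X, y, score_fn, n_repeats=10, seed=0):
    """Mean drop in score when one feature column is shuffled (permutation importance sketch)."""
    rng = np.random.default_rng(seed)
    baseline = score_fn(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])            # break the association of feature j with the target
            drops.append(baseline - score_fn(y, model.predict(Xp)))
        importances[j] = np.mean(drops)
    return importances
```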

How to explain individual classification decisions, Journal of Machine Learning Research 2010

An Efficient Explanation of Individual Classifications using Game Theory, Journal of Machine Learning Research 2010

Explaining Classifications for Individual Instances, TKDE 2008

Review and comparison of methods to study the contribution of variables in artificial neural network models, Ecological Modelling 2003