Learning how to explain neural networks: PatternNet and PatternAttribution

Been Kim; Dumitru Erhan; Klaus-Robert Müller; Kristof T. Schütt; Maximilian Alber; Pieter-Jan Kindermans; Sven Dähne

arxiv: 1705.05598 · v2 · pith:YHSKJKISnew · submitted 2017-05-16 · 📊 stat.ML · cs.LG

Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne

classification 📊 stat.ML cs.LG

keywords linear · networks · models · neural · explanation · deep · methods · pattern · attribution

abstract

DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern, since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, that is, for linear models. Based on our analysis of linear models, we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
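
The claim about linear models can be made concrete with a small numerical sketch. The abstract does not specify a data model or how PatternNet and PatternAttribution are computed, so everything below is an assumption chosen for illustration: the two-dimensional toy data, the names `a_s` (signal direction) and `a_d` (distractor direction), and the covariance-based direction estimate used as a stand-in for a data-driven explanation. The point is only that the weight vector of a perfectly accurate linear model need not point along the direction in which the signal actually appears in the input.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical toy data (an assumption, not taken from the abstract):
# each input is a signal component plus a distractor component.
a_s = np.array([1.0, 0.0])   # direction along which the target y shows up
a_d = np.array([1.0, 1.0])   # direction along which the distractor shows up
y = rng.normal(size=n)       # quantity the model should recover
eps = rng.normal(size=n)     # distractor, independent of y
X = np.outer(y, a_s) + np.outer(eps, a_d)

# A linear model that recovers y: w has unit response to a_s
# and is orthogonal to a_d, so the distractor cancels out.
w = np.array([1.0, -1.0])
y_hat = X @ w

# The gradient of a linear model with respect to its input is w itself,
# so a purely gradient-based explanation reports w.
cos_w_signal = (w @ a_s) / (np.linalg.norm(w) * np.linalg.norm(a_s))

# Illustrative data-driven estimate: covariance of input and output,
# normalised by the output variance.
Xc = X - X.mean(axis=0)
yc = y_hat - y_hat.mean()
pattern = (Xc.T @ yc) / n / y_hat.var()

print("weight vector:       ", w)
print("cosine(w, a_s):      ", cos_w_signal)   # 1/sqrt(2), i.e. 45 degrees apart
print("covariance estimate: ", pattern)        # close to a_s for large n
```

In this sketch the weight vector has to account for the distractor in order to cancel it, which is why it ends up 45 degrees away from the signal direction, while the covariance-based estimate lands close to `a_s`.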

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

In Defense of Information Leakage in Concept-based Models
cs.LG 2026-06 conditional novelty 7.0

Concept-based models can use controlled 'benign' information leakage to remain accurate and intervenable under real-world concept incompleteness by reframing their training objective.

TEVI: Text-Conditioned Editing of Visual Representations via Sparse Autoencoders for Improved Vision-Language Alignment
cs.CV 2026-06 unverdicted novelty 6.0

TEVI applies sparse autoencoders and caption-conditioned masking to edit image embeddings, yielding better retrieval on MS COCO, Flickr, IIW, DOCCI, and RoCOCO benchmarks with larger gains on richer captions.