Understanding image representations by measuring their equivariance and equivalence

2014 · cs.CV · arXiv 1411.5908

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited. Aiming at filling this gap, we investigate three key mathematical properties of representations: equivariance, invariance, and equivalence. Equivariance studies how transformations of the input image are encoded by the representation, invariance being a special case where a transformation has no effect. Equivalence studies whether two representations, for example two different parametrisations of a CNN, capture the same visual information or not. A number of methods to establish these properties empirically are proposed, including introducing transformation and stitching layers in CNNs. These methods are then applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved. While the focus of the paper is theoretical, direct applications to structured-output regression are demonstrated too.
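
The definitions of equivariance and invariance in the abstract can be illustrated with a small numerical sketch. It is not the paper's method, only a toy example under stated assumptions: the representation `phi` is a hypothetical 1D stand-in (circular correlation with random filters followed by ReLU), the transformation `g` is a circular shift, and the map `M` is fitted by ordinary least squares, loosely in the spirit of the empirical approach the abstract describes. All names here are illustrative.

```python
import numpy as np

def phi(x, filters):
    """Toy representation (an assumption, not the paper's): circular correlation + ReLU."""
    # responses[f, i] = sum_j filters[f, j] * x[(i + j) % n]
    responses = np.stack([
        sum(w * np.roll(x, -j) for j, w in enumerate(f)) for f in filters
    ])
    return np.maximum(responses, 0.0)

def g(x, k=4):
    """Transformation of the input: circular shift by k samples along the last axis."""
    return np.roll(x, k, axis=-1)

rng = np.random.default_rng(0)
filters = rng.standard_normal((3, 5))
x = rng.standard_normal(32)

# Equivariance: transforming the input shifts the feature map in the same way,
# so phi(g x) can be predicted from phi(x).
equivariance_gap = np.abs(phi(g(x), filters) - g(phi(x, filters))).max()

# Invariance (special case): sum-pooling over positions removes the effect of g.
invariance_gap = np.abs(
    phi(g(x), filters).sum(axis=1) - phi(x, filters).sum(axis=1)
).max()

def features(batch):
    """Flatten phi for every signal in the batch into one row per signal."""
    return np.stack([phi(sample, filters).ravel() for sample in batch])

# Empirical check: fit a linear map M with features(x) @ M ~= features(g x),
# then measure how well it predicts on held-out signals.
train = rng.standard_normal((400, 32))
test = rng.standard_normal((50, 32))
M, *_ = np.linalg.lstsq(features(train), features(g(train)), rcond=None)
heldout_gap = np.abs(features(test) @ M - features(g(test))).max()

print(equivariance_gap, invariance_gap, heldout_gap)
```

In this toy setting the fitted `M` only needs to permute feature positions, so all three gaps should be at the level of floating-point error. For a real representation such as HOG or a CNN layer, the size of the held-out gap is the quantity of interest.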

representative citing papers

Transformer Field Theory: A Response-Theoretic Approach to Mechanistic Interpretability

cs.LG · 2026-05-24 · unverdicted · novelty 7.0

Transformer Field Theory frames the residual stream as a field, models patching as source insertion, and uses first-order sensitivities plus Green functions to predict and describe responses, with empirical tests on GPT-2 autoregressive models.

citing papers explorer

Showing 1 of 1 citing paper.

Transformer Field Theory: A Response-Theoretic Approach to Mechanistic Interpretability · cs.LG · 2026-05-24 · unverdicted · none · ref 25 · internal anchor
