pith. sign in

arxiv: 2606.27864 · v1 · pith:RISG5LNUnew · submitted 2026-06-26 · 💻 cs.CV · cs.LG

A Unified Framework for Vision Transformers Equivariant to Discrete Subgroups of O(2)

Pith reviewed 2026-06-29 04:34 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords vision transformersequivarianceO(2) subgroupsself-attentiondiscrete symmetrieshexagonal patchesaerial image recognition
0
0 comments X

The pith

Vision transformers can be made equivariant to arbitrary discrete subgroups of O(2) with natural embeddings between symmetry groups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a unified construction for vision transformers that respect the symmetries of any discrete subgroup of the orthogonal group O(2). This generalizes earlier special cases built only for reflections or the square dihedral group D4. The construction supplies equivariant versions of self-attention and other transformer blocks, together with two expressivity results: when one symmetry group H sits inside another G, every G-equivariant model embeds into an H-equivariant one; and, in the single-head case, the equivariant attention layer can reproduce every symmetry-respecting attention map that ordinary attention could produce under the same group. A hexagonal-patch version is also built for the six-fold group D6 and tested on aerial imagery in low-data regimes.

Core claim

We introduce a family of vision transformers equivariant to arbitrary discrete subgroups of O(2), yielding equivariant analogues of the core transformer components. Whenever H ≤ G, the class of G-equivariant ViTs embeds naturally into the class of H-equivariant ViTs. In the single-head setting, the corresponding equivariant self-attention layer realizes every G-equivariant self-attention map representable by ordinary self-attention. We further construct a D6-equivariant model based on hexagonal patches.

What carries the argument

Equivariant self-attention layers built from group actions on homogeneous spaces, together with the embedding map that sends G-equivariant parameters into H-equivariant ones when H is a subgroup of G.

If this is right

  • The same transformer weights can serve as both G-equivariant and H-equivariant models whenever H is contained in G.
  • A single-head equivariant attention layer is fully expressive for the space of G-equivariant attention maps.
  • Hexagonal patches allow a D6-equivariant transformer compatible with six-fold rotational symmetry.
  • Equivariant models can be compared directly across different discrete subgroups on the same dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The embedding result suggests a systematic way to study how increasing the enforced symmetry group trades off model capacity against data efficiency.
  • The hexagonal construction opens the possibility of equivariant transformers on non-Cartesian grids without first resampling the input.
  • Performance differences across homogeneous-space choices in the nonlinearities point to a tunable design knob that may be optimized per dataset.

Load-bearing premise

Equivariant layers and nonlinearities can be realized by choosing homogeneous-space configurations that are compatible with the chosen discrete subgroup action.

What would settle it

An explicit G-equivariant attention map that can be written with ordinary self-attention but cannot be reproduced by the constructed equivariant layer, or a concrete pair H ≤ G for which no embedding of the full G-equivariant ViT class into the H-equivariant class exists.

Figures

Figures reproduced from arXiv: 2606.27864 by Georg B\"okman, T\={\i}kun \^Ong.

Figure 1
Figure 1. Figure 1: Feature maps of a 𝐷6-equivariant vision transformer after four transformer blocks in a trained classification model. For each irrep 𝜌 ∈ 𝐷c6, we select a single channel from the irrep component 𝑥 𝜌 ∈ R H ⊗ R 𝐶𝜌 ⊗ 𝑉𝜌. The features in the two-dimensional irreps E1 and E2 are represented by encoding the polar angle and length of a vector in R 2 using the hue and a combination of saturation and brightness respe… view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of the equivariant patch embedding layer with 𝐺 = 𝐷6. 𝛼 ∈ [𝜈𝜌], the linear map defined by 𝐸 𝜌 𝛼 𝑗 ↦→ 𝑒 𝑗 is 𝐺-equivariant from span{𝐸 𝜌 𝛼1 , · · · , 𝐸𝜌 𝛼𝑑𝜌 } onto𝑉𝜌. Then, the 𝜌-th component of the patch embedding layer is given by 𝑦 𝜌 (𝑎)𝑐 = Í 𝑐 ′ ,𝑘,𝑙,𝛼,𝜇 𝐾 𝜌 𝛼𝑐𝑐′ ;𝜇 (𝐿𝜇)𝑘𝑙 ⟨𝐸 𝜌 𝛼𝑙, 𝑥(𝑎)𝑐 ′⟩ 𝑒𝑘, where 𝐾 𝜌 ∈ R 𝜈𝜌×𝐶𝜌×3×dim End𝐺 (𝑉𝜌 ) is a learnable tensor. We can intepret 𝐹 𝜌 𝑐𝑐′𝑘 := Í 𝛼,𝜇 𝐾 𝜌… view at source ↗
Figure 3
Figure 3. Figure 3: Illustration of our equivariant nonlinearity for 𝐺 = 𝐷3. In this case, 𝐶A1 = 8, 𝐶A2 = 4, 𝐶E1 = 4, 𝑛{𝑒} = 1, 𝑛{𝑒,𝑡 } = 2, 𝑛𝐶3 = 3, and 𝑛𝐷3 = 2. For fixed 𝜎 and up to 𝐺-equivariant linear bi￾jections 𝑉 ′ → 𝑉 ′ , the map represented by Eq. (10) depends only on the orbit structure of 𝑋. That is, we can decompose 𝑋 into a disjoint union of copies of homogeneous 𝐺-spaces, 𝑋 = Ä 𝛼∈Sub(G)/∼ Ä𝑛𝛼 𝑠=1 𝑋𝛼, (11) and th… view at source ↗
Figure 4
Figure 4. Figure 4: Top: equivariant models on the square grid. Bottom: equivariant models on the hexagonal grid [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Comparison of irrep-wise and coupled attention with 10% (left) and 100% (right) of the training data. 11 [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Performance comparison of 𝐷4-equivariant ViTs with different combinations of homogeneous spaces in the MLP layers. Instead of varying the number of homogeneous spaces 𝑛 independently of 𝑞, we fix three “MLP ratios”, defined as 𝑛 3𝑞 . The findings are summarized in [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗
read the original abstract

Vision transformers have become a dominant architecture for visual recognition. However, standard models do not explicitly encode the planar symmetries that arise in many vision domains. We introduce a family of vision transformers equivariant to arbitrary discrete subgroups of $\mathrm{O}(2)$, providing a unified framework that generalizes prior flipping- and $D_4$-equivariant transformer architectures. Our construction yields equivariant analogues of the core transformer components, together with expressivity guarantees for the resulting layers. In particular, we show that whenever $H \le G$, the class of $G$-equivariant ViTs embeds naturally into the class of $H$-equivariant ViTs. We also prove that, in the single-head setting, the corresponding equivariant self-attention layer realizes every $G$-equivariant self-attention map representable by ordinary self-attention. We further construct a $D_6$-equivariant model based on hexagonal patches, making the architecture compatible with six-fold rotational symmetries. We evaluate the resulting models on the PatternNet aerial image dataset in artificially data-scarce regimes across subgroups of $D_4$ and $D_6$. Our experiments compare two equivariant attention mechanisms and analyze how the choice of homogeneous-space configurations used in the nonlinearities affects performance. Preliminary results under matched parameter budgets indicate that equivariance can improve recognition accuracy, motivating further study of how discrete symmetry groups shape transformer-based visual recognition models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces a unified framework for vision transformers equivariant to arbitrary discrete subgroups of O(2), generalizing prior flipping- and D4-equivariant architectures. It constructs equivariant versions of core transformer components (including self-attention and nonlinearities via homogeneous spaces) and proves two central results: whenever H ≤ G the class of G-equivariant ViTs embeds naturally into the class of H-equivariant ViTs, and in the single-head setting the equivariant self-attention layer realizes every G-equivariant self-attention map representable by ordinary self-attention. The framework is instantiated for D6 symmetry using hexagonal patches; models are evaluated on PatternNet in data-scarce regimes, comparing two equivariant attention mechanisms and analyzing the effect of homogeneous-space configuration choices.

Significance. If the stated proofs hold, the embedding and expressivity guarantees constitute a substantive theoretical advance that unifies and extends existing equivariant transformer constructions. The D6 hexagonal instantiation and the explicit treatment of homogeneous-space nonlinearities are concrete strengths. The preliminary PatternNet experiments provide initial evidence that equivariance can yield accuracy gains under matched parameter budgets in low-data regimes, though the results remain exploratory.

minor comments (3)
  1. [§4] §4 (or equivalent): the description of the two equivariant attention mechanisms should include an explicit side-by-side comparison of their computational complexity and parameter counts under the same group action.
  2. [Experiments] Experiments section, Table 1 or equivalent: report the number of random seeds and any statistical significance tests for the reported accuracy improvements; the current 'preliminary results' phrasing leaves the robustness of the gains unclear.
  3. The notation for homogeneous-space configurations in the nonlinearity construction is introduced without a dedicated summary table; adding one would improve readability when comparing D4 and D6 cases.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivations are self-contained mathematical constructions

full rationale

The central claims consist of explicit constructions of equivariant ViT components (layers, nonlinearities via homogeneous spaces) together with stated proofs of the natural embedding of G-equivariant ViTs into H-equivariant ones whenever H ≤ G, and of the expressivity result for single-head equivariant self-attention. These steps are derived directly from the definitions of group actions, equivariance, and standard transformer blocks; they do not reduce to fitted quantities, self-citations that carry the load of the result, or ansatzes smuggled from prior work by the same authors. Experiments on PatternNet are presented separately as empirical support and do not enter the derivation chain. The paper is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework extends standard equivariant neural network constructions from finite group representation theory without introducing new free parameters or postulated entities beyond the group actions already studied in prior work.

axioms (1)
  • standard math Standard facts from finite group representation theory and equivariant function spaces on homogeneous spaces
    Invoked to define the equivariant analogues of transformer components and the nonlinearities.

pith-pipeline@v0.9.1-grok · 5796 in / 1304 out tokens · 62718 ms · 2026-06-29T04:34:01.338344+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

47 extracted references · 4 canonical work pages · 1 internal anchor

  1. [1]

    URLhttp://arxiv.org/abs/1906.04015

    BrandonAnderson,Truong-SonHy,andRisiKondor.Cormorant: CovariantMolecularNeuralNetworks, November 2019. URLhttp://arxiv.org/abs/1906.04015. arXiv:1906.04015 [physics]

  2. [2]

    VN-Transformer: Rotation-Equivariant Attention for Vector Neurons, January 2023

    Serge Assaad, Carlton Downey, Rami Al-Rfou, Nigamaa Nayakanti, and Ben Sapp. VN-Transformer: Rotation-Equivariant Attention for Vector Neurons, January 2023. URLhttp://arxiv.org/abs/ 2206.04176. arXiv:2206.04176 [cs]

  3. [3]

    Ilyes Batatia, Dávid Péter Kovács, Gregor N. C. Simm, Christoph Ortner, and Gábor Csányi. MACE: HigherOrderEquivariantMessagePassingNeuralNetworksforFastandAccurateForceFields,January

  4. [4]

    arXiv:2206.07697 [stat]

    URLhttp://arxiv.org/abs/2206.07697. arXiv:2206.07697 [stat]

  5. [5]

    Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E

    Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials.Nature Communications, 13(1):2453, May 2022. ISSN 2041-1723. doi: 10.1038/s41467-022-29939-5. URLhttps://www.nat...

  6. [6]

    Bekkers, Maxime W

    Erik J. Bekkers, Maxime W. Lafarge, Mitko Veta, Koen AJ Eppenhof, Josien PW Pluim, and Remco Duits. Roto-Translation Covariant Convolutional Networks for Medical Image Analysis, April 2018. URLhttps://arxiv.org/abs/1804.03393v3. 13

  7. [7]

    GeometricDeepLearning: Grids, Groups, Graphs, Geodesics, and Gauges, May 2021

    MichaelM.Bronstein,JoanBruna,TacoCohen,andPetarVeličković. GeometricDeepLearning: Grids, Groups, Graphs, Geodesics, and Gauges, May 2021. URLhttp://arxiv.org/abs/2104.13478. arXiv:2104.13478 [cs]

  8. [8]

    Flopping for FLOPs: Leveraging equivariance for computationalefficiency,June2025

    Georg Bökman, David Nordström, and Fredrik Kahl. Flopping for FLOPs: Leveraging equivariance for computationalefficiency,June2025. URL http://arxiv.org/abs/2502.05169. arXiv:2502.05169 [cs]

  9. [9]

    SE(3)- Equivariant Attention Networks for Shape Reconstruction in Function Space, February 2023

    Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Edgar Dobriban, and Kostas Daniilidis. SE(3)- Equivariant Attention Networks for Shape Reconstruction in Function Space, February 2023. URL http://arxiv.org/abs/2204.02394. arXiv:2204.02394 [cs]

  10. [10]

    URLhttp://arxiv.org/abs/1811.02017

    TacoCohen,MarioGeiger,andMauriceWeiler.AGeneralTheoryofEquivariantCNNsonHomogeneous Spaces, January 2020. URLhttp://arxiv.org/abs/1811.02017. arXiv:1811.02017 [cs]

  11. [11]

    Cohen and Max Welling

    Taco S. Cohen and Max Welling. Group Equivariant Convolutional Networks, February 2016. URL https://arxiv.org/abs/1602.07576v3

  12. [12]

    Cohen and Max Welling

    Taco S. Cohen and Max Welling. Steerable CNNs, December 2016. URLhttp://arxiv.org/abs/ 1612.08498. arXiv:1612.08498 [cs]

  13. [13]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, June

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, June

  14. [14]

    arXiv:2010.11929 [cs]

    URLhttp://arxiv.org/abs/2010.11929. arXiv:2010.11929 [cs]

  15. [15]

    E(n) Equivariant Message Passing Simplicial Networks, October 2023

    Floor Eijkelboom, Rob Hesselink, and Erik Bekkers. E(n) Equivariant Message Passing Simplicial Networks, October 2023. URLhttp://arxiv.org/abs/2305.07100. arXiv:2305.07100 [cs]

  16. [16]

    Fuchs, Daniel E

    Fabian B. Fuchs, Daniel E. Worrall, Volker Fischer, and Max Welling. SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks, November 2020. URLhttp://arxiv.org/abs/ 2006.10503. arXiv:2006.10503 [cs]

  17. [17]

    Geometricdeeplearningandequivariantneuralnetworks.ArtificialIntelligence Review, 56(12):14605–14662, December 2023

    JanE.Gerken,JimmyAronsson,OscarCarlsson,HampusLinander,FredrikOhlsson,ChristofferPeters- son,andDanielPersson. Geometricdeeplearningandequivariantneuralnetworks.ArtificialIntelligence Review, 56(12):14605–14662, December 2023. ISSN 1573-7462. doi: 10.1007/s10462-023-10502-7. URLhttps://doi.org/10.1007/s10462-023-10502-7

  18. [18]

    Onthesymmetriesofdeeplearning models and their internal representations

    CharlesGodfrey, DavisBrown, TeganEmerson, andHenryKvinge. Onthesymmetriesofdeeplearning models and their internal representations. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, andA.Oh,editors,AdvancesinNeuralInformationProcessingSystems,volume35,pages11893–11905. Curran Associates, Inc., 2022. URLhttps://proceedings.neurips.cc/paper_files/pa...

  19. [19]

    Efficient Equivariant Network

    Lingshen He, Yuxuan Chen, zhengyang shen, Yiming Dong, Yisen Wang, and Zhouchen Lin. Efficient Equivariant Network. InAdvances in Neural Information Processing Systems, volume 34, pages 5290–5302. Curran Associates, Inc., 2021. URLhttps://proceedings.neurips.cc/paper/ 2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html

  20. [20]

    Emiel Hoogeboom, Jorn W. T. Peters, Taco S. Cohen, and Max Welling. HexaConv, March 2018. URL http://arxiv.org/abs/1803.02108. arXiv:1803.02108 [cs]. 14

  21. [21]

    PlatonicTransformers: A Solid Choice For Equivariance, October 2025

    MohammadMohaiminulIslam,RishabhAnand,DavidR.Wessels,FrisodeKruiff,ThijsP.Kuipers,Rex Ying,ClaraI.Sánchez,SharvareeVadgama,GeorgBökman,andErikJ.Bekkers. PlatonicTransformers: A Solid Choice For Equivariance, October 2025. URL http://arxiv.org/abs/2510.03511. arXiv:2510.03511 [cs]

  22. [22]

    Spatial Transformer Networks, February 2016

    Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. Spatial Transformer Networks, February 2016. URLhttp://arxiv.org/abs/1506.02025. arXiv:1506.02025 [cs]

  23. [23]

    Equivariance with Learned Canonicalization Functions, July 2023

    Sékou-Oumar Kaba, Arnab Kumar Mondal, Yan Zhang, Yoshua Bengio, and Siamak Ravanbakhsh. Equivariance with Learned Canonicalization Functions, July 2023. URLhttp://arxiv.org/abs/ 2211.06489. arXiv:2211.06489 [cs]

  24. [24]

    On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups, November 2018

    Risi Kondor and Shubhendu Trivedi. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups, November 2018. URLhttp://arxiv.org/abs/1802. 03690. arXiv:1802.03690 [stat]

  25. [25]

    Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network, November 2018

    Risi Kondor, Zhen Lin, and Shubhendu Trivedi. Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network, November 2018. URLhttp://arxiv.org/abs/1806. 09231. arXiv:1806.09231 [stat]

  26. [26]

    A Geometric Approach to Steerable Convolutions, October 2025

    Soumyabrata Kundu and Risi Kondor. A Geometric Approach to Steerable Convolutions, October 2025. URLhttp://arxiv.org/abs/2510.18813. arXiv:2510.18813 [cs]

  27. [27]

    Steerable Transformers for Volumetric Data, October 2025

    Soumyabrata Kundu and Risi Kondor. Steerable Transformers for Volumetric Data, October 2025. URL http://arxiv.org/abs/2405.15932. arXiv:2405.15932 [cs]

  28. [28]

    Invariant and Equivariant Graph Networks, April 2019

    Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and Equivariant Graph Networks, April 2019. URLhttp://arxiv.org/abs/1812.09902. arXiv:1812.09902 [cs]

  29. [29]

    Octic Vision Transformers: Quicker ViTs Through Equivariance, September 2025

    David Nordström, Johan Edstedt, Fredrik Kahl, and Georg Bökman. Octic Vision Transformers: Quicker ViTs Through Equivariance, September 2025. URLhttp://arxiv.org/abs/2505.15441. arXiv:2505.15441 [cs]

  30. [30]

    DINOv2: Learning Robust Visual Features without Supervision, February 2024

    Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Lab...

  31. [31]

    A characterization theorem for equivariant networks with point-wise activations

    Marco Pacini, Xiaowen Dong, Bruno Lepri, and Gabriele Santin. A characterization theorem for equivariant networks with point-wise activations. InThe Twelfth International Conference on Learning Representations, 2024. URLhttps://openreview.net/forum?id=79FVDdfoSR

  32. [32]

    On universality classes of equivariantnetworks

    Marco Pacini, Gabriele Santin, Bruno Lepri, and Shubhendu Trivedi. On universality classes of equivariantnetworks. InTheThirty-ninthAnnualConferenceonNeuralInformationProcessingSystems,

  33. [33]

    URLhttps://openreview.net/forum?id=V4YAS7NLXi

  34. [34]

    Universal Equivariant Multilayer Perceptrons

    Siamak Ravanbakhsh. Universal Equivariant Multilayer Perceptrons. InProceedings of the 37th International Conference on Machine Learning, pages 7996–8006. PMLR, November 2020. URL https://proceedings.mlr.press/v119/ravanbakhsh20a.html. 15

  35. [35]

    E(n) Equivariant Graph Neural Networks, February 2022

    Victor Garcia Satorras, Emiel Hoogeboom, and Max Welling. E(n) Equivariant Graph Neural Networks, February 2022. URLhttp://arxiv.org/abs/2102.09844. arXiv:2102.09844 [cs]

  36. [36]

    Graduate Texts in Mathematics

    Jean-Pierre Serre.Linear representations of finite groups. Graduate Texts in Mathematics. Springer, New York, NY, September 1977

  37. [37]

    Shawe-Taylor

    J. Shawe-Taylor. Building symmetries into feedforward networks. In1989 First IEE International Conference on Artificial Neural Networks, (Conf. Publ. No. 313), pages 158–162, October 1989. URL https://ieeexplore.ieee.org/document/51951

  38. [38]

    Equivariant Transformer Networks, May 2019

    Kai Sheng Tai, Peter Bailis, and Gregory Valiant. Equivariant Transformer Networks, May 2019. URL http://arxiv.org/abs/1901.11399. arXiv:1901.11399 [cs]

  39. [39]

    Gomez, Lukasz Kaiser, and Illia Polosukhin

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention Is All You Need, December 2017. URLhttp: //arxiv.org/abs/1706.03762. arXiv:1706.03762 [cs]

  40. [40]

    Veeling, Jasper Linmans, Jim Winkens, Taco Cohen, and Max Welling

    Bastiaan S. Veeling, Jasper Linmans, Jim Winkens, Taco Cohen, and Max Welling. Rotation Equivariant CNNs for Digital Pathology, June 2018. URLhttp://arxiv.org/abs/1806.03962. arXiv:1806.03962 [cs]

  41. [41]

    Vggt: Visual geometry grounded transformer

    Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. Vggt: Visual geometry grounded transformer. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5294–5306, June 2025

  42. [42]

    General $E(2)$-Equivariant Steerable CNNs, April 2021

    Maurice Weiler and Gabriele Cesa. General $E(2)$-Equivariant Steerable CNNs, April 2021. URL http://arxiv.org/abs/1911.08251. arXiv:1911.08251 [cs]

  43. [43]

    3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data, October 2018

    Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco Cohen. 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data, October 2018. URLhttp: //arxiv.org/abs/1807.02547. arXiv:1807.02547 [cs]

  44. [44]

    Representation theory and invariant neural networks

    Jeffrey Wood and John Shawe-Taylor. Representation theory and invariant neural networks. Discrete Applied Mathematics, 69(1):33–60, 1996. ISSN 0166-218X. doi: https://doi.org/10. 1016/0166-218X(95)00075-3. URL https://www.sciencedirect.com/science/article/pii/ 0166218X95000753

  45. [45]

    $E(2)$-EquivariantVisionTransformer,July2023

    RenjunXu,KaifanYang,KeLiu,andFengxiangHe. $E(2)$-EquivariantVisionTransformer,July2023. URLhttp://arxiv.org/abs/2306.06722. arXiv:2306.06722 [cs]

  46. [46]

    Hofgard, Aria Mansouri Tehrani, Rui Wang, Ameya Daigavane, Montgomery Bohde, Jerry Kurtin, Qian Huang, Tuong Phung, Minkai Xu, Chaitanya K

    Xuan Zhang, Limei Wang, Jacob Helwig, Youzhi Luo, Cong Fu, Yaochen Xie, Meng Liu, Yuchao Lin, Zhao Xu, Keqiang Yan, Keir Adams, Maurice Weiler, Xiner Li, Tianfan Fu, Yucheng Wang, Alex Strasser, Haiyang Yu, YuQing Xie, Xiang Fu, Shenglong Xu, Yi Liu, Yuanqi Du, Alexandra Saxton, Hongyi Ling, Hannah Lawrence, Hannes Stärk, Shurui Gui, Carl Edwards, Nichola...

  47. [47]

    PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval

    Weixun Zhou, Shawn Newsam, Congmin Li, and Zhenfeng Shao. PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval.ISPRS Journal of Photogrammetry and Remote Sensing, 145:197–209, November 2018. ISSN 09242716. doi: 10.1016/j.isprsjprs.2018.01.004. URLhttp://arxiv.org/abs/1706.03424. arXiv:1706.03424 [cs]. A Discrete sub...