arxiv: 2604.08034 · v1 · submitted 2026-04-09 · 💻 cs.CV


Rotation Equivariant Convolutions in Deformable Registration of Brain MRI

Arghavan Rezvani, Kun Han, Anthony T. Wu, Pooya Khosravi, Xiaohui Xie


Pith reviewed 2026-05-10 18:25 UTC · model grok-4.3

classification 💻 cs.CV

keywords rotation equivariance, deformable registration, brain MRI, convolutional networks, inductive bias, medical image registration, sample efficiency


The pith

Rotation-equivariant convolutions integrated into brain MRI registration networks improve accuracy while using fewer parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests the impact of replacing standard convolutional encoders with rotation-equivariant versions inside deformable registration networks for brain MRI. Conventional networks do not automatically produce rotated outputs for rotated inputs and must learn rotational symmetry from data alone. By swapping in equivariant encoders across three baseline architectures and evaluating on multiple public datasets, the authors report gains in alignment accuracy, robustness when input scans have different orientations, and performance even with reduced training data. These outcomes follow from the equivariant layers directly encoding the rotational symmetry of brain anatomy as an inductive bias rather than requiring the model to discover it.

Core claim

Replacing standard convolutional encoders with rotation-equivariant encoders in deformable registration networks for brain MRI produces higher registration accuracy with fewer parameters, stronger performance on input pairs that differ in orientation, and maintained accuracy when trained on smaller datasets, as shown across several baseline models and public brain MRI collections.

What carries the argument

Rotation-equivariant convolutions that rotate feature maps in exact correspondence with the input rotation, thereby embedding anatomical rotational symmetry as an inductive bias inside the registration encoder.
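To make the equivariance property concrete, here is a minimal numerical sketch. It is not the paper's SE(3) steerable encoder: it assumes planar 90-degree rotations (the group C4), a single random 3x3 kernel, and orientation max-pooling, all chosen only for illustration. The helper names `correlate2d_valid` and `c4_lifted_features` are hypothetical.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' cross-correlation, written out to stay dependency-free."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def c4_lifted_features(image, kernel):
    """Correlate with all four 90-degree rotations of the kernel,
    then max-pool over the orientation axis."""
    responses = [correlate2d_valid(image, np.rot90(kernel, k)) for k in range(4)]
    return np.max(responses, axis=0)

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
kernel = rng.normal(size=(3, 3))

# Equivariance gap: features of the rotated input vs. rotated features of the input.
plain_gap = np.abs(
    correlate2d_valid(np.rot90(image), kernel)
    - np.rot90(correlate2d_valid(image, kernel))
).max()
equiv_gap = np.abs(
    c4_lifted_features(np.rot90(image), kernel)
    - np.rot90(c4_lifted_features(image, kernel))
).max()

print("standard correlation gap:", plain_gap)
print("C4-lifted correlation gap:", equiv_gap)
```

In this toy setting the standard correlation shows a clearly non-zero gap, while the C4-lifted version agrees up to floating-point rounding. A standard network would have to learn that behaviour from data; the lifted layer has it by construction.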

If this is right

Registration accuracy rises while the number of network parameters falls.
Models outperform standard baselines specifically when input image pairs have relative rotations.
Strong results are obtained even with smaller training sets, showing greater sample efficiency.
The encoder swap works across multiple existing registration architectures and public datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Equivariant designs could be tested in related tasks such as segmentation where scan orientation also varies.
Clinical pipelines might reduce preprocessing steps that standardize scan orientation before registration.
The parameter reduction could support faster inference on devices with limited compute.

Load-bearing premise

The measured gains arise specifically from the rotation-equivariant inductive bias and not from incidental differences in network implementation, hyperparameter tuning, or dataset properties.

What would settle it

Re-train the equivariant and baseline models with identical parameter counts, the same hyperparameters, identical random seeds, and the same data splits; if the accuracy advantage of the equivariant models disappears, the central claim would be falsified.

Original abstract

Image registration is a fundamental task that aligns anatomical structures between images. While CNNs perform well, they lack rotation equivariance - a rotated input does not produce a correspondingly rotated output. This hinders performance by failing to exploit the rotational symmetries inherent in anatomical structures, particularly in brain MRI. In this work, we integrate rotation-equivariant convolutions into deformable brain MRI registration networks. We evaluate this approach by replacing standard encoders with equivariant ones in three baseline architectures, testing on multiple public brain MRI datasets. Our experiments demonstrate that equivariant encoders have three key advantages: 1) They achieve higher registration accuracy while reducing network parameters, confirming the benefit of this anatomical inductive bias. 2) They outperform baselines on rotated input pairs, demonstrating robustness to orientation variations common in clinical practice. 3) They show improved performance with less training data, indicating greater sample efficiency. Our results demonstrate that incorporating geometric priors is a critical step toward building more robust, accurate, and efficient registration models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Equivariant encoders in registration networks are a reasonable idea but the abstract supplies zero numbers, so the claimed gains cannot be assessed.


They swap rotation-equivariant convolutions into the encoders of three deformable registration baselines for brain MRI and state that this produces higher accuracy with fewer parameters, better results on rotated image pairs, and stronger performance when training data is scarce. The approach is a direct application of an existing geometric prior to a task where orientation variations are common in practice. Testing the change across multiple public datasets and architectures is a straightforward way to check whether the inductive bias helps. The three reported advantages match what equivariance should deliver if the symmetries in brain anatomy are being underused by standard CNNs. That part of the work is clear and on target.

The soft spot is exactly the one flagged in the stress test. The abstract contains no quantitative metrics, no tables, no error bars, and no description of how the baselines were matched for capacity, normalization, or training schedule. Without those controls it is impossible to know whether the improvements come from equivariance or from incidental differences in the implementation. The central attribution therefore rests on unverified ground.

This paper is for researchers already working on deformable registration pipelines who want to try adding geometric structure. A reader in that niche could extract the basic recipe and test it themselves. It is not broad enough or conclusive enough to interest a wider audience yet. The work shows clear thinking about the problem and honest engagement with the literature, so it deserves a serious referee to examine the actual experiments and controls rather than a desk rejection.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes integrating rotation-equivariant convolutions into deformable brain MRI registration networks. It replaces standard encoders with equivariant versions in three baseline architectures and evaluates on public datasets, claiming three advantages: higher registration accuracy with fewer parameters, improved robustness on rotated input pairs, and better sample efficiency with limited training data.

Significance. If the performance differences can be isolated to the rotation-equivariant inductive bias, the work would provide concrete evidence that geometric priors improve accuracy, robustness, and efficiency in medical image registration. The practical strategy of modifying existing baselines across multiple architectures is a strength that facilitates direct comparison.

major comments (2)

[Experiments section] The manuscript does not explicitly state that the standard baselines were re-implemented with identical parameter counts, normalization layers, upsampling paths, optimizer schedules, and data augmentation as the equivariant variants. Since replacing the encoder necessarily changes filter parameterization and group representations, the absence of these controls leaves open the possibility that observed gains arise from unaccounted implementation differences rather than equivariance. This directly undermines attribution for all three claimed advantages.
[Results and tables] The abstract asserts that experiments 'demonstrate' the three advantages, yet the provided description supplies no quantitative metrics (e.g., Dice scores, TRE), error bars, or statistical tests. If the full results tables similarly lack these or do not report per-baseline comparisons with matched capacity, the empirical support for the central claims remains uninspectable and insufficient to establish the advantages.
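For readers unfamiliar with the Dice overlap metric named in the comment above, the following is a minimal sketch of a per-label Dice computation between a fixed segmentation and a warped moving segmentation. The function name `dice_per_label`, the toy label volumes, and the choice to skip labels absent from both volumes are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def dice_per_label(fixed_seg, warped_seg, labels):
    """Dice = 2|A ∩ B| / (|A| + |B|) for each anatomical label."""
    scores = {}
    for label in labels:
        a = fixed_seg == label
        b = warped_seg == label
        denom = a.sum() + b.sum()
        if denom == 0:
            continue  # label absent from both volumes; skipped by assumption
        scores[label] = 2.0 * np.logical_and(a, b).sum() / denom
    return scores

# Toy 4x4x4 label volumes (hypothetical data, not from the paper).
fixed_seg = np.zeros((4, 4, 4), dtype=int)
fixed_seg[:2] = 1
fixed_seg[2:] = 2
warped_seg = fixed_seg.copy()
warped_seg[1] = 2  # one slab mislabelled after warping

scores = dice_per_label(fixed_seg, warped_seg, labels=[1, 2])
mean_dice = float(np.mean(list(scores.values())))
print(scores, mean_dice)
```

Reporting the mean and standard deviation of such scores over all test pairs, per baseline and per dataset, is the kind of evidence the comment asks for.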

minor comments (2)

[Abstract] The abstract would be strengthened by including one or two key quantitative results (e.g., average Dice improvement and parameter reduction) to make the claimed advantages concrete.
[Method] Notation for the equivariant convolution layers should be introduced with a brief equation or a reference to the specific group representations used (e.g., which SO(3) irreducible representations appear at each layer and in what proportions) to clarify the implementation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of experimental rigor and results presentation that we address point by point below. We commit to revisions that clarify controls and strengthen the visibility of quantitative evidence while preserving the core findings.


Referee: [Experiments section] The manuscript does not explicitly state that the standard baselines were re-implemented with identical parameter counts, normalization layers, upsampling paths, optimizer schedules, and data augmentation as the equivariant variants. Since replacing the encoder necessarily changes filter parameterization and group representations, the absence of these controls leaves open the possibility that observed gains arise from unaccounted implementation differences rather than equivariance. This directly undermines attribution for all three claimed advantages.

Authors: We agree that an explicit statement of implementation controls is necessary to firmly attribute gains to the rotation-equivariant bias. Although the baselines were re-implemented with matched parameter budgets, identical normalization layers, upsampling paths, optimizer schedules, and data augmentation, the manuscript did not detail these equivalences sufficiently. In the revised version we will insert a dedicated paragraph in the Experiments section confirming these controls and noting that architectural differences are isolated to the encoder's convolution parameterization and group representations. This revision will eliminate ambiguity regarding fair comparison.
Revision: yes

Referee: [Results and tables] The abstract asserts that experiments 'demonstrate' the three advantages, yet the provided description supplies no quantitative metrics (e.g., Dice scores, TRE), error bars, or statistical tests. If the full results tables similarly lack these or do not report per-baseline comparisons with matched capacity, the empirical support for the central claims remains uninspectable and insufficient to establish the advantages.

Authors: The full manuscript contains results tables that report Dice scores, TRE values, standard deviations as error bars, and direct per-baseline comparisons under matched capacity. The narrative, however, does not sufficiently highlight these numbers or include statistical tests. We will therefore revise the abstract to reference key quantitative improvements and expand the Results section to explicitly discuss the metrics, error bars, and statistical significance. These changes make the supporting evidence more prominent without altering the underlying data.
Revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with no derivation chain


The paper describes an empirical integration of rotation-equivariant convolutions into deformable registration networks by replacing encoders in three baseline architectures, followed by evaluation on public brain MRI datasets. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text or abstract. Claims of higher accuracy, robustness to rotations, and sample efficiency rest on experimental comparisons rather than any quantity that reduces to its own inputs by construction. This is a standard non-circular empirical ML paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that rotation equivariance supplies a beneficial inductive bias for brain anatomy; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Rotation equivariance is a useful inductive bias for anatomical structures in brain MRI.
Invoked to explain why equivariant encoders should outperform standard ones.

pith-pipeline@v0.9.0 · 5485 in / 1076 out tokens · 40211 ms · 2026-05-10T18:25:03.998643+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

We replace the standard convolutional encoders in three representative architectures—VoxelMorph, Dual-PRNet++, and RDP Net—with SE(3)-equivariant encoders... steerable 3D convolution kernels are constructed from a basis that separates radial and angular dependencies: κ(x) = Σ_l Φ_l(∥x∥) Q_l(x/∥x∥)
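The kernel formula quoted above separates a radial profile from an angular function. The sketch below illustrates that separation on a small voxel grid under strong simplifying assumptions: scalar-valued kernels only, a Gaussian radial shell standing in for Φ_l, the constant function as the l = 0 angular part, and the components of x/∥x∥ as the l = 1 angular part. The names `radial_shell` and `steerable_basis` are hypothetical, and this is not the library implementation the authors use.

```python
import numpy as np

def radial_shell(r, center, width=0.6):
    """Gaussian radial profile Phi(r); the Gaussian form is an illustrative assumption."""
    return np.exp(-0.5 * ((r - center) / width) ** 2)

def steerable_basis(size=5):
    """Return the l=0 and l=1 basis kernels sampled on a size^3 voxel grid."""
    half = size // 2
    coords = np.arange(-half, half + 1, dtype=float)
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    r = np.sqrt(x**2 + y**2 + z**2)
    safe_r = np.where(r == 0, 1.0, r)        # avoid 0/0 at the origin
    unit = np.stack([x, y, z]) / safe_r      # angular part for l=1: x / ||x||
    phi = radial_shell(r, center=1.5)
    basis_l0 = phi                           # l=0: constant angular part
    basis_l1 = phi * unit                    # shape (3, size, size, size)
    return basis_l0, basis_l1

basis_l0, basis_l1 = steerable_basis()

# Rotate the sampling grid by 90 degrees in the x-y plane.
rotated_l0 = np.rot90(basis_l0, k=1, axes=(0, 1))
rotated_l1 = np.rot90(basis_l1, k=1, axes=(1, 2))

# Fixed 3x3 matrix that mixes the l=1 components in the same way
# (worked out for np.rot90's index convention).
mix = np.array([[0.0, 1.0, 0.0],
                [-1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]])
mixed_l1 = np.einsum("ij,jxyz->ixyz", mix, basis_l1)

l0_invariant = np.allclose(rotated_l0, basis_l0)
l1_steerable = np.allclose(rotated_l1, mixed_l1)
print(l0_invariant, l1_steerable)
```

The point of the check is that rotating the l = 1 kernels in space gives the same result as recombining them with a fixed matrix, which is the property that lets a network built from such kernels commute with rotations. A full steerable layer would additionally learn weights on each radial and angular basis element.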

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 2 canonical work pages · 1 internal anchor

[1]

Traditional optimization-based methods achieve accurate alignment but are computationally expensive

INTRODUCTION Deformable image registration, which aligns anatomical structures between image pairs, is a fundamental task in medical image analysis. Traditional optimization-based methods achieve accurate alignment but are computationally expensive. Learning-based methods, popularized by VoxelMorph [1], revolutionized the field by using CNNs to predi...
[2]

METHOD 2.1. Problem Formulation: Given a moving and fixed image $I_m, I_f \in \mathbb{R}^{H \times W \times D}$, registration aims to find a dense deformation field $\phi: \Omega \to \Omega$ (where $\Omega \subset \mathbb{R}^3$) to align the anatomical structures of the warped moving image $I_m \circ \phi$ with the fixed image $I_f$. This process minimizes a loss function of the form: $\mathcal{L}(\phi) = \mathcal{L}_{sim}(I_f, I_m \circ \phi) + \lambda \mathcal{L}_{reg}(\phi)$ (1), where $\mathcal{L}_{sim}$ measures the...
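The loss in Eq. (1) combines a similarity term and a regularization term weighted by λ. The excerpt is cut off before it names the specific terms, so the sketch below assumes two common choices purely for illustration: mean squared error for the similarity term and a squared forward-difference penalty on the displacement field for the regularizer. The function names are hypothetical, and the already warped image is taken as given rather than computed.

```python
import numpy as np

def mse_similarity(fixed, warped):
    """Assumed similarity term: mean squared intensity difference."""
    return float(np.mean((fixed - warped) ** 2))

def gradient_smoothness(displacement):
    """Assumed regularizer: mean squared forward difference of a
    displacement field of shape (3, H, W, D), averaged over spatial axes."""
    total = 0.0
    for axis in (1, 2, 3):
        diff = np.diff(displacement, axis=axis)
        total += float(np.mean(diff ** 2))
    return total / 3.0

def registration_loss(fixed, warped, displacement, lam=1.0):
    """L = L_sim(fixed, warped) + lam * L_reg(displacement)."""
    return mse_similarity(fixed, warped) + lam * gradient_smoothness(displacement)

# Toy inputs (hypothetical): constant images and a linear ramp displacement.
fixed = np.ones((4, 4, 4))
warped = np.zeros((4, 4, 4))
displacement = np.zeros((3, 4, 4, 4))
displacement[0] = np.arange(4.0)[:, None, None]  # x-component grows along the first axis

loss = registration_loss(fixed, warped, displacement, lam=0.5)
print(loss)
```

Larger values of `lam` favour smoother deformation fields at the cost of a looser intensity match; the value used in the paper is not stated in the excerpt.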

[3]

Datasets: We conducted our experiments on three main 3D brain MRI datasets: OASIS [10] with 35 anatomical labels

EXPERIMENTAL SETUP 3.1. Datasets: We conducted our experiments on three main 3D brain MRI datasets: OASIS [10] with 35 anatomical labels. LPBA40 [11] with 54 anatomical labels. MindBoggle [12] with 97 anatomical labels, with both cortical (62 labels) and subcortical (35 labels) regions. 3.2. Implementation Details: We implemented models in PyTorch using the escn...
[4]

Parameter efficiency: Fig

RESULTS AND DISCUSSION 4.1. Parameter efficiency: Fig. 1 compares parameter counts between baseline and equivariant models. Equivariant versions contain 78% (VM), 96% (Dual-PR++), and 81% (RDP) of baseline parameters, with reductions solely from encoder modifications. This parameter efficiency contributes to better generalizability and sample efficiency (w...
[5]

For the mixed configurations (7:3, 4:4, 5:2:1, 2:2:2), we fixed the first layer to a 2:2 split of irrep-0 and irrep-1 following the design principle in 2

in the encoder design of the Dual-PR++ model to examine the effect of this hyperparameter on model performance on LPBA dataset. For the mixed configurations (7:3, 4:4, 5:2:1, 2:2:2), we fixed the first layer to a 2:2 split of irrep-0 and irrep-1 following the design principle in 2. Target ratios were applied at deeper layers (16+ channels) to maintain sim...
[6]

We evaluated this approach by replacing standard encoders in three baseline architectures (VoxelMorph, Dual-PRNet++, and RDP Net) across multiple brain MRI datasets

CONCLUSION In this work, we demonstrated that integrating rotation-equivariant SE(3) encoders into deformable brain MRI registration networks improves performance and parameter efficiency. We evaluated this approach by replacing standard encoders in three baseline architectures (VoxelMorph, Dual-PRNet++, and RDP Net) across multiple brain MRI datas...
[7]

Ethical approval was not required as confirmed by the licenses attached with these open access datasets

COMPLIANCE WITH ETHICAL STANDARDS This research study was conducted retrospectively using human subject data made available in open access by Oasis [10], LPBA40 [11] and MindBoggle [12]. Ethical approval was not required as confirmed by the licenses attached with these open access datasets
[8]

ACKNOWLEDGMENTS The authors report no conflicts of interest
[9]

VoxelMorph: a learning framework for deformable medical image registration

Guha Balakrishnan, Amy Zhao, Mert R Sabuncu, John Guttag, and Adrian V Dalca, “VoxelMorph: a learning framework for deformable medical image registration,” IEEE Transactions on Medical Imaging, vol. 38, no. 8, pp. 1788–1800, 2019

2019
[10]

Dual-stream pyramid registration network

Miao Kang, Xiaojun Hu, Weilin Huang, Matthew R Scott, and Mauricio Reyes, “Dual-stream pyramid registration network,” Medical Image Analysis, vol. 78, pp. 102379, 2022

2022
[11]

Recursive deformable pyramid network for unsupervised medical image registration

Haiqiao Wang, Dong Ni, and Yi Wang, “Recursive deformable pyramid network for unsupervised medical image registration,” IEEE Transactions on Medical Imaging, vol. 43, no. 6, pp. 2229–2240, 2024

2024
[12]

Group equivariant convolutional networks

Taco Cohen and Max Welling, “Group equivariant convolutional networks,” in International Conference on Machine Learning. PMLR, 2016, pp. 2990–2999

2016
[13]

Steerable CNNs

Taco S Cohen and Max Welling, “Steerable CNNs,” arXiv preprint arXiv:1612.08498, 2016

[14]

Leveraging SO(3)-steerable convolutions for pose-robust semantic segmentation in 3D medical data

Ivan Diaz, Mario Geiger, and Richard Iain McKinley, “Leveraging SO(3)-steerable convolutions for pose-robust semantic segmentation in 3D medical data,” The Journal of Machine Learning for Biomedical Imaging, vol. 2, no. May 2024, pp. 834, 2024

2024
[15]

SE(3)-equivariant and noise-invariant 3D rigid motion tracking in brain MRI

Benjamin Billot, Neel Dey, Daniel Moyer, Malte Hoffmann, Esra Abaci Turk, Borjan Gagoski, P Ellen Grant, and Polina Golland, “SE(3)-equivariant and noise-invariant 3D rigid motion tracking in brain MRI,” IEEE Transactions on Medical Imaging, vol. 43, no. 11, pp. 4029–4040, 2024

2024
[16]

Rotir: Rotation-equivariant network and transformers for zebrafish scale image registration

Ruixiong Wang, Alin Achim, Renata Raele-Rolfe, Qiao Tong, Dylan Bergen, Chrissy Hammond, and Stephen Cross, “Rotir: Rotation-equivariant network and transformers for zebrafish scale image registration,” in Annual Conference on Medical Image Understanding and Analysis. Springer, 2024, pp. 285–299

2024
[17]

3D steerable CNNs: Learning rotationally equivariant features in volumetric data

Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco S Cohen, “3D steerable CNNs: Learning rotationally equivariant features in volumetric data,” Advances in Neural Information Processing Systems, vol. 31, 2018

2018
[18]

Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults

Daniel S Marcus, Tracy H Wang, Jamie Parker, John G Csernansky, John C Morris, and Randy L Buckner, “Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults,” Journal of Cognitive Neuroscience, vol. 19, no. 9, pp. 1498–1507, 2007

2007
[19]

Construction of a 3D probabilistic atlas of human cortical structures

David W Shattuck, Mubeena Mirza, Vitria Adisetiyo, Cornelius Hojatkashani, Georges Salamon, Katherine L Narr, Russell A Poldrack, Robert M Bilder, and Arthur W Toga, “Construction of a 3D probabilistic atlas of human cortical structures,” NeuroImage, vol. 39, no. 3, pp. 1064–1080, 2008

2008
[20]

Mindboggling morphometry of human brains

Arno Klein, Satrajit S Ghosh, Forrest S Bao, Joachim Giard, Yrjö Häme, Eliezer Stavsky, Noah Lee, Brian Rossa, Martin Reuter, Elias Chaibub Neto, et al., “Mindboggling morphometry of human brains,” PLoS Computational Biology, vol. 13, no. 2, pp. e1005350, 2017

2017
[21]

A program to build E(N)-equivariant steerable CNNs

Gabriele Cesa, Leon Lang, and Maurice Weiler, “A program to build E(N)-equivariant steerable CNNs,” in International Conference on Learning Representations, 2022

2022