arxiv: 2605.10571 · v1 · submitted 2026-05-11 · 📡 eess.IV · cs.CV

Set-Based Groupwise Registration for Variable-Length, Variable-Contrast Cardiac MRI

Maša Božić-Iven, Qian Tao, Sebastian Weingärtner, Tijmen Toxopeus, Yidong Zhao, Yi Zhang

Pith reviewed 2026-05-12 02:53 UTC · model grok-4.3

keywords: groupwise registration, cardiac MRI, motion correction, set-based learning, permutation invariance, quantitative MRI, zero-shot generalization, T1 mapping

The pith

A single neural network registers cardiac MRI sequences of any length and contrast by treating them as unordered sets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a registration method that processes cardiac MRI image sequences as unordered sets rather than fixed-length stacks. This change removes the dependence on specific sequence lengths, image ordering, and contrast patterns that limits existing deep learning approaches. Trained only on one public T1 mapping dataset of 11-frame sequences, the model applies directly to other sequences with up to 60 frames and different contrast behavior. If the approach holds, motion correction for quantitative cardiac MRI would no longer require separate models or retraining for each imaging protocol.

Core claim

The central claim is that a set-based groupwise registration framework processes a quantitative MRI sequence as an unordered set, using a shared encoder and correlation-guided feature aggregation to construct a permutation-invariant canonical reference while learning a permutation-equivariant mapping from images to deformation fields. This formulation, combined with contrast-insensitive features, enables zero-shot generalization from a single training dataset to unseen protocols with variable lengths and contrasts, consistently improving downstream quantitative mapping quality. The framework also extends directly to Cine MRI for inter-phase registration.
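The two symmetry properties in this claim can be checked numerically on a toy pipeline. The sketch below is a minimal illustration, not the paper's network: `encode`, `canonical_reference`, and `register` are hypothetical stand-ins (a normalising "encoder", a softmax over mean pairwise correlations as the aggregation, and an image-gradient placeholder instead of a learned deformation head). It only shows that a shared per-frame encoder plus an order-independent aggregation gives an invariant reference, an equivariant per-frame output, and no dependence on the number of frames.

```python
import numpy as np

def encode(image):
    """Shared encoder stand-in (hypothetical): zero-mean, unit-norm feature vector."""
    f = image.astype(float).ravel()
    f = f - f.mean()
    return f / (np.linalg.norm(f) + 1e-8)

def canonical_reference(images):
    """Aggregation stand-in (hypothetical): weight frames by mean correlation with the set."""
    feats = np.stack([encode(im) for im in images])   # (L, D), same encoder for every frame
    corr = feats @ feats.T                            # (L, L) Pearson correlations
    score = corr.mean(axis=1)                         # agreement of each frame with the set
    weights = np.exp(score) / np.exp(score).sum()     # softmax over frames
    stack = np.stack(images).astype(float)            # (L, H, W)
    return np.tensordot(weights, stack, axes=1)       # weighted mean image, (H, W)

def register(images):
    """Per-frame output from (frame, shared reference); a placeholder, not a real deformation."""
    ref = canonical_reference(images)
    return [np.stack(np.gradient(ref - im.astype(float))) for im in images]

rng = np.random.default_rng(0)
frames = [rng.random((16, 16)) for _ in range(11)]
perm = rng.permutation(len(frames))
shuffled = [frames[i] for i in perm]

fields, fields_shuffled = register(frames), register(shuffled)
# Invariance: the reference does not change when the frames are reordered.
print(np.allclose(canonical_reference(frames), canonical_reference(shuffled)))
# Equivariance: output k of the shuffled set equals output perm[k] of the original set.
print(all(np.allclose(fields_shuffled[k], fields[perm[k]]) for k in range(len(frames))))
# Variable length: the same functions accept a 60-frame set unchanged.
print(len(register([rng.random((16, 16)) for _ in range(60)])))
```

Nothing in the functions refers to a frame index or a fixed `L`, which is the design property the set formulation relies on; whether the learned version stays stable for long or unusual sequences is an empirical question that this toy cannot answer.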

What carries the argument

AnyTwoReg, a set-based groupwise registration framework that processes quantitative MRI sequences as unordered sets via a shared encoder and correlation-guided feature aggregation to build a permutation-invariant canonical reference.

If this is right

The same network architecture handles sequences of lengths from 11 to 60 frames without modification.
Registration quality holds across different contrast dynamics, leading to improved quantitative parameter maps on unseen datasets.
The model applies directly to Cine MRI sequences for inter-cardiac-phase registration without retraining.
No protocol-specific ordering information or fine-tuning is required at inference time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

A unified registration model could cover multiple cardiac imaging protocols and reduce the need for separate training datasets for each one.
The set formulation may extend to other medical imaging tasks involving variable-length time series or sequences.
Pairing the approach with larger foundation models for feature extraction could further improve robustness to contrast changes.

Load-bearing premise

That correlation-guided feature aggregation from a shared encoder can reliably construct a permutation-invariant canonical reference for registration, even under extreme contrast variations and without any protocol-specific fine-tuning or ordering information.

What would settle it

If registration accuracy and quantitative mapping quality on a third unseen MRI protocol with extreme contrast changes failed to exceed those of standard pairwise methods, the zero-shot generalization claim would be falsified.

Figures

Figures reproduced from arXiv: 2605.10571 by Maša Božić-Iven, Qian Tao, Sebastian Weingärtner, Tijmen Toxopeus, Yidong Zhao, Yi Zhang.

**Figure 1.** Cardiac MRI sequences exhibit high heterogeneity across imaging protocols with different mechanisms ($M_i$) and sequence lengths ($L$). Analytical models are expressed by external parameters (red; e.g., inversion times $\mathrm{TI}_i$, preparation $\mathrm{Prep}_i$, or phases $t_i$) and internal properties (dark blue; e.g., $T_1$, $T_1^*$, MBF, $M_0$, $A$, or $B$).

**Figure 2.** Comparison of two learning-based groupwise registration designs: (a) The conventional channel stacking design concatenates frames; it is order-sensitive ($f_\theta(\pi \cdot I_{\mathrm{seq}}) \neq \pi \cdot f_\theta(I_{\mathrm{seq}})$) and tied to a fixed length $L$. (b) Set-based design (Any2Reg) uses shared encoders and a canonical reference $T = \Gamma(I)$ broadcasted to all frames; it enables permutation-equivariant registration ($R(\pi \cdot I) = \pi \cdot R(I)$) for an arb…

**Figure 3.** $T_1$ mapping quality on STONE (LV+Myo). (a) Slice-wise mean $R^2$. Slice position: base (0) to apex (4). Any2Reg IO demonstrates the best alignment at all slice positions. (b) $R^2$ survival curves show Any2Reg has consistently higher fitting quality as a result of superior alignment.
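The $R^2$ in this figure is the goodness of fit of the signal model at each pixel, so it drops when motion mixes tissues along the sequence. The sketch below illustrates that link on one synthetic pixel. Everything in it is an assumption for illustration rather than the paper's fitting pipeline: the three-parameter recovery model $S(\mathrm{TI}) = A - B\,e^{-\mathrm{TI}/T_1^*}$, the inversion times, the tissue values, and the grid-search fit in the hypothetical helper `fit_r2`.

```python
import numpy as np

def fit_r2(ti_ms, signal, t1_star_grid=None):
    """Fit S(TI) = A - B * exp(-TI / T1*) by grid search over T1*; return (A, B, T1*, R^2)."""
    if t1_star_grid is None:
        t1_star_grid = np.arange(100.0, 3000.0, 5.0)   # candidate T1* values in ms (assumed range)
    ti_ms = np.asarray(ti_ms, dtype=float)
    signal = np.asarray(signal, dtype=float)
    ss_tot = np.sum((signal - signal.mean()) ** 2)
    best = None
    for t1_star in t1_star_grid:
        # For a fixed T1*, the model is linear in (A, B): solve by least squares.
        design = np.column_stack([np.ones_like(ti_ms), -np.exp(-ti_ms / t1_star)])
        coef, *_ = np.linalg.lstsq(design, signal, rcond=None)
        ss_res = np.sum((signal - design @ coef) ** 2)
        if best is None or ss_res < best[0]:
            best = (ss_res, coef[0], coef[1], t1_star)
    ss_res, a, b, t1_star = best
    return a, b, t1_star, 1.0 - ss_res / ss_tot

# Synthetic 11-frame pixel (all values hypothetical).
ti = np.linspace(100.0, 3000.0, 11)               # inversion times in ms
myocardium = 1.0 - 2.0 * np.exp(-ti / 1000.0)     # aligned pixel: one tissue in every frame
blood = 1.3 - 2.6 * np.exp(-ti / 1800.0)          # a neighbouring tissue with different T1*

moved = myocardium.copy()
moved[5] = blood[5]                               # one frame samples the wrong tissue (motion)

r2_aligned = fit_r2(ti, myocardium)[3]
r2_moved = fit_r2(ti, moved)[3]
print(f"R^2 aligned: {r2_aligned:.4f}, R^2 with one misaligned frame: {r2_moved:.4f}")
```

The signed signal is used here for simplicity; magnitude images would additionally need polarity handling before fitting.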

**Figure 4.** Qualitative result on a representative STONE sequence with $T_1$ map, fitting uncertainty $\mathrm{SD}_{T_1}$ (↓), and pixelwise fitting $R^2$ (↑). White arrows indicate the location of large motion. Both Any2Reg and Any2Reg IO yield clear $T_1$ maps with high precision.

Scalability analysis. We evaluated Any2Reg’s scalability up to $L = 512$. Any2Reg inference scales linearly (∼2.4 ms per additional image), taking ∼0.2 s at L = 6…
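The quoted slope allows a quick back-of-the-envelope estimate of how much inference time a longer sequence adds. The helper below is a hypothetical illustration that assumes the cost is exactly linear at about 2.4 ms per additional image; the fixed overhead is not given in the excerpt, so only differences between sequence lengths are computed.

```python
MS_PER_EXTRA_IMAGE = 2.4  # approximate slope quoted in the scalability note

def extra_inference_ms(l_from, l_to, slope_ms=MS_PER_EXTRA_IMAGE):
    """Added inference time (ms) when a set grows from l_from to l_to frames, linear model."""
    return (l_to - l_from) * slope_ms

# Going from the training length (11 frames) to the longest test length (60 frames).
print(round(extra_inference_ms(11, 60), 1))
```

Under that assumption, moving from 11 to 60 frames adds 49 images, or roughly 118 ms.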

Original abstract

Quantitative cardiac magnetic resonance imaging (MRI) enables non-invasive myocardial tissue characterization but relies on robust motion correction within these variable-length, variable-contrast image sequences. Groupwise registration, which simultaneously aligns all images, has shown greater robustness than pairwise registration for motion correction. However, current deep-learning-based groupwise registration methods cannot generalize across MRI sequences: the architecture typically encodes input data as a fixed-length channel stack, which rigidly couples network design to protocol-specific sequence length, input ordering, and contrast dynamics. At inference time, any change in imaging protocols will render the network unusable. In this work, we introduce Any2Reg, a new set-based groupwise registration framework that takes a quantitative MRI sequence as an unordered set. This set formulation fundamentally decouples network design from sequence length and input ordering. By utilizing a shared encoder and correlation-guided feature aggregation, Any2Reg constructs a permutation-invariant canonical reference for registration, and learns a permutation-equivariant mapping from images to deformation fields. Additionally, we extract contrast-insensitive image features from an existing foundation model to handle extreme contrast variations. Trained exclusively on a single public $T_1$ mapping dataset (STONE, sequence length $L=11$), Any2Reg generalizes to two unseen quantitative MRI datasets (MOLLI, ASL) with variable lengths ($L \in [11, 60]$) and different contrast dynamics. It achieves strong cross-protocol generalization in a zero-shot manner, and consistently improves downstream quantitative mapping quality. Notably, while designed for quantitative MRI sequences, our framework is directly applicable to Cine MRI sequences for inter-cardiac-phase registration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

AnyTwoReg moves groupwise registration to an unordered-set input so one network can handle different lengths and contrasts in cardiac MRI without retraining, but the zero-shot results rest on thin public evidence.

The main point is that this work replaces the usual fixed-length channel-stack input with a set formulation. A shared encoder plus correlation-guided aggregation builds a permutation-invariant reference, and a separate head produces equivariant deformations. They also pull contrast-insensitive features from a foundation model. Trained only on the public STONE T1-mapping set (11 frames), the model is said to run zero-shot on MOLLI and ASL sequences whose lengths range from 11 to 60 frames and whose contrast curves differ, with reported gains in downstream quantitative maps. The architecture directly targets the practical problem that changing protocols breaks existing deep groupwise methods.

Referee Report

2 major / 2 minor

Summary. The paper introduces AnyTwoReg, a set-based groupwise registration framework for quantitative cardiac MRI sequences of variable length and contrast. Unlike prior deep-learning methods that encode inputs as fixed-length channel stacks, AnyTwoReg treats the sequence as an unordered set, using a shared encoder, correlation-guided feature aggregation to produce a permutation-invariant canonical reference, and a permutation-equivariant deformation head. Contrast-insensitive features are extracted from a foundation model. The network is trained only on the public STONE T1-mapping dataset (L=11) and is claimed to generalize zero-shot to unseen MOLLI and ASL datasets with L in [11,60] and different contrast dynamics, improving downstream quantitative mapping; the framework is also noted as applicable to Cine MRI.

Significance. If the zero-shot cross-protocol generalization holds, the set-based formulation would meaningfully advance motion correction for quantitative MRI by removing the need for protocol-specific retraining or fixed-length assumptions. The use of a public foundation model for contrast-insensitive features and the explicit handling of permutation invariance are strengths that could extend to other variable-sequence imaging tasks. The work provides a concrete architectural decoupling from sequence length and ordering, which addresses a practical limitation in current groupwise registration networks.

major comments (2)

[Abstract / Methods] Abstract and methods description: the central claim of reliable permutation-invariant canonical reference construction via correlation-guided feature aggregation is load-bearing for the zero-shot generalization result, yet no scaling analysis, ablation on aggregation variants, or failure-case characterization is provided for L ≫ 11 or for contrast trajectories whose intensity distributions differ substantially from the STONE training distribution; this leaves the stability of the reference under the reported test conditions unverified.
[Abstract] Abstract: the assertion of 'strong cross-protocol generalization' and 'consistent improvement in downstream quantitative mapping quality' on MOLLI and ASL is presented without any reported evaluation metrics, baselines, statistical significance tests, or controls for data selection and sequence ordering; this absence prevents assessment of whether the data support the zero-shot claim.

minor comments (2)

[Methods] The notation for the set input and the precise form of the correlation-guided aggregation (e.g., whether it is attention-based pooling or weighted averaging) should be formalized with equations to improve reproducibility.
[Experiments] Figure captions and experimental tables would benefit from explicit reporting of sequence lengths and contrast types for each test case to make the variable-L and variable-contrast claims immediately verifiable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough and constructive review. We address each major comment below and describe the revisions we will make to improve the manuscript.

Referee: [Abstract / Methods] Abstract and methods description: the central claim of reliable permutation-invariant canonical reference construction via correlation-guided feature aggregation is load-bearing for the zero-shot generalization result, yet no scaling analysis, ablation on aggregation variants, or failure-case characterization is provided for L ≫ 11 or for contrast trajectories whose intensity distributions differ substantially from the STONE training distribution; this leaves the stability of the reference under the reported test conditions unverified.

Authors: We agree that additional verification of the reference construction would strengthen the work. Our experiments already demonstrate generalization to L up to 60 and to contrast dynamics in MOLLI and ASL that differ from STONE. In the revised manuscript we will add an ablation comparing aggregation variants on the existing test sets and a scaling analysis that evaluates performance as a function of sequence length within the available MOLLI and ASL data. Comprehensive failure-case characterization for contrast trajectories far outside the training distribution would require new acquisitions; we will instead expand the limitations section to discuss this boundary explicitly. revision: partial
Referee: [Abstract] Abstract: the assertion of 'strong cross-protocol generalization' and 'consistent improvement in downstream quantitative mapping quality' on MOLLI and ASL is presented without any reported evaluation metrics, baselines, statistical significance tests, or controls for data selection and sequence ordering; this absence prevents assessment of whether the data support the zero-shot claim.

Authors: We acknowledge that the abstract would be clearer with quantitative support. We will revise the abstract to include key metrics (e.g., registration accuracy and downstream mapping improvements) together with explicit references to the results section, where baselines, statistical tests, and controls for data selection and ordering are reported. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation is self-contained empirical architecture

The paper proposes AnyTwoReg as a set-based DL framework using a shared encoder, correlation-guided feature aggregation for a permutation-invariant reference, and a permutation-equivariant deformation head, plus features from an external foundation model. It is trained on one public dataset (STONE) and evaluated zero-shot on others (MOLLI, ASL). No equations, parameters, or claims reduce to self-definitions, fitted inputs renamed as predictions, or load-bearing self-citations. The central claims are architectural choices and empirical generalization results, not tautological reductions to inputs. This matches the default case of an honest non-finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that neural networks can learn permutation-invariant and equivariant mappings suitable for registration, plus the availability of contrast-insensitive features from an external foundation model. No free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Neural networks can be designed to produce permutation-invariant canonical references and permutation-equivariant deformation fields from unordered image sets.
This underpins the set formulation and is invoked to justify decoupling from sequence length and ordering.
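One way to see why this assumption is structurally plausible is that equivariance follows from invariance of the reference whenever every frame is processed by the same function. The derivation below uses the symbols from the Figure 2 caption ($\Gamma$, $R$, $\pi$, $I$); the shared per-frame map $g$ and the convention $(\pi \cdot I)_i = I_{\pi(i)}$ are assumptions introduced here for illustration, not notation taken from the paper.

```latex
\begin{align*}
\Gamma(\pi \cdot I) &= \Gamma(I)
  && \text{(permutation-invariant reference)} \\
R(I)_i &= g\bigl(I_i,\ \Gamma(I)\bigr)
  && \text{(same map $g$ for every frame; assumed form)} \\
R(\pi \cdot I)_i &= g\bigl(I_{\pi(i)},\ \Gamma(\pi \cdot I)\bigr)
  = g\bigl(I_{\pi(i)},\ \Gamma(I)\bigr)
  = R(I)_{\pi(i)}
  = \bigl(\pi \cdot R(I)\bigr)_i
\end{align*}
```

The argument holds for any set size, so it also covers the decoupling from sequence length. It says nothing about whether the learned $\Gamma$ yields a good reference, which remains the empirical part of the claim.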

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear
permutation-equivariant registration network R(π·I)=π·R(I) with invariant operator Γ
