Dual-Domain Equivariant Generative Adversarial Network for Multimodal CT-PET Synthesis

Alessandro Perelli; Alzahra Altalib; Gabriel Steele

arxiv: 2606.13341 · v1 · pith:6SGR4B6Enew · submitted 2026-06-11 · 💻 cs.CV · cs.AI · physics.med-ph

Pith reviewed 2026-06-27 07:27 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · physics.med-ph

keywords CT-PET synthesis · equivariant GAN · dual-domain learning · Fourier domain · rotational equivariance · multimodal image synthesis · medical image translation · HECKTOR dataset

The pith

DDE-GAN demonstrates that dual-domain learning combined with rotational equivariance improves CT-PET image synthesis accuracy and robustness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DDE-GAN as a method for synthesizing CT-PET images that operates in both spatial and frequency domains to capture complementary anatomical and spectral information. It embeds rotational equivariance into the losses of the generator and discriminator to enforce consistent responses under rotations. A hierarchical training strategy with multi-stage losses maintains consistency within and across domains. Tests on the HECKTOR 2022 dataset show better results than standard approaches. The authors argue this combination supports applications such as completing missing PET scans and augmenting training data.

Core claim

DDE-GAN jointly learns from spatial and Fourier domains while integrating rotational equivariance into the loss of both the generator and discriminator, using a hierarchical dual-domain training strategy that enforces intra- and inter-domain consistency through multi-stage loss functions, and achieves superior synthesis quality over baseline models on the HECKTOR 2022 CT-PET dataset.

What carries the argument

Dual-domain equivariant losses that process both spatial and frequency domains while enforcing rotational consistency directly in the generator and discriminator objectives.
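The abstract does not give the form of these losses, so the following is only a minimal sketch of what a dual-domain generator objective with a rotational-equivariance penalty could look like. It assumes square 2D slices, 90-degree rotations (`np.rot90`) as the transformation set, L1 distances in both domains, and hypothetical weights `lam_freq` and `lam_eq`. All function names are illustrative and not taken from the paper, and the discriminator-side term is omitted.

```python
import numpy as np

def spatial_loss(pred, target):
    """Mean absolute error in image space."""
    return float(np.mean(np.abs(pred - target)))

def fourier_loss(pred, target):
    """Mean magnitude of the difference between the 2D Fourier spectra."""
    diff = np.fft.fft2(pred, norm="ortho") - np.fft.fft2(target, norm="ortho")
    return float(np.mean(np.abs(diff)))

def equivariance_penalty(generator, ct, ks=(1, 2, 3)):
    """Average of |G(R_k ct) - R_k G(ct)| over 90-degree rotations R_k.

    Assumption: rotations are restricted to multiples of 90 degrees so that
    no interpolation is needed; the paper may use a different rotation set.
    """
    base = generator(ct)
    errors = [
        np.mean(np.abs(generator(np.rot90(ct, k)) - np.rot90(base, k)))
        for k in ks
    ]
    return float(np.mean(errors))

def dual_domain_equivariant_loss(generator, ct, pet, lam_freq=1.0, lam_eq=1.0):
    """Hypothetical combined objective: spatial + Fourier + equivariance terms."""
    pred = generator(ct)
    return (
        spatial_loss(pred, pet)
        + lam_freq * fourier_loss(pred, pet)
        + lam_eq * equivariance_penalty(generator, ct)
    )
```

With a toy pointwise generator such as `lambda x: 2.0 * x` the penalty vanishes, because pointwise maps commute with rotations. A generator that shifts the image along one axis (for example `lambda x: np.roll(x, 1, axis=1)`) is penalized, which is the behaviour such a term is meant to discourage.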

If this is right

Higher structural fidelity in synthesized CT-PET pairs compared to spatial-only GANs.
More reliable PET completion when one modality is missing or low-quality.
Improved data augmentation for training downstream medical imaging models.
Greater robustness to geometric variations in input scans.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dual-domain plus equivariance pattern could apply to other paired modalities such as MRI-CT.
Extending equivariance to additional transformations like scaling might further stabilize results on varied patient sizes.
Real-world deployment would require checking whether the added frequency-domain path increases inference time unacceptably for clinical workflows.

Load-bearing premise

Embedding rotational equivariance into the generator and discriminator losses improves anatomical fidelity without creating new artifacts or reducing performance on standard non-rotated clinical scans.

What would settle it

A controlled test on the same dataset where half the images are artificially rotated during training and evaluation, measuring whether synthesis metrics improve consistently and whether new artifacts appear in non-rotated test cases.
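The evaluation half of such a test could be organised along the following lines. This is a hedged sketch, not the authors' protocol: it assumes square 2D slices, 90-degree rotations only, and mean absolute error as a stand-in for whatever synthesis metrics the paper reports. `rotation_report` and its arguments are hypothetical names.

```python
import numpy as np

def mean_abs_error(pred, target):
    """Stand-in synthesis metric (lower is better)."""
    return float(np.mean(np.abs(pred - target)))

def rotation_report(generator, pairs, ks=(0, 1, 2, 3)):
    """Mean error per rotation index k over a list of (ct, pet) slice pairs.

    k = 0 corresponds to the standard, non-rotated scans; k = 1, 2, 3 to the
    same scans rotated by 90, 180 and 270 degrees. Comparing the entries shows
    whether quality depends on orientation.
    """
    report = {}
    for k in ks:
        scores = [
            mean_abs_error(generator(np.rot90(ct, k)), np.rot90(pet, k))
            for ct, pet in pairs
        ]
        report[k] = float(np.mean(scores))
    return report
```

A model whose error at `k = 0` rises after the equivariance term is added would point to degraded performance on standard scans, while a large gap between `k = 0` and the rotated entries would indicate that the term is not actually enforcing the property.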

Original abstract

We present a Dual-Domain Equivariant Generative Adversarial Network (DDE-GAN) for multimodal CT-PET image synthesis. Traditional GAN-based approaches often operate solely in the spatial domain and ignore geometric consistency, resulting in limited structural fidelity. DDE-GAN addresses these challenges by jointly learning from both spatial and frequency (Fourier) domains, capturing complementary anatomical and spectral information. Furthermore, rotational equivariance embedded in the physics of the CT and PET measurements are integrated into the loss of both the generator and discriminator to ensure consistent responses under rotations, improving anatomical accuracy. A hierarchical dual-domain training strategy enforces intra- and inter-domain consistency through multi-stage loss functions. Evaluated on the HECKTOR 2022 CT-PET dataset, DDE-GAN achieves superior synthesis quality over baseline models for CT-PET image synthesis. The results demonstrate that combining dual-domain learning with geometric equivariance substantially enhances multimodal image synthesis accuracy and robustness, enabling practical applications in PET completion and data augmentation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper proposes a dual-domain GAN with rotational equivariance added to generator and discriminator losses for CT-PET synthesis, but the abstract alone gives no evidence that the equivariance term actually works or improves results.

The main thing to know is that this work combines spatial and Fourier domain processing inside a GAN and adds rotational equivariance directly to the losses for both generator and discriminator, claiming better anatomical fidelity on the HECKTOR 2022 dataset.

The idea is new in this specific pairing for CT-PET synthesis. The paper correctly notes that standard spatial-only GANs miss geometric consistency, which is a real issue when anatomy should be invariant to rotation. The hierarchical multi-stage loss approach is a reasonable way to try enforcing both intra- and inter-domain consistency.

The soft spots are substantial because we only have the abstract. There are no equations showing how the equivariance loss is constructed, no weighting schedule, no ablation studies, and no numbers comparing against baselines. The central assumption, that a loss penalty will produce consistent rotated outputs without new artifacts or degraded performance on standard non-rotated scans, remains untested in the visible material. The known concern that loss-only equivariance, unlike architecture-level methods, often fails to enforce the property applies directly here.

This is for researchers already working on medical image synthesis GANs who want to explore physics-based priors. A reader in that niche could get a high-level design idea, but the lack of implementation details limits its immediate value.

I would send it to peer review so the full methods, training procedure, and quantitative results can be examined.
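The contrast the letter draws between loss-only and architecture-level equivariance can be illustrated with a toy example. The sketch below is not something the paper is known to use: it assumes a single zero-padded 3×3 cross-correlation layer and 90-degree rotations, and shows that averaging a kernel over its four rotations makes the layer commute with `np.rot90` by construction, whereas an arbitrary kernel generally does not. The helper names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_same(img, kernel):
    """Zero-padded 'same' cross-correlation of a 2D image with an odd-sized kernel."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = sliding_window_view(padded, (kh, kw))  # shape (H, W, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def symmetrize_c4(kernel):
    """Average a square kernel over its four 90-degree rotations."""
    return sum(np.rot90(kernel, r) for r in range(4)) / 4.0

def commutes_with_rot90(kernel, img):
    """Check whether filtering and a 90-degree rotation can be swapped."""
    rotate_then_filter = correlate_same(np.rot90(img), kernel)
    filter_then_rotate = np.rot90(correlate_same(img, kernel))
    return bool(np.allclose(rotate_then_filter, filter_then_rotate))
```

A layer built this way is equivariant for every input, regardless of training, while a loss penalty only encourages the property on the data seen during optimisation. That difference is why an ablation on rotated inputs matters for a loss-based approach.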

Referee Report

2 major / 0 minor

Summary. The paper proposes DDE-GAN, a dual-domain equivariant GAN for multimodal CT-PET synthesis. It claims to jointly learn in spatial and Fourier domains while embedding rotational equivariance (derived from CT/PET physics) directly into the generator and discriminator losses via a hierarchical multi-stage training strategy. On the HECKTOR 2022 dataset the method is reported to outperform baselines in synthesis quality, with suggested uses in PET completion and data augmentation.

Significance. If the empirical gains are reproducible and the equivariance term does not degrade performance on non-rotated clinical data, the dual-domain plus loss-based equivariance combination could offer a practical route to improved anatomical fidelity in medical image synthesis tasks.

major comments (2)

[Abstract] The central claim that 'rotational equivariance embedded in the physics of the CT and PET measurements are integrated into the loss of both the generator and discriminator' is load-bearing, yet the abstract supplies neither the explicit form of the equivariance penalty, its weighting schedule, nor any ablation isolating its effect on rotated versus non-rotated inputs; without these the risk that the term is simply down-weighted or introduces new inconsistencies cannot be evaluated.
[Abstract] No quantitative results, tables, or figures are provided, so the statement that DDE-GAN 'achieves superior synthesis quality over baseline models' cannot be inspected for effect size, statistical significance, or comparison to standard dual-domain or equivariant baselines.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments on the abstract. We address each point below and will revise the abstract accordingly to enhance clarity while preserving its conciseness.

Referee: [Abstract] The central claim that 'rotational equivariance embedded in the physics of the CT and PET measurements are integrated into the loss of both the generator and discriminator' is load-bearing, yet the abstract supplies neither the explicit form of the equivariance penalty, its weighting schedule, nor any ablation isolating its effect on rotated versus non-rotated inputs; without these the risk that the term is simply down-weighted or introduces new inconsistencies cannot be evaluated.

Authors: We agree that the abstract would benefit from additional detail on this load-bearing component. The explicit form of the equivariance penalty, derived from the rotational properties inherent to CT and PET physics, along with its weighting schedule in the hierarchical multi-stage training, is specified in the methods section of the manuscript. Ablation results isolating the contribution of the equivariance term, including performance on rotated versus non-rotated inputs, appear in the experimental results. We will revise the abstract to include a concise description of the penalty integration and its effect.

Revision: yes

Referee: [Abstract] No quantitative results, tables, or figures are provided, so the statement that DDE-GAN 'achieves superior synthesis quality over baseline models' cannot be inspected for effect size, statistical significance, or comparison to standard dual-domain or equivariant baselines.

Authors: Abstracts are typically kept free of specific numerical values for readability, but we acknowledge the referee's point that key metrics would allow better evaluation of the claimed improvements. We will revise the abstract to report the primary quantitative gains (e.g., improvements in standard synthesis metrics over the listed baselines on the HECKTOR 2022 dataset) to convey effect size without exceeding length constraints.

Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on dataset evaluation, not self-referential derivation

The abstract and available text present DDE-GAN as a proposed architecture whose performance is evaluated empirically on the HECKTOR 2022 CT-PET dataset. No equations, loss derivations, or parameter-fitting procedures are exhibited that reduce a claimed prediction back to its own inputs by construction. The integration of rotational equivariance into losses is described as a design choice motivated by physics, not as a mathematical identity derived from the model itself. No self-citations are invoked as load-bearing uniqueness theorems, and no ansatz is smuggled via prior work. The central claim therefore remains an empirical outcome rather than a tautological re-expression of fitted quantities or self-referential premises.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities are stated. Standard GAN convergence assumptions are implicitly used but not detailed.

axioms (1)

Domain assumption: standard assumptions underlying stable GAN training and convergence.
Any GAN paper relies on these background assumptions about minimax optimization.

pith-pipeline@v0.9.1-grok · 5714 in / 1201 out tokens · 20201 ms · 2026-06-27T07:27:04.087721+00:00 · methodology

Reference graph

Works this paper leans on

17 extracted references · 1 linked inside Pith

[1]

Despite its diagnostic power, dual-modality acquisition is associated with increased cost, scan time, and patient radiation exposure

INTRODUCTION Multimodal imaging, particularly Computed Tomography-Positron Emission Tomography (CT-PET), has become indispensable in modern clinical workflows, providing complementary metabolic and anatomical information essential for diagnosis, staging, and treatment planning in oncology, neurology, and cardiology [1]. Despite its diagnostic power, ...

Pith/arXiv arXiv 2026
[2]

DUAL-DOMAIN EQUIVARIANT GAN Generalised Dual-Domain Generative Framework [11] for reconstruction makes use of information in both the imaging and raw data domains to generate additional cost functions that introduce different consistency constraints allowing for incremental stages of model training. We define $x_s \in \mathbb{R}^n$ the source image and $x_s \in \mathbb{R}^n$ the targ...
[3]

The datasets were composed of data from 10 different centres, each using different imaging procedures and machines

VALIDATION AND RESULTS The selected dataset for the study is sourced from the HEad and neCK TumOR segmentation in PET/CT images (HECKTOR) 2022 [12]. The datasets were composed of data from 10 different centres, each using different imaging procedures and machines. The training and testing datasets were used, resulting in the selection of CHUS, HMR a...

2022
[4]

We extend the framework using the property of rotational equivariance which is inherited from the acquisition geometry of both PET and CT imaging

CONCLUSIONS In this work we proposed a new method for synthesis of PET images from CT based on dual domain Generative Adversarial Network. We extend the framework using the property of rotational equivariance which is inherited from the acquisition geometry of both PET and CT imaging. By enforcing this property in the training loss, the proposed DD...
[5]

ACKNOWLEDGMENT This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by the Research Ethics Committee of McGill University Health Center (Protocol Number: MM-JGH-CR15-50) for the clinical study entitled “HEad and neCK TumOR segmentation and outcome prediction: HECK- ...

2022
[6]

Multimodality imaging of structure and function,

D. W. Townsend, “Multimodality imaging of structure and function,” Physics in Medicine and Biology, vol. 53, no. 4, pp. R1–R39, 2008

2008
[7]

Deep embedding convolutional neural network for synthesizing PET images from CT scans,

L. Xiang, Z. Xu, D. Guo, et al., “Deep embedding convolutional neural network for synthesizing PET images from CT scans,” IEEE Transactions on Medical Imaging, vol. 37, no. 3, pp. 982–993, 2018

2018
[8]

Medical image synthesis with context-aware generative adversarial networks,

D. Nie, R. Trullo, J. Lian, et al., “Medical image synthesis with context-aware generative adversarial networks,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, 2017, pp. 417–425

2017
[9]

Group equivariant convolutional networks,

T. S. Cohen and M. Welling, “Group equivariant convolutional networks,” in International Conference on Machine Learning (ICML), 2016, pp. 2990–2999

2016
[10]

3D steerable CNNs: Learning rotationally equivariant features in volumetric data,

M. Weiler, F. A. Hamprecht, and M. Storath, “3D steerable CNNs: Learning rotationally equivariant features in volumetric data,” in Advances in Neural Information Processing Systems (NeurIPS), 2018, pp. 10381–10392

2018
[11]

Image-to-image translation with conditional adversarial networks,

P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5967–5976

2017
[12]

Deep MR to CT synthesis using unpaired data,

J. M. Wolterink, T. Leiner, M. A. Viergever, and I. Išgum, “Deep MR to CT synthesis using unpaired data,” in Simulation and Synthesis in Medical Imaging (SASHIMI), ser. Lecture Notes in Computer Science, vol. 10557. Springer, 2017, pp. 14–23

2017
[13]

Multimodal MR synthesis via modality-invariant latent representation,

A. Chartsias, T. Joyce, and S. A. Tsaftaris, “Multimodal MR synthesis via modality-invariant latent representation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, 2018, pp. 371–379

2018
[14]

PETNet: Cross-modality PET image synthesis from CT using 3D GAN,

Y. Huang, L. Bi, K. Zhang, et al., “PETNet: Cross-modality PET image synthesis from CT using 3D GAN,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, 2021, pp. 202–212

2021
[15]

Robust equivariant imaging: a fully unsupervised framework for learning to image from noisy and partial measurements,

D. Chen, J. Tachella, and M. E. Davies, “Robust equivariant imaging: a fully unsupervised framework for learning to image from noisy and partial measurements,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5647–5656

2022
[16]

A generalized dual-domain generative framework with hierarchical consistency for medical image reconstruction and synthesis,

J. Zhang, K. Sun, J. Yang, Y. Hu, Y. Gu, Z. Cui, X. Zong, F. Gao, and D. Shen, “A generalized dual-domain generative framework with hierarchical consistency for medical image reconstruction and synthesis,” Communications Engineering, vol. 2, no. 1, p. 72, 2023

2023
[17]

Overview of the HECKTOR challenge at MICCAI 2022: automatic head and neck tumor segmentation and outcome prediction in PET/CT,

V. Andrearczyk, V. Oreiller, M. Abobakr, A. Akhavanallaf, P. Balermpas, S. Boughdad, L. Capriotti, J. Castelli, C. Cheze Le Rest, P. Decazes et al., “Overview of the HECKTOR challenge at MICCAI 2022: automatic head and neck tumor segmentation and outcome prediction in PET/CT,” in 3D Head and Neck Tumor Segmentation in PET/CT Challenge. Springer, 2022, pp. 1–30

2022
