pith. machine review for the scientific record.

arxiv: 2604.10112 · v2 · submitted 2026-04-11 · 💻 cs.CV

Recognition: unknown

Dual-Branch Remote Sensing Infrared Image Super-Resolution


Pith reviewed 2026-05-10 15:54 UTC · model grok-4.3

classification 💻 cs.CV
keywords infrared super-resolution · remote sensing · dual-branch architecture · transformer · state-space model · image fusion · thermal imaging

The pith

A dual-branch system fusing a transformer branch and a state-space model branch outperforms either branch alone on infrared image super-resolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that pairing a HAT-L transformer model with a MambaIRv2-L state-space model, applying test-time local conversion to the former and eight-way self-ensemble to the latter, then fusing their outputs with fixed equal weights, produces higher-quality super-resolved infrared images than using either model by itself. The authors demonstrate this improvement on twelve synthetic 4×-upscaled thermal samples derived from aerial RGB-Thermal data, where the fused result records better PSNR, SSIM, and overall Score. They argue that infrared super-resolution needs explicit local-global complementarity because thermal imagery is weakly textured and sensitive to local sharpening artifacts that can distort contours or radiometric values. A sympathetic reader would care because remote sensing applications rely on stable recovery of scene layout and target details from low-resolution thermal inputs.

Core claim

The fused output of the HAT-L and MambaIRv2-L branches outperforms either single branch in PSNR, SSIM, and the overall Score on 12 synthetic 4× thermal samples derived from Caltech Aerial RGB-Thermal. The results suggest that infrared super-resolution benefits from explicit complementarity between locally strong transformer restoration and globally stable state-space modeling.

What carries the argument

Dual-branch architecture that runs a HAT-L transformer branch and a MambaIRv2-L state-space branch in parallel, applies test-time local conversion to the first and eight-way self-ensemble to the second, then merges the results with fixed equal-weight image-space fusion.
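The fusion and ensembling steps are simple enough to sketch. Below is a minimal numpy illustration of eight-way geometric self-ensemble and fixed equal-weight image-space fusion; the `upscale` stand-in and all names are our assumptions, and the HAT-specific test-time local conversion is not reproduced here.

```python
import numpy as np

def self_ensemble_8(model, lr):
    """Average predictions over the 8 dihedral transforms (4 rotations x
    optional flip), undoing each transform before averaging."""
    outputs = []
    for k in range(4):                       # rotations by 90 degrees
        for flip in (False, True):
            x = np.rot90(lr, k)
            if flip:
                x = np.fliplr(x)
            y = model(x)
            if flip:                         # invert flip, then rotation
                y = np.fliplr(y)
            outputs.append(np.rot90(y, -k))
    return np.mean(outputs, axis=0)

def fuse_equal_weight(sr_hat, sr_mamba):
    """Fixed equal-weight image-space fusion of the two branch outputs."""
    return 0.5 * sr_hat + 0.5 * sr_mamba

# Toy stand-in for a x4 super-resolution model: nearest-neighbour upsampling.
upscale = lambda img: np.kron(img, np.ones((4, 4)))

lr = np.random.rand(16, 16)
sr_a = upscale(lr)                    # stands in for the HAT-L branch
sr_b = self_ensemble_8(upscale, lr)   # stands in for the MambaIRv2-L branch
sr = fuse_equal_weight(sr_a, sr_b)
```

With the symmetric toy model the ensemble is a no-op; with a real network the eight members differ and the average suppresses orientation-dependent artifacts.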

If this is right

  • Explicit local-global complementarity between transformer and state-space modeling improves restoration of weakly textured thermal imagery.
  • Fixed equal-weight fusion after branch-specific augmentations is sufficient to beat either branch on the evaluated synthetic thermal data.
  • The pipeline preserves target contours, scene layout, and radiometric stability better than single-branch baselines.
  • The method supplies a concrete submission for the NTIRE 2026 Infrared Image Super-Resolution Challenge.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same complementarity principle could be tested on other low-texture super-resolution domains such as medical thermal or night-vision imagery.
  • Replacing the fixed fusion weights with a small learned fusion module might further lift performance once real infrared test data become available.
  • The approach underscores the practical value of mixing attention-based and linear-time state-space models when both local artifact control and global coherence matter.
  • Evaluating the method on additional downsampling factors or sensor-specific noise models would clarify the range of conditions under which the fusion advantage persists.
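As a concrete starting point for the last bullet, here is a toy degradation model in numpy: a box-average downsampler standing in for bicubic, plus Gaussian read noise and column-wise streaks as a crude proxy for the fixed-pattern interference thermal sensors exhibit. All parameter values and names are illustrative assumptions, not the challenge's actual degradation.

```python
import numpy as np

def degrade(hr, scale=4, read_sigma=0.01, streak_sigma=0.02, seed=0):
    """Toy thermal degradation: box-average downsampling by `scale` (a
    stand-in for bicubic), Gaussian read noise, and column-wise streak
    noise as a crude model of fixed-pattern interference."""
    rng = np.random.default_rng(seed)
    h, w = hr.shape
    hc, wc = h - h % scale, w - w % scale      # crop to a multiple of scale
    lr = hr[:hc, :wc].reshape(hc // scale, scale,
                              wc // scale, scale).mean(axis=(1, 3))
    lr = lr + rng.normal(0.0, read_sigma, lr.shape)            # per-pixel noise
    lr = lr + rng.normal(0.0, streak_sigma, (1, lr.shape[1]))  # vertical streaks
    return np.clip(lr, 0.0, 1.0)

hr = np.random.rand(64, 64)
lr = degrade(hr)   # 16 x 16 degraded input for stress-testing the pipeline
```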

Load-bearing premise

That fixed equal-weight image-space fusion of the two branches after the chosen test-time tricks will generalize to real-world infrared data beyond the NTIRE challenge set and the twelve synthetic samples.

What would settle it

Running the identical dual-branch pipeline on a collection of genuinely captured low-resolution remote-sensing infrared images and verifying whether the fused output still exceeds the individual branches on PSNR, SSIM, and visual contour fidelity.
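The comparison itself is mechanical; a numpy sketch of the PSNR half of it (SSIM needs a windowed implementation and is omitted), using simulated branch outputs rather than the paper's, shows why averaging two branches with independent errors lifts the fused score.

```python
import numpy as np

def psnr(pred, gt, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((pred - gt) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
gt = rng.random((64, 64))
# Simulated branch outputs: ground truth plus independent zero-mean errors.
branch_a = gt + rng.normal(0, 0.05, gt.shape)
branch_b = gt + rng.normal(0, 0.05, gt.shape)
fused = 0.5 * (branch_a + branch_b)

scores = {name: psnr(img, gt) for name, img in
          [("HAT-L", branch_a), ("MambaIRv2-L", branch_b), ("fused", fused)]}
# Averaging independent errors halves the error variance, so the fused
# PSNR exceeds either branch by about 3 dB in this idealized setting.
```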

Figures

Figures reproduced from arXiv: 2604.10112 by Boyang Yao, Gengjia Chang, Shuhong Liu, Weijun Yuan, Xining Ge, Yifan Deng, Yihang Chen, Zhanglu Chen, Zhan Li.

Figure 1. Overview of the proposed dual-branch framework for the NTIRE 2026 remote sensing infrared super-resolution challenge. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]
Figure 2. Qualitative comparison on representative public thermal synthetic samples. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png]
read the original abstract

Remote sensing infrared image super-resolution aims to recover sharper thermal observations from low-resolution inputs while preserving target contours, scene layout, and radiometric stability. Unlike visible-image super-resolution, thermal imagery is weakly textured and more sensitive to unstable local sharpening, which makes complementary local and global modeling especially important. This paper presents our solution to the NTIRE 2026 Infrared Image Super-Resolution Challenge, a dual-branch system that combines a HAT-L branch and a MambaIRv2-L branch. The inference pipeline applies test-time local conversion on HAT, eight-way self-ensemble on MambaIRv2, and fixed equal-weight image-space fusion. We report both the official challenge score and a reproducible evaluation on 12 synthetic times-four thermal samples derived from Caltech Aerial RGB-Thermal, on which the fused output outperforms either single branch in PSNR, SSIM, and the overall Score. The results suggest that infrared super-resolution benefits from explicit complementarity between locally strong transformer restoration and globally stable state-space modeling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a dual-branch architecture for remote sensing infrared image super-resolution, combining a HAT-L transformer branch (with test-time local conversion) and a MambaIRv2-L state-space branch (with 8-way self-ensemble). Fixed equal-weight image-space fusion is applied at inference, and the fused output is claimed to outperform either branch alone in PSNR, SSIM, and overall Score on 12 synthetic 4× thermal images derived from the Caltech Aerial RGB-Thermal dataset, as well as on the official NTIRE 2026 challenge metrics. The work emphasizes the value of explicit local-global complementarity for weakly textured thermal imagery.

Significance. If the complementarity between the transformer and state-space branches can be shown to generalize beyond the synthetic test set and challenge data, the approach would provide a practical demonstration that combining locally strong restoration with globally stable modeling improves infrared super-resolution stability and contour preservation. The reproducible synthetic evaluation and challenge participation offer a concrete baseline for future dual-branch IR SR methods.

major comments (3)
  1. [Experimental Evaluation] The outperformance claim rests on only 12 synthetic 4× samples with no per-image results, standard deviation, or statistical significance tests reported. This small held-out set size makes it difficult to rule out sample-specific bias as the source of the PSNR/SSIM/Score margins rather than genuine branch complementarity.
  2. [Ablation Studies] No ablation isolates the contribution of the fixed equal-weight image-space fusion from the differing test-time pipelines (local conversion on HAT-L versus 8-way ensemble on MambaIRv2-L). Without such controls, the central empirical claim that fusion produces higher scores than either branch cannot be attributed to the dual-branch design.
  3. [Inference Pipeline] The paper presents fixed equal-weight fusion without any comparison to alternative strategies (e.g., learned fusion weights, feature-space fusion, or per-branch weighting). This omission leaves open whether the reported gains depend on the specific fusion choice or would hold under other combination methods.
minor comments (2)
  1. [Abstract and Method] The abstract states that training details are omitted, but the main text should still include at minimum the loss functions, optimizer settings, and any fine-tuning procedure used for the HAT-L and MambaIRv2-L branches to support reproducibility of the reported challenge score.
  2. [Figures and Tables] Figure captions and table headers should explicitly state that all quantitative results are on synthetic bicubic-downsampled data rather than real-world infrared imagery to avoid overgeneralization.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the empirical support and clarity of our claims.

read point-by-point responses
  1. Referee: The outperformance claim rests on only 12 synthetic 4× samples with no per-image results, standard deviation, or statistical significance tests reported. This small held-out set size makes it difficult to rule out sample-specific bias as the source of the PSNR/SSIM/Score margins rather than genuine branch complementarity.

    Authors: We acknowledge the constraint of the 12-sample reproducible test set. In the revised manuscript we will report per-image PSNR/SSIM values, the mean and standard deviation across the 12 images, and the results of paired statistical tests (Wilcoxon signed-rank) between the fused output and each single branch. We also clarify that the primary performance claims are backed by the official NTIRE 2026 challenge results on a larger unseen test set. These additions will provide quantitative evidence against sample-specific bias while respecting the fixed size of the public synthetic evaluation. revision: yes

  2. Referee: No ablation isolates the contribution of the fixed equal-weight image-space fusion from the differing test-time pipelines (local conversion on HAT-L versus 8-way ensemble on MambaIRv2-L). Without such controls, the central empirical claim that fusion produces higher scores than either branch cannot be attributed to the dual-branch design.

    Authors: We agree that isolating the fusion step is essential. We will add a controlled ablation in which the 8-way self-ensemble is applied uniformly to both branches (treating HAT-L local conversion as an additional test-time augmentation) before performing equal-weight fusion. The revised results will compare the fused output against the individual branches under this standardized pipeline, allowing clearer attribution of gains to the dual-branch complementarity. revision: yes

  3. Referee: The paper presents fixed equal-weight fusion without any comparison to alternative strategies (e.g., learned fusion weights, feature-space fusion, or per-branch weighting). This omission leaves open whether the reported gains depend on the specific fusion choice or would hold under other combination methods.

    Authors: We will expand the experimental section with comparisons to alternative fusion strategies, including (i) learned scalar weights optimized on a small validation split and (ii) a lightweight feature-space fusion module. These results will be presented alongside the fixed equal-weight baseline to demonstrate that the reported gains are not an artifact of the particular fusion rule while preserving the simplicity and reproducibility of the original approach. revision: yes
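On the first response, the W statistic of the paired Wilcoxon signed-rank test the authors promise is easy to compute. In practice one would use scipy.stats.wilcoxon, which also returns a p-value; a minimal pure-numpy version of the statistic (no tie or zero-difference corrections), run on made-up per-image PSNR values in place of the paper's, looks like this:

```python
import numpy as np

def wilcoxon_w(x, y):
    """W statistic of the paired Wilcoxon signed-rank test: the sum of the
    ranks of |x - y| over pairs where x > y. No tie correction; for real
    analyses use scipy.stats.wilcoxon."""
    d = np.asarray(x, float) - np.asarray(y, float)
    d = d[d != 0.0]                                  # drop zero differences
    ranks = np.argsort(np.argsort(np.abs(d))) + 1    # ranks 1..n of |d|
    return float(ranks[d > 0].sum())

# Illustrative per-image PSNR values (dB) for 12 images -- not the paper's data.
fused = np.array([31.2, 30.8, 29.9, 32.1, 30.5, 31.7,
                  28.9, 30.2, 31.0, 29.5, 30.9, 31.4])
branch = fused - 0.3            # fused wins on every image
w = wilcoxon_w(fused, branch)   # maximal W = 12 * 13 / 2 = 78
```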

Circularity Check

0 steps flagged

No circularity; empirical outperformance measured on held-out synthetic samples with fixed fusion

full rationale

The paper's central claim is an empirical observation that fixed equal-weight fusion of HAT-L (with test-time local conversion) and MambaIRv2-L (with 8-way ensemble) yields higher PSNR/SSIM/Score than either branch alone on 12 synthetic 4× thermal images from Caltech Aerial RGB-Thermal. No equations or derivations are presented that reduce to their own inputs; the fusion weight is stated as fixed rather than fitted to the reported metrics, the architectures are drawn from prior external work with no load-bearing self-citation, and the evaluation uses separate test samples with no indication that reported scores influenced model selection or weights. The method is a practical combination of existing components whose performance is directly measured rather than derived by construction.
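To make "fixed rather than fitted" concrete: if one did fit a scalar fusion weight on a held-out validation split, the least-squares optimum has a closed form. A numpy sketch with simulated branch outputs (all names and noise levels are our assumptions, not the paper's):

```python
import numpy as np

def fit_fusion_weight(branch_a, branch_b, gt):
    """Scalar alpha minimising ||alpha*a + (1-alpha)*b - gt||^2. Writing the
    residual via u = a - b and r = gt - b gives alpha = <u, r> / <u, u>."""
    u = (branch_a - branch_b).ravel()
    r = (gt - branch_b).ravel()
    return float(u @ r / (u @ u))

rng = np.random.default_rng(1)
gt = rng.random((32, 32))
a = gt + rng.normal(0.0, 0.02, gt.shape)   # simulated stronger branch
b = gt + rng.normal(0.0, 0.06, gt.shape)   # simulated weaker branch

alpha = fit_fusion_weight(a, b, gt)        # leans toward the stronger branch
fused = alpha * a + (1.0 - alpha) * b

mse = lambda x: float(np.mean((x - gt) ** 2))
# By construction the fitted fusion is at least as good as either branch on
# the split it was fitted on, since alpha = 1 and alpha = 0 are candidates.
```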

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical application of pre-existing models; no new theoretical constructs or fitted parameters beyond standard deep-learning practice.

pith-pipeline@v0.9.0 · 5496 in / 1063 out tokens · 33690 ms · 2026-05-10T15:54:44.178586+00:00 · methodology


Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution

    cs.CV 2026-05 unverdicted novelty 7.0

    FluxFlow is a conservative pixel-space flow-matching framework for astronomical super-resolution that incorporates real atmospheric uncertainty and a training-free Wiener correction, outperforming baselines on a new 1...

  2. FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution

    cs.CV 2026-05 unverdicted novelty 5.0

    FluxFlow uses conservative pixel-space flow-matching with uncertainty weights and Wiener test-time correction to outperform baselines on photometric and scientific accuracy for ground-to-space super-resolution, valida...

  3. Dehaze-then-Splat: Generative Dehazing with Physics-Informed 3D Gaussian Splatting for Smoke-Free Novel View Synthesis

    cs.CV 2026-04 unverdicted novelty 5.0

    Dehaze-then-Splat uses per-frame generative dehazing followed by physics-regularized 3D Gaussian Splatting to achieve 20.98 dB PSNR and 0.683 SSIM on the Akikaze scene, a 1.5 dB gain over baseline by mitigating cross-...

  4. 3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models

    cs.CV 2026-04 unverdicted novelty 5.0

    A framework that combines MLLM-based image enhancement with a medium-aware 3D Gaussian Splatting model to reconstruct and render smoke scenes.

  5. CLIP-Guided Data Augmentation for Night-Time Image Dehazing

    cs.CV 2026-04 unverdicted novelty 5.0

    CLIP-guided selection of external data plus staged NAFNet training and inference fusion provides an effective pipeline for nighttime image dehazing in the NTIRE 2026 challenge.

  6. Training-Free Model Ensemble for Single-Image Super-Resolution via Strong-Branch Compensation

    cs.CV 2026-04 unverdicted novelty 4.0

    A dual-branch training-free ensemble fuses a hybrid attention network with a Mamba-based model via weighted combination to enhance super-resolution PSNR on DIV2K x4.

  7. SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration

    cs.CV 2026-04 conditional novelty 4.0

    SmokeGS-R uses refined dark channel prior for pseudo-clean supervision to train 3DGS geometry, followed by ensemble-based appearance harmonization, achieving PSNR 15.21 and outperforming baselines on smoke restoration...

  8. Beyond Model Design: Data-Centric Training and Self-Ensemble for Gaussian Color Image Denoising

    cs.CV 2026-04 unverdicted novelty 3.0

    Expanding training data diversity, adopting two-stage optimization, and applying geometric self-ensemble raises Restormer performance on Gaussian color denoising at sigma=50 by 3.366 dB PSNR on the NTIRE 2026 validation set.

  9. NTIRE 2026 3D Restoration and Reconstruction in Real-world Adverse Conditions: RealX3D Challenge Results

    cs.CV 2026-04 unverdicted novelty 2.0

    The NTIRE 2026 challenge reports measurable progress in 3D reconstruction pipelines that handle real-world low-light and smoke degradation via the RealX3D benchmark.

Reference graph

Works this paper leans on

48 extracted references · 14 canonical work pages · cited by 8 Pith papers · 11 internal anchors
