ZoomSpec: A Physics-Guided Coarse-to-Fine Framework for Wideband Spectrum Sensing
Pith reviewed 2026-05-10 13:44 UTC · model grok-4.3
The pith
ZoomSpec integrates log-space spectrogram transforms and adaptive signal purification into a coarse-to-fine network to lift wideband spectrum sensing accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ZoomSpec is a physics-guided coarse-to-fine framework in which a Log-Space STFT overcomes the geometric limits of linear spectrograms, a Coarse Proposal Net rapidly screens the full band, an Adaptive Heterodyne Low-Pass module purifies the signal by center-frequency alignment, bandwidth-matched filtering and safe decimation, and a Fine Recognition Net fuses purified time-domain I/Q with spectral magnitude through dual-domain attention to refine boundaries and classify modulations.
What carries the argument
The Adaptive Heterodyne Low-Pass (AHLP) module, which executes center-frequency aligning, bandwidth-matched filtering, and safe decimation to remove out-of-band interference before fine recognition.
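The three AHLP steps named above (center-frequency alignment, bandwidth-matched filtering, safe decimation) can be pictured with a minimal NumPy sketch. This is an illustration under stated assumptions, not the paper's implementation: the function name `ahlp_sketch`, the `guard` factor, the windowed-sinc FIR design, and the decimation rule are all hypothetical choices made here, since the abstract does not specify them.

```python
import numpy as np

def ahlp_sketch(iq, fs, f_center, bandwidth, guard=1.25, num_taps=129):
    """Hypothetical heterodyne -> low-pass -> decimate chain (assumed design)."""
    n = np.arange(len(iq))
    # 1) Center-frequency alignment: mix the proposed band down to 0 Hz.
    baseband = iq * np.exp(-2j * np.pi * f_center * n / fs)
    # 2) Bandwidth-matched low-pass: windowed-sinc FIR whose one-sided cutoff
    #    is half the proposed bandwidth, widened by an assumed guard factor.
    cutoff = 0.5 * guard * bandwidth
    k = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 2 * cutoff / fs * np.sinc(2 * cutoff / fs * k) * np.hamming(num_taps)
    taps /= taps.sum()  # unity gain at DC
    filtered = np.convolve(baseband, taps, mode="same")
    # 3) "Safe" decimation (assumed rule): keep the new sample rate at least
    #    4x the cutoff so the filter transition band does not alias back in.
    factor = max(1, int(np.floor(fs / (4 * cutoff))))
    return filtered[::factor], fs / factor

# Usage: isolate a tone at 200 kHz next to an interferer at 260 kHz.
fs = 1e6
n = np.arange(20000)
iq = np.exp(2j * np.pi * 200e3 * n / fs) + np.exp(2j * np.pi * 260e3 * n / fs)
purified, fs_out = ahlp_sketch(iq, fs, f_center=200e3, bandwidth=20e3)
```

In this sketch the proposal's center frequency and bandwidth are taken as given inputs; in the framework described above they would come from the Coarse Proposal Net.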
If this is right
- The framework reaches 78.1 mAP@0.5:0.95 on real-world SpaceNet recordings, exceeding prior systems.
- Detection stability holds across diverse modulation bandwidths where earlier methods degrade.
- Narrowband visibility improves while constant relative resolution is preserved over wide bands.
- Out-of-band interference is suppressed before classification, reducing false boundaries.
- Dual-domain attention jointly optimizes temporal localization and modulation labels.
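The "constant relative resolution" property in the list above can be pictured as frequency bins whose spacing grows in proportion to frequency. The sketch below resamples a linear-frequency magnitude spectrogram onto geometrically spaced bins. It is only an illustration: the abstract does not say how LS-STFT is computed, and the helper names `log_spaced_bins` and `to_log_frequency`, the bin counts, and the interpolation step are assumptions introduced here.

```python
import numpy as np

def log_spaced_bins(f_min, f_max, num_bins):
    """Geometrically spaced bin centers: f[k+1] / f[k] is constant."""
    return np.geomspace(f_min, f_max, num_bins)

def to_log_frequency(mag_linear, freqs_linear, freqs_log):
    """Resample a (freq, time) magnitude spectrogram onto log-spaced bins."""
    columns = [np.interp(freqs_log, freqs_linear, frame) for frame in mag_linear.T]
    return np.stack(columns, axis=1)

# Usage with a random stand-in spectrogram (assumed sizes, not from the paper).
freqs_linear = np.linspace(0.0, 1e6, 1025)
freqs_log = log_spaced_bins(1e3, 1e6, 256)
mag_linear = np.abs(np.random.default_rng(0).standard_normal((1025, 8)))
mag_log = to_log_frequency(mag_linear, freqs_linear, freqs_log)
```

Interpolation alone does not add resolution at low frequencies; a practical design would presumably also vary the analysis window with frequency, which this sketch does not attempt.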
Where Pith is reading between the lines
- The same coarse-to-fine purification pattern could be tested on radar or underwater acoustic spectrograms that share similar leakage and resolution issues.
- Real-time deployment becomes feasible once the coarse proposal net is quantized, enabling dynamic spectrum access on resource-limited platforms.
- The log-space transform may generalize to other logarithmic frequency representations used in vibration analysis or biomedical signal processing.
Load-bearing premise
The assumption that the physics priors (log-space STFT and adaptive heterodyne filtering) fully close the domain gap between natural-image training data and real spectrograms without introducing new biases that hurt narrowband detection.
What would settle it
A direct ablation on the SpaceNet dataset in which removing the AHLP module or reverting to linear spectrograms causes mAP@0.5:0.95 to fall below the current leaderboard systems on narrowband modulations.
Original abstract
Wideband spectrum sensing for low-altitude monitoring is critical yet challenging due to heterogeneous protocols,large bandwidths, and non-stationary SNR. Existing data-driven approaches treat spectrograms as natural images,suffering from domain mismatch: they neglect time-frequency resolution constraints and spectral leakage, leading topoor narrowband visibility. This paper proposes ZoomSpec, a physics-guided coarse-to-fine framework integrating signal processing priors with deep learning. We introduce a Log-Space STFT (LS-STFT) to overcome the geometric bottleneck of linear spectrograms, sharpening narrowband structures while maintaining constant relative resolution. A lightweight Coarse Proposal Net (CPN) rapidly screens the full band. To bridge coarse detection and fine recognition, we design an Adaptive Heterodyne Low-Pass (AHLP) module that executes center-frequency aligning, bandwidth-matched filtering, and safe decimation, purifying signals of out-of-band interference. A Fine Recognition Net (FRN) fuses purified time-domain I/Q with spectral magnitude via dual-domain attention to jointly refine temporal boundaries and modulation classification. Evaluations on the SpaceNet real-world dataset demonstrate state-of-the-art 78.1 mAP@0.5:0.95, surpassing existing leaderboard systems with superior stability across diverse modulation bandwidths.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ZoomSpec, a physics-guided coarse-to-fine framework for wideband spectrum sensing. It introduces a Log-Space STFT (LS-STFT) to achieve constant relative resolution and sharpen narrowband structures in spectrograms, a Coarse Proposal Net (CPN) for initial screening, an Adaptive Heterodyne Low-Pass (AHLP) module for center-frequency alignment, bandwidth-matched filtering and safe decimation to purify signals, and a Fine Recognition Net (FRN) that fuses time-domain I/Q signals with spectral magnitude via dual-domain attention for refined boundary detection and modulation classification. The central empirical claim is state-of-the-art performance of 78.1 mAP@0.5:0.95 on the SpaceNet real-world dataset, with improved stability across diverse modulation bandwidths.
Significance. If the results hold under detailed scrutiny, the work has moderate significance for integrating signal-processing priors (LS-STFT and AHLP) with deep networks to address domain mismatch between natural-image training and spectrogram data in spectrum sensing. This could benefit applications in low-altitude monitoring with heterogeneous protocols and non-stationary conditions. The coarse-to-fine design and dual-domain fusion are conceptually coherent, but the absence of ablations and baselines in the provided text limits assessment of whether the physics components deliver the claimed gains without new biases.
major comments (2)
- Abstract: The central claim of 78.1 mAP@0.5:0.95 as state-of-the-art with superior stability is load-bearing for the paper's contribution, yet the text provides no baseline comparisons, ablation studies on LS-STFT/AHLP/FRN, implementation details, or error analysis, rendering the performance result unverifiable from the available manuscript.
- Abstract (physics priors section): The claim that LS-STFT and AHLP fully resolve domain mismatch and improve narrowband visibility without introducing artifacts is central to the motivation, but no quantitative evidence or analysis of potential biases from these modules (e.g., effects on narrowband detection) is supplied to support it.
minor comments (2)
- Abstract: Typo 'topoor' should be 'to poor'; missing space after 'protocols,'.
- Abstract: The mAP@0.5:0.95 metric is standard but would benefit from explicit definition of the IoU thresholds used for the SpaceNet evaluation.
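For context on the last comment: in the COCO convention, mAP@0.5:0.95 averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows that threshold sweep together with an IoU for axis-aligned time-frequency boxes. Whether the SpaceNet evaluation uses exactly this box format and sweep is an assumption here, and the names `box_iou` and `thresholds_passed` are hypothetical.

```python
import numpy as np

# COCO-style sweep: IoU = 0.50, 0.55, ..., 0.95 (10 thresholds).
IOU_THRESHOLDS = np.linspace(0.50, 0.95, 10)

def box_iou(a, b):
    """IoU of two boxes given as (t_start, f_low, t_end, f_high)."""
    inter_t = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_f = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_t * inter_f
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def thresholds_passed(pred, truth):
    """Count the thresholds at which this prediction matches the ground truth."""
    return int(np.sum(box_iou(pred, truth) >= IOU_THRESHOLDS))
```

A full mAP computation would additionally rank predictions by confidence and integrate precision over recall per class at each threshold; only the IoU sweep is shown here.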
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will make revisions to enhance the verifiability of our claims and the supporting evidence for the physics-guided components.
- Referee: Abstract: The central claim of 78.1 mAP@0.5:0.95 as state-of-the-art with superior stability is load-bearing for the paper's contribution, yet the text provides no baseline comparisons, ablation studies on LS-STFT/AHLP/FRN, implementation details, or error analysis, rendering the performance result unverifiable from the available manuscript.
  Authors: We acknowledge that the abstract's central performance claim requires clearer supporting evidence within the manuscript to ensure verifiability. The current text focuses on the high-level result without embedding or cross-referencing the necessary comparisons and analyses. In the revised manuscript, we will expand the Experiments section to include explicit baseline comparisons against existing leaderboard systems on the SpaceNet dataset, ablation studies isolating LS-STFT, CPN, AHLP, and FRN, detailed implementation parameters, and error analysis across modulation bandwidths. We will also revise the abstract to briefly reference the key baselines and add a compact results summary table in the introduction for immediate context.
  revision: yes
- Referee: Abstract (physics priors section): The claim that LS-STFT and AHLP fully resolve domain mismatch and improve narrowband visibility without introducing artifacts is central to the motivation, but no quantitative evidence or analysis of potential biases from these modules (e.g., effects on narrowband detection) is supplied to support it.
  Authors: We agree that the claims regarding LS-STFT and AHLP require quantitative backing to demonstrate resolution of domain mismatch, improved narrowband visibility, and absence of introduced artifacts or biases. The current manuscript motivates these modules but lacks dedicated metrics or bias analysis. In the revision, we will add ablation experiments quantifying narrowband detection performance (e.g., precision on narrowband signals) with and without LS-STFT/AHLP, along with visual and numerical analysis of potential artifacts or biases in spectrograms and detection outcomes. This will directly support the physics priors section.
  revision: yes
Circularity Check
No significant circularity in derivation chain
The abstract and available claims describe an empirical architecture (LS-STFT for spectrograms, AHLP for decimation, dual-domain FRN) whose central result is an mAP score measured on the external SpaceNet dataset. No equations, fitted parameters renamed as predictions, self-citations, or uniqueness theorems are shown that would reduce the reported performance to the inputs by construction. The physics priors are presented as design choices whose value is validated externally rather than defined circularly.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Existing data-driven spectrogram methods suffer from domain mismatch due to neglected time-frequency resolution and spectral leakage.
invented entities (2)
- Log-Space STFT (LS-STFT): no independent evidence
- Adaptive Heterodyne Low-Pass (AHLP) module: no independent evidence