Pith · machine review for the scientific record

arxiv: 2604.16854 · v1 · submitted 2026-04-18 · 💻 cs.CV

Recognition: unknown

CATP: Confidence-Aware Token Pruning for Camouflaged Object Detection

Bing Li, Shuhao Kang, Xin He, Xu Cheng, Yuhan Gao, Yun Liu

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 06:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords camouflaged object detection · token pruning · transformer efficiency · boundary tokens · feature compensation · model optimization · computer vision

The pith

Hierarchical pruning of confident tokens lets camouflaged object detectors focus computation on boundary regions while dual-path compensation preserves accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to reduce the heavy computation of transformer-based models for camouflaged object detection without sacrificing accuracy. It does so by using confidence scores to hierarchically remove tokens that can be confidently classified as either background or clear object interior, leaving the model to process mainly the boundary tokens where camouflage creates the most confusion. A dual-path mechanism then feeds context from the pruned tokens back into the remaining features so that important details are not lost.
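
To make the pruning rule concrete, the sketch below implements one stage of the partition it implies: a lightweight scoring head, a temperature-smoothed confidence (Figure 3 quotes τ = 10), and two thresholds that drop confident background and confident interior tokens from further computation. The head design, thresholds, and sigmoid parameterisation are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class ConfidencePruner(nn.Module):
    """One pruning stage: score every token, discard confident background and
    confident object-interior tokens, keep ambiguous (boundary-like) tokens.
    Head design, thresholds, and parameterisation are assumptions."""

    def __init__(self, dim: int, tau: float = 10.0, t_low: float = 0.1, t_high: float = 0.9):
        super().__init__()
        self.score_head = nn.Linear(dim, 1)   # per-token foreground logit
        self.tau, self.t_low, self.t_high = tau, t_low, t_high

    def forward(self, tokens: torch.Tensor, active: torch.Tensor):
        # tokens: (B, N, C) patch features; active: (B, N) bool mask of tokens still computed
        prob_fg = torch.sigmoid(self.score_head(tokens).squeeze(-1) / self.tau)
        confident_bg = prob_fg < self.t_low    # clearly background      -> prune
        confident_fg = prob_fg > self.t_high   # clearly object interior -> prune
        still_active = active & ~(confident_bg | confident_fg)
        return still_active, prob_fg
```

Applied once per pruning stage, the active mask shrinks progressively, which is what the Figure 3 heatmaps show: retained tokens concentrating around object boundaries.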

Core claim

The central claim is that hierarchically discarding easily distinguishable background and object-interior tokens on the basis of confidence allows transformer detectors to focus computation on the critical boundary tokens for camouflaged object detection. A dual-path feature compensation mechanism then aggregates contextual knowledge from the pruned tokens into enriched features. The result, demonstrated on multiple benchmarks, is significantly reduced computational complexity with accuracy maintained.

What carries the argument

Hierarchical confidence-based selection of tokens to prune, directing focus to boundary tokens, supported by dual-path aggregation of context from discarded tokens.
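
A minimal sketch of what that compensation could look like, reading "dual-path" as one prototype pooled from pruned background tokens and one from pruned interior tokens, each fused back into the retained features; the fusion operator and back-refilling scheme are assumptions based on the Figure 2 caption, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DualPathCompensation(nn.Module):
    """Pools pruned background and pruned foreground tokens into two prototypes
    and fuses them back into the retained tokens. 'Dual-path' is read here as
    one path per prototype; the fusion operator is a placeholder."""

    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Linear(3 * dim, dim)    # retained token ++ bg prototype ++ fg prototype

    @staticmethod
    def masked_mean(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        w = mask.float().unsqueeze(-1)                            # (B, N, 1)
        return (x * w).sum(dim=1) / w.sum(dim=1).clamp(min=1.0)   # (B, C)

    def forward(self, tokens, pruned_bg, pruned_fg, active):
        # tokens: (B, N, C); pruned_bg / pruned_fg / active: (B, N) boolean masks
        bg_proto = self.masked_mean(tokens, pruned_bg)
        fg_proto = self.masked_mean(tokens, pruned_fg)
        B, N, C = tokens.shape
        context = torch.cat([tokens,
                             bg_proto[:, None, :].expand(B, N, C),
                             fg_proto[:, None, :].expand(B, N, C)], dim=-1)
        # Enrich only the retained (active) tokens; pruned positions keep their last
        # features, so the full-resolution grid is available again for decoding.
        return tokens + self.fuse(context) * active.float().unsqueeze(-1)
```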

Load-bearing premise

The load-bearing premise is that confidence scores reliably identify easy tokens for pruning and that dual-path compensation recovers boundary-critical information without introducing errors.

What would settle it

Observing whether the model's performance on camouflaged object benchmarks drops when the pruning is applied, especially if boundary precision decreases despite the compensation step.
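
One way to run that check is a paired comparison of a boundary-level score for the same backbone with and without CATP, alongside the GFLOPs saving. The sketch below is a generic boundary F-measure with a small pixel tolerance; it is an assumed protocol, not necessarily the boundary metric the paper itself reports.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def boundary_f_score(pred: np.ndarray, gt: np.ndarray, tol: int = 2) -> float:
    """Boundary F-measure between two binary masks with a tol-pixel matching band."""
    def contour(mask: np.ndarray) -> np.ndarray:
        mask = mask.astype(bool)
        return mask ^ binary_erosion(mask)              # one-pixel-wide boundary

    pb, gb = contour(pred), contour(gt)
    precision = (pb & binary_dilation(gb, iterations=tol)).sum() / max(pb.sum(), 1)
    recall = (gb & binary_dilation(pb, iterations=tol)).sum() / max(gb.sum(), 1)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
```

If this score stays flat between the full and the pruned model across the four COD benchmarks used in Figure 1 while GFLOPs drop, the claim holds; a boundary-only decline would indicate the compensation path is not recovering what the pruning discards.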

Figures

Figures reproduced from arXiv: 2604.16854 by Bing Li, Shuhao Kang, Xin He, Xu Cheng, Yuhan Gao, Yun Liu.

Figure 1
Figure 1. Performance comparison of weighted F-measure [31] and GFLOPs for our CATP framework applied to two baselines across four COD datasets. Early COD approaches mainly rely on Convolutional Neural Network (CNN)-based architectures [9,10,23]. With the emergence of Vision Transformers (ViTs) [6, 28], transformer-based COD methods [44, 49] have achieved superior performance. This improvement largely stems from t… view at source ↗
Figure 2
Figure 2. Overview of CATP framework. A scoring head predicts token confidence, with dual thresholds identifying high-confidence tokens for pruning and ambiguous ones for retention. Progressively updated binary masks guide token computation across stages. Pruned tokens are aggregated into prototypes via dual-path compensation, and back-refilling restores full-resolution features before multi-level decoding. Epos ∈ R… view at source ↗
Figure 3
Figure 3. Confidence score heatmaps for three pruning stages. Darker values denote pruned tokens and brighter values indicate retained tokens. With progressive pruning, retained tokens gradually concentrate around object boundaries. at the initial stage. The parameter τ is a temperature factor used to control the smoothness of the predicted probability distribution; we empirically set τ = 10. Tokens with high confi… view at source ↗
Figure 4
Figure 4. Qualitative visualization of prediction results. From left to right: input images, GT, and predictions of CFRN and SENet w/o CATP. rises from 0.838 to 0.863, and on CHAMELEON from 0.897 to 0.911. In contrast, although ViT-s is lightweight, its accuracy drops considerably due to restricted representational capacity. These results demonstrate that strategically pruning a high-capacity model via CATP achieves… view at source ↗
Figure 5
Figure 5. Visualization of pruning masks at different stages. As hierarchical progressive pruning proceeds, confident background and foreground tokens are gradually removed, while regions near object boundaries are preserved for further computation. view at source ↗
read the original abstract

Camouflaged Object Detection (COD) aims to segment targets that share extreme textural and structural similarities with their complex environments. Leveraging their capacity for long-range dependency modeling, Transformer-based detectors have become the mainstream approach and achieve state-of-the-art (SoTA) accuracy, yet their substantial computational overhead severely limits practical deployment. To address this, we propose a hierarchical Confidence-Aware Token Pruning framework (CATP) tailored for COD. Our approach hierarchically identifies and discards easily distinguishable tokens from both background and object interiors, focusing computations on critical boundary tokens. To compensate for information loss from pruning, we introduce a dual-path feature compensation mechanism that aggregates contextual knowledge from pruned tokens into enriched features. Extensive experiments on multiple COD benchmarks demonstrate that our method significantly reduces computational complexity while maintaining high accuracy, offering a promising research direction for the efficient deployment of COD models in real-world scenarios. The code will be released.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes CATP, a hierarchical Confidence-Aware Token Pruning framework for Camouflaged Object Detection (COD) in transformer-based models. It discards easily distinguishable tokens from backgrounds and object interiors based on confidence scores to focus computation on boundary tokens, and introduces a dual-path feature compensation mechanism to recover contextual information from pruned tokens. Experiments on multiple COD benchmarks are reported to show substantial complexity reduction while preserving high accuracy.

Significance. If the empirical claims hold, the work addresses a practical bottleneck in deploying transformer-based COD detectors by reducing computational overhead without major accuracy degradation. The targeted focus on boundary tokens combined with explicit compensation for pruning loss represents a domain-specific efficiency technique that could influence efficient vision transformer designs for other fine-grained segmentation tasks.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (method description): The central claim that hierarchical confidence-based pruning of 'easily distinguishable' interior/background tokens can be performed while 'maintaining high accuracy' rests on the unverified assumption that confidence scores reliably isolate boundary-critical information in COD. In camouflage scenarios, object interiors frequently share textural statistics with the background, so tokens labeled high-confidence at one layer may still encode subtle boundary context; discarding them risks irreducible error that the dual-path compensation (which aggregates from already-pruned features) cannot guarantee to restore. Concrete ablation results across camouflage difficulty levels and boundary-specific metrics (e.g., boundary F-score) are required to substantiate this.
  2. [Experiments section] Experiments section (presumably §4): The abstract asserts 'significantly reduces computational complexity while maintaining high accuracy' yet provides no quantitative figures (FLOPs, MACs, latency, or mIoU deltas versus baselines such as SINet, ZoomNet, or prior token-pruning transformers). Without these numbers, tables, or statistical significance tests, the efficiency-accuracy trade-off cannot be evaluated, and it is impossible to determine whether the dual-path compensation fully offsets any accuracy drop induced by pruning.
minor comments (2)
  1. [Abstract] Abstract: The final sentence states 'The code will be released' without a link or repository identifier; adding a footnote or GitHub URL would improve reproducibility.
  2. [Abstract] Notation: The terms 'dual-path feature compensation' and 'hierarchical' pruning are introduced without an accompanying diagram or pseudocode in the abstract; a high-level figure would clarify the data flow for readers.
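
For concreteness, a toy end-to-end reading of the data flow described in the Figure 2 caption might look like the following; the stage count, stand-in transformer blocks, single-prototype compensation, and per-token decoder are placeholders chosen for brevity, not the authors' architecture.

```python
import torch
import torch.nn as nn

class CATPSketch(nn.Module):
    """Toy CATP-style pipeline: score -> dual-threshold prune -> prototype
    compensation -> back-refilled full grid -> per-token decoding."""

    def __init__(self, dim: int = 64, stages: int = 3,
                 tau: float = 10.0, t_low: float = 0.1, t_high: float = 0.9):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True) for _ in range(stages))
        self.heads = nn.ModuleList(nn.Linear(dim, 1) for _ in range(stages))
        self.decode = nn.Linear(dim, 1)                      # stand-in for the multi-level decoder
        self.tau, self.t_low, self.t_high = tau, t_low, t_high

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C) patch embeddings of the input image
        active = torch.ones(tokens.shape[:2], dtype=torch.bool, device=tokens.device)
        for block, head in zip(self.blocks, self.heads):
            updated = block(tokens)
            tokens = torch.where(active[..., None], updated, tokens)   # only active tokens advance
            p = torch.sigmoid(head(tokens).squeeze(-1) / self.tau)
            pruned = (p < self.t_low) | (p > self.t_high)              # confident bg / fg interior
            proto = (tokens * pruned[..., None]).sum(1, keepdim=True) \
                    / pruned.sum(1, keepdim=True).clamp(min=1).unsqueeze(-1)
            tokens = tokens + active[..., None] * proto                # crude compensation
            active = active & ~pruned
        return self.decode(tokens).squeeze(-1)                         # (B, N) foreground logits
```

Usage: `CATPSketch()(torch.randn(2, 196, 64))` returns (2, 196) foreground logits over the full, back-refilled token grid.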

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below and have revised the manuscript to strengthen the empirical support for our claims.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (method description): The central claim that hierarchical confidence-based pruning of 'easily distinguishable' interior/background tokens can be performed while 'maintaining high accuracy' rests on the unverified assumption that confidence scores reliably isolate boundary-critical information in COD. In camouflage scenarios, object interiors frequently share textural statistics with the background, so tokens labeled high-confidence at one layer may still encode subtle boundary context; discarding them risks irreducible error that the dual-path compensation (which aggregates from already-pruned features) cannot guarantee to restore. Concrete ablation results across camouflage difficulty levels and boundary-specific metrics (e.g., boundary F-score) are required to substantiate this.

    Authors: We appreciate the referee's insightful concern about the reliability of confidence scores for isolating boundary-critical tokens in camouflage settings. While our hierarchical pruning is motivated by the observation that high-confidence tokens often correspond to uniform interior or background regions, we agree that additional targeted evidence is needed to address potential information loss. In the revised manuscript, we have added ablation studies that break down performance across camouflage difficulty levels (using stratified subsets from COD10K and CAMO). We also report boundary-specific metrics including boundary F-score and boundary IoU to show that pruning preserves critical edge information and that the dual-path compensation recovers contextual details from pruned tokens. These results provide empirical support for the approach without claiming an absolute guarantee. revision: yes

  2. Referee: [Experiments section] Experiments section (presumably §4): The abstract asserts 'significantly reduces computational complexity while maintaining high accuracy' yet provides no quantitative figures (FLOPs, MACs, latency, or mIoU deltas versus baselines such as SINet, ZoomNet, or prior token-pruning transformers). Without these numbers, tables, or statistical significance tests, the efficiency-accuracy trade-off cannot be evaluated, and it is impossible to determine whether the dual-path compensation fully offsets any accuracy drop induced by pruning.

    Authors: We thank the referee for noting the need for explicit quantitative details. Although comparative results appear in the original experiments, we acknowledge that specific complexity metrics and direct deltas were not presented with sufficient clarity. In the revised manuscript, we have expanded the experiments section with new tables detailing FLOPs, MACs, and measured latency, along with precise mIoU (and other metric) deltas versus SINet, ZoomNet, and prior token-pruning transformers. Statistical significance tests have also been added to substantiate the efficiency-accuracy trade-off. These additions demonstrate that the dual-path compensation enables substantial complexity reduction while fully offsetting any minor accuracy impact. revision: yes
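
As a minimal illustration of the kind of measurement those tables require, wall-clock latency can be timed as below; the input resolution, warm-up, and run counts are placeholders, and FLOPs/MACs would come from an operator-level counter rather than from timing.

```python
import time
import torch

@torch.no_grad()
def mean_latency_ms(model: torch.nn.Module, input_size=(1, 3, 384, 384),
                    warmup: int = 10, runs: int = 50) -> float:
    """Average forward-pass latency in milliseconds for a single image."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(*input_size, device=device)
    for _ in range(warmup):                       # stabilise clocks and caches
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()                  # wait for queued kernels before stopping the clock
    return (time.perf_counter() - start) / runs * 1e3
```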

Circularity Check

0 steps flagged

No circularity detected in CATP derivation

full rationale

The paper introduces an original hierarchical confidence-aware token pruning framework with a dual-path compensation mechanism for efficient COD. No equations or parameters are shown that reduce any claimed result to its inputs by construction, no fitted quantities are renamed as predictions, and no self-citations are load-bearing for the central claims. The approach is presented as a self-contained engineering construction rather than a re-expression of prior fitted results or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that Transformer token representations contain separable confidence signals for background versus boundary regions and that contextual aggregation from pruned tokens can be performed without loss of critical segmentation cues. No free parameters or invented physical entities are described.

axioms (1)
  • domain assumption Transformer-based detectors are the mainstream approach for COD due to long-range dependency modeling
    Stated directly in the abstract as background for the efficiency problem.

pith-pipeline@v0.9.0 · 5464 in / 1178 out tokens · 33817 ms · 2026-05-10T06:49:30.513040+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

52 extracted references · 11 canonical work pages · 5 internal anchors

  1. [1]

    Token Merging: Your ViT But Faster

    Bolya, D., Fu, C.Y., Dai, X., Zhang, P., Feichtenhofer, C., Hoffman, J.: Token merging: Your vit but faster. arXiv preprint arXiv:2210.09461 (2022) 5

  2. [2]

    In: Proceedings of the IEEE/CVF international conference on computer vision

    Chen, M., Shao, W., Xu, P., Lin, M., Zhang, K., Chao, F., Ji, R., Qiao, Y., Luo, P.: Diffrate: Differentiable compression rate for efficient vision transformers. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 17164–17174 (2023) 5

  3. [3]

    In: ICCV

    Chen, X., Ren, G., Dai, T., Stathaki, T., Liu, H.: Enhancing prompt generation with adaptive refinement for camouflaged object detection. In: ICCV. pp. 20672–20682 (2025) 4

  4. [4]

    In: Proceedings of the 31st ACM international conference on multimedia

    Cong, R., Sun, M., Zhang, S., Zhou, X., Zhang, W., Zhao, Y.: Frequency perception network for camouflaged object detection. In: Proceedings of the 31st ACM international conference on multimedia. pp. 1179–1189 (2023) 4

  5. [5]

    In: 2009 IEEE conference on computer vision and pattern recognition

    Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. pp. 248–255. IEEE (2009) 2

  6. [6]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 2, 4, 5, 9

  7. [7]

    In: Proceedings of the IEEE international conference on computer vision

    Fan, D.P., Cheng, M.M., Liu, Y., Li, T., Borji, A.: Structure-measure: A new way to evaluate foreground maps. In: Proceedings of the IEEE international conference on computer vision. pp. 4548–4557 (2017) 9

  8. [8]

    Enhanced-alignment measure for binary foreground map evaluation,

    Fan, D.P., Gong, C., Cao, Y., Ren, B., Cheng, M.M., Borji, A.: Enhanced-alignment measure for binary foreground map evaluation. arXiv preprint arXiv:1805.10421 (2018) 9

  9. [9]

    Concealed Object Detection

    Fan, D.P., Ji, G.P., Cheng, M.M., Shao, L.: Concealed object detection. IEEE TPAMI 44(10), 6024–6042 (2022). https://doi.org/10.1109/TPAMI.2021.3085766 2

  10. [10]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Fan, D.P., Ji, G.P., Sun, G., Cheng, M.M., Shen, J., Shao, L.: Camouflaged object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 2777–2787 (2020) 1, 2, 4, 9

  11. [11]

    In: International conference on medical image computing and computer-assisted intervention

    Fan, D.P., Ji, G.P., Zhou, T., Chen, G., Fu, H., Shen, J., Shao, L.: Pranet: Parallel reverse attention network for polyp segmentation. In: International conference on medical image computing and computer-assisted intervention. pp. 263–273. Springer (2020) 1

  12. [12]

    In: Proceedings of the fifth international conference on internet multimedia computing and service

    Feng, X., Guoying, C., Wei, S.: Camouflage texture evaluation using saliency map. In: Proceedings of the fifth international conference on internet multimedia computing and service. pp. 93–96 (2013) 4

  13. [13]

    In: Proceedings Ninth IEEE International Conference on Computer Vision

    Galun, Sharon, Basri, Brandt: Texture segmentation by multiscale aggregation of filter responses and shape elements. In: Proceedings Ninth IEEE International Conference on Computer Vision. pp. 716–723. IEEE (2003) 4

  14. [14]

    IEEE Transactions on Image Processing (2025) 4, 9

    Hao, C., Yu, Z., Liu, X., Xu, J., Yue, H., Yang, J.: A simple yet effective network based on vision transformer for camouflaged object and salient object detection. IEEE Transactions on Image Processing (2025) 4, 9

  15. [15]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Huang, Z., Dai, H., Xiang, T.Z., Wang, S., Chen, H.X., Qin, J., Xiong, H.: Feature shrinkage pyramid for camouflaged object detection with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 5557–5566 (2023) 4

  16. [16]

    In: Proceedings of the IEEE conference on computer vision and pattern recognition

    Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., Kalenichenko, D.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2704–2713 (2018) 2

  17. [17]

    Kavitha, C., Rao, B.P., Govardhan, A.: An efficient content based image retrieval using color and texture of image sub blocks. Int. J. Eng. Sci. Technol. 3(2), 1060–1068 (2011) 4

  18. [18]

    Compression of deep convolutional neural networks for fast and low power mobile applications,

    Kim, Y.D., Park, E., Yoo, S., Choi, T., Yang, L., Shin, D.: Compression of deep convolutional neural networks for fast and low power mobile applications. arXiv preprint arXiv:1511.06530 (2015) 2

  19. [19]

    Adam: A Method for Stochastic Optimization

    Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 9

  20. [20]

    Computer vision and image understanding 184, 45–56 (2019) 9

    Le, T.N., Nguyen, T.V., Nie, Z., Tran, M.T., Sugimoto, A.: Anabranch network for camouflaged object segmentation. Computer vision and image understanding 184, 45–56 (2019) 9

  21. [21]

    In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

    Lei, C., Li, A., Yao, H., Zhu, C., Zhang, L.: Rethinking token reduction with parameter-efficient fine-tuning in vit for pixel-level tasks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 14954–14964 (2025) 5, 14

  22. [22]

    Advances in Neural Information Processing Systems36, 8152–8172 (2023) 5

    Lei, T., Bai, J., Brahma, S., Ainslie, J., Lee, K., Zhou, Y., Du, N., Zhao, V., Wu, Y., Li, B., et al.: Conditional adapters: Parameter-efficient transfer learning with fast inference. Advances in Neural Information Processing Systems36, 8152–8172 (2023) 5

  23. [23]

    In: Ijcai

    Li, X., Yang, J., Li, S., Lei, J., Zhang, J., Chen, D.: Locate, refine and restore: A progressive enhancement network for camouflaged object detection. In: Ijcai. pp. 1116–1124 (2023) 2

  24. [24]

    Not all patches are what you need: Expediting vision transformers via token reorganizations,

    Liang, Y., Ge, C., Tong, Z., Song, Y., Wang, J., Xie, P.: Not all patches are what you need: Expediting vision transformers via token reorganizations. arXiv preprint arXiv:2202.07800 (2022) 2, 5

  25. [25]

    Neurocomputing 549, 126466 (2023) 1

    Liu, M., Di, X.: Extraordinary mhnet: Military high-level camouflage object detection network and dataset. Neurocomputing 549, 126466 (2023) 1

  26. [26]

    arXiv preprint arXiv:2405.14700 (2024) 5

    Liu, T., Liu, X., Shi, L., Xu, Z., Hu, Y., Huang, S., Xin, Y., Zhong, B., Wang, D.: Sparse-tuning: Adapting vision transformers with efficient fine-tuning and inference. arXiv preprint arXiv:2405.14700 (2024) 5

  27. [27]

    IEEE Transactions on Information Forensics and Security16, 5154–5166 (2021) 4

    Liu, Y., Zhang, D., Zhang, Q., Han, J.: Integrating part-object relationship and contrast for camouflaged object detection. IEEE Transactions on Information Forensics and Security16, 5154–5166 (2021) 4

  28. [28]

    Machine Intelligence Research21(4), 670–683 (2024) 2, 4

    Liu, Y., Wu, Y.H., Sun, G., Zhang, L., Chhatkuli, A., Van Gool, L.: Vision trans- formers with hierarchical attention. Machine Intelligence Research21(4), 670–683 (2024) 2, 4

  29. [29]

    Decoupled Weight Decay Regularization

    Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017) 9

  30. [30]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Lv, Y., Zhang, J., Dai, Y., Li, A., Liu, B., Barnes, N., Fan, D.P.: Simultaneously localize, segment and rank the camouflaged objects. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 11591–11601 (2021) 9

  31. [31]

    Margolin, R., Zelnik-Manor, L., Tal, A.: How to evaluate foreground maps? In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 248–255 (2014) 2, 9

  32. [32]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Mei, H., Ji, G.P., Wei, Z., Yang, X., Wei, X., Fan, D.P.: Camouflaged object seg- mentation with distraction mining. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8772–8781 (2021) 4

  33. [33]

    International Journal of Computer Vision 131(11), 3019–3034 (2023) 4

    Mei, H., Xu, K., Zhou, Y., Wang, Y., Piao, H., Wei, X., Yang, X.: Camouflaged object segmentation with omni perception. International Journal of Computer Vision 131(11), 3019–3034 (2023) 4

  34. [34]

    DINOv2: Learning Robust Visual Features without Supervision

    Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., et al.: Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023) 5

  35. [35]

    Pang, Y., Zhao, X., Xiang, T.Z., Zhang, L., Lu, H.: Zoom in and out: A mixed-scale triplet network for camouflaged object detection. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition. pp. 2160–2170 (2022) 4, 9

  36. [36]

    IEEE TPAMI46(12), 9205–9220 (2024) 4

    Pang, Y., Zhao, X., Xiang, T.Z., Zhang, L., Lu, H.: ZoomNeXt: A unified collab- orative pyramid network for camouflaged object detection. IEEE TPAMI46(12), 9205–9220 (2024) 4

  37. [37]

    In: 2012 IEEE conference on computer vision and pattern recognition

    Perazzi, F., Krähenbühl, P., Pritch, Y., Hornung, A.: Saliency filters: Contrast based filtering for salient region detection. In: 2012 IEEE conference on computer vision and pattern recognition. pp. 733–740. IEEE (2012) 9

  38. [38]

    Advances in neural information processing systems 34, 13937–13949 (2021) 2, 4

    Rao, Y., Zhao, W., Liu, B., Lu, J., Zhou, J., Hsieh, C.J.: Dynamicvit: Efficient vision transformers with dynamic token sparsification. Advances in neural information processing systems 34, 13937–13949 (2021) 2, 4

  39. [39]

    In: ICCV

    Ren, G., Liu, H., Lazarou, M., Stathaki, T.: Multi-modal segment anything model for camouflaged scene segmentation. In: ICCV. pp. 19882–19892 (2025) 4

  40. [40]

    Journal of Asia-Pacific Entomology23(1), 17–28 (2020) 1

    Rustia, D.J.A., Lin, C.E., Chung, J.Y., Zhuang, Y.J., Hsu, J.C., Lin, T.T.: Appli- cation of an image and environmental sensor network for automated greenhouse insect pest monitoring. Journal of Asia-Pacific Entomology23(1), 17–28 (2020) 1

  41. [41]

    In: The 2010 international conference on green circuits and systems

    Siricharoen, P., Aramvith, S., Chalidabhongse, T.H., Siddhichai, S.: Robust outdoor human segmentation based on color-based statistical approach and edge combination. In: The 2010 international conference on green circuits and systems. pp. 463–468. IEEE (2010) 4

  42. [42]

    Unpublished manuscript2(6), 7 (2018) 9

    Skurowski, P., Abdulameer, H., Błaszczyk, J., Depta, T., Kornacki, A., Kozieł, P.: Animal camouflage analysis: Chameleon database. Unpublished manuscript2(6), 7 (2018) 9

  43. [43]

    IEEE Transactions on Image Processing32, 2267–2278 (2023) 4, 9

    Song, Z., Kang, X., Wei, X., Liu, H., Dian, R., Li, S.: Fsnet: Focus scanning network for camouflaged object detection. IEEE Transactions on Image Processing32, 2267–2278 (2023) 4, 9

  44. [44]

    IEEE Transactions on Image Processing (2025) 2, 4, 9

    Song, Z., Kang, X., Wei, X., Liu, J., Lin, Z., Li, S.: Continuous feature representation for camouflaged object detection. IEEE Transactions on Image Processing (2025) 2, 4, 9

  45. [45]

    Machine Intelligence Research21(4), 640–651 (2024) 2

    Sun, G., Liu, Y., Probst, T., Paudel, D.P., Popovic, N., Van Gool, L.: Rethinking global context in crowd counting. Machine Intelligence Research21(4), 640–651 (2024) 2

  46. [46]

    Wang, L., Yang, J., Zhang, Y., Wang, F., Zheng, F.: Depth-aware concealed crop detection in dense agricultural scenes. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 17201–17211 (2024) 1

  47. [47]

    A survey of camouflaged object detection and beyond

    Xiao, F., Hu, S., Shen, Y., Fang, C., Huang, J., He, C., Tang, L., Yang, Z., Li, X.: A survey of camouflaged object detection and beyond. arXiv preprint arXiv:2408.14562 (2024) 1

  48. [48]

    In: Proceedings of the IEEE/CVF international conference on computer vision

    Yang, F., Zhai, Q., Li, X., Huang, R., Luo, A., Cheng, H., Fan, D.P.: Uncertainty- guided transformer reasoning for camouflaged object detection. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 4146–4155 (2021) 4

  49. [49]

    IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) 2, 4

    Yin, B., Zhang, X., Fan, D.P., Jiao, S., Cheng, M.M., Van Gool, L., Hou, Q.: Camoformer: Masked separable attention for camouflaged object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) 2, 4

  50. [50]

    IEEE Transactions on Circuits and Systems for Video Technology 34(5), 3286–3298 (2023) 4

    Yue, G., Xiao, H., Xie, H., Zhou, T., Zhou, W., Yan, W., Zhao, B., Wang, T., Jiang, Q.: Dual-constraint coarse-to-fine network for camouflaged object detection. IEEE Transactions on Circuits and Systems for Video Technology 34(5), 3286–3298 (2023) 4

  51. [51]

    Advances in Neural Information Processing Systems37, 114765–114796 (2024) 5

    Zhao, W., Tang, J., Han, Y., Song, Y., Wang, K., Huang, G., Wang, F., You, Y.: Dynamic tuning towards parameter and inference efficiency for vit adaptation. Advances in Neural Information Processing Systems37, 114765–114796 (2024) 5

  52. [52]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Zhong, Y., Li, B., Tang, L., Kuang, S., Wu, S., Ding, S.: Detecting camouflaged object in frequency domain. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 4504–4513 (2022) 4