pith. machine review for the scientific record

arxiv: 2605.12451 · v1 · submitted 2026-05-12 · 💻 cs.CV

Recognition: 2 Lean theorem links

FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation

Bryan A. Plummer, Deepti Ghadiyaram, Keanu Nichols, Nicholas Ikechukwu

Pith reviewed 2026-05-13 06:17 UTC · model grok-4.3

classification 💻 cs.CV
keywords continual panoptic segmentation · contrastive learning · future class discovery · unlabeled region grouping · representation repulsion · background prototype learning · new class adaptation

The pith

FuTCR identifies future-like regions in background pixels and uses contrast plus repulsion to reserve space for new classes in continual panoptic segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the core difficulty in continual panoptic segmentation: training images contain unlabeled objects that existing methods collapse into one background class, which later interferes with learning those objects as distinct categories. FuTCR instead scans the model's own predictions to find background pixels that carry non-background logit signals, groups them into candidate regions, and applies pixel-to-region contrast to form prototypes for those regions. At the same time it explicitly repels background features away from known-class prototypes. The result is a representation that has already carved out space before any new class arrives. Experiments on six different continual settings show this raises new-class panoptic quality by as much as 28 percent relative to prior methods while base-class performance stays the same or improves slightly.
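
To make the discovery step concrete, here is a minimal sketch of one way to implement it, assuming per-pixel logits with background at channel 0. The threshold `tau`, the connected-components grouping, and the `min_pixels` filter are illustrative assumptions, not the paper's released code; the paper groups model-predicted masks from a query-based head, which this pixel-level approximation stands in for.

```python
# Minimal sketch of the discovery step, not the authors' released code.
# `tau` (logit-gap threshold) and `min_pixels` are illustrative assumptions.
import numpy as np
from scipy import ndimage

def discover_future_regions(logits: np.ndarray, tau: float = 0.5,
                            min_pixels: int = 64) -> np.ndarray:
    """logits: (C+1, H, W) per-pixel scores; channel 0 is background."""
    pred = logits.argmax(axis=0)              # hard per-pixel prediction
    bg_logit = logits[0]                      # background score
    fg_logit = logits[1:].max(axis=0)         # strongest known-class score
    # background-classified pixels that still carry non-background signal
    candidate = (pred == 0) & (bg_logit - fg_logit < tau)
    # approximate the paper's mask grouping with connected components
    labeled, n = ndimage.label(candidate)
    for r in range(1, n + 1):                 # drop tiny, likely-noisy regions
        if (labeled == r).sum() < min_pixels:
            labeled[labeled == r] = 0
    return labeled                            # 0 = background, >0 = candidate region
```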

Core claim

FuTCR discovers confident future-like regions by grouping model-predicted masks whose pixels are consistently classified as background but exhibit non-background logits, builds coherent prototypes from these unlabeled regions via pixel-to-region contrast, and simultaneously repels background features from known-class prototypes to reserve representational space, thereby improving adaptation when new categories are introduced.

What carries the argument

Future-targeted contrastive and repulsive (FuTCR) mechanism that groups background pixels with non-background logits into prototypes and pushes background features away from existing class centers.
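
A hedged sketch of how these two objectives could be written, assuming normalized pixel features, an InfoNCE-style contrast over region prototypes, and a hinge-style repulsion. The temperature, margin, and the function name `contrast_and_repel` are assumptions, not the paper's exact losses.

```python
# Hedged sketch of the two objectives; temperature, margin, and shapes are
# assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def contrast_and_repel(feats, region_ids, known_protos, temp=0.1, margin=0.0):
    """feats: (N, D) pixel features; region_ids: (N,) discovered-region id
    per pixel (0 = plain background); known_protos: (K, D) class prototypes."""
    feats = F.normalize(feats, dim=1)
    regions = [r for r in region_ids.unique().tolist() if r != 0]
    losses = []
    if regions:
        # one prototype per discovered region: mean of its pixel features
        protos = F.normalize(torch.stack(
            [feats[region_ids == r].mean(0) for r in regions]), dim=1)
        # pixel-to-region contrast: attract pixels to their own region's
        # prototype, repel from the others (InfoNCE over regions)
        for i, r in enumerate(regions):
            sims = feats[region_ids == r] @ protos.T / temp       # (n_r, R)
            target = torch.full((sims.size(0),), i, dtype=torch.long,
                                device=feats.device)
            losses.append(F.cross_entropy(sims, target))
    l_contrast = torch.stack(losses).mean() if losses else feats.new_zeros(())
    # repulsion: push background features away from known-class prototypes
    bg = feats[region_ids == 0]
    l_repel = F.relu(bg @ F.normalize(known_protos, dim=1).T - margin).mean()
    return l_contrast, l_repel
```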

If this is right

  • New categories can be added with less interference from prior background training signals.
  • Base-class performance is preserved or slightly improved while new-class quality rises substantially.
  • The approach scales across different dataset sizes and multiple continual learning protocols.
  • Representation space is proactively prepared for unknown future objects instead of being overwritten.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same region-discovery plus repulsion pattern could be tested in continual semantic segmentation or instance segmentation without panoptic heads.
  • Varying the logit threshold used to flag future-like pixels would reveal how sensitive performance is to the discovery step; a toy sweep harness follows this list.
  • The method might reduce forgetting in other dense-prediction continual tasks where unlabeled content is common.
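
A toy harness for the threshold sweep flagged above. `discover` is any discovery routine (for example the sketch earlier in this page) and `evaluate_new_pq` is a hypothetical evaluation hook, not part of any released codebase.

```python
# Toy sensitivity sweep over the grouping threshold (the free parameter in
# the ledger below). `evaluate_new_pq` is a hypothetical evaluation hook.
def sweep_threshold(discover, evaluate_new_pq, logits, labels,
                    taus=(0.1, 0.25, 0.5, 1.0)):
    """discover: callable(logits, tau=...) -> region map, e.g. the sketch
    above; returns new-class PQ per threshold to expose sensitivity."""
    return {tau: evaluate_new_pq(discover(logits, tau=tau), labels)
            for tau in taus}
```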

Load-bearing premise

Grouping pixels that the model labels background yet assigns non-background logits will reliably produce coherent regions that match actual future classes rather than noise or misclassified known objects.

What would settle it

If ablating the future-region grouping step or the repulsion term produces no gain in new-class panoptic quality, or if the grouped regions show low overlap with ground-truth future objects across multiple datasets, the central claim would be falsified.
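
One concrete way to run the overlap half of that test, assuming a labeled region map from discovery and boolean ground-truth masks for classes that only arrive in later steps; names and shapes are illustrative.

```python
# Best-match IoU between each discovered region and ground-truth masks of
# classes that only arrive later; a low mean would undercut the premise.
import numpy as np

def region_future_iou(region_map, future_gt_masks):
    """region_map: (H, W) int labels (0 = background);
    future_gt_masks: list of (H, W) boolean ground-truth masks."""
    scores = []
    for r in np.unique(region_map):
        if r == 0:
            continue
        reg = region_map == r
        best = 0.0
        for gt in future_gt_masks:
            union = np.logical_or(reg, gt).sum()
            if union:
                best = max(best, np.logical_and(reg, gt).sum() / union)
        scores.append(best)
    return float(np.mean(scores)) if scores else 0.0
```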

Figures

Figures reproduced from arXiv: 2605.12451 by Bryan A. Plummer, Deepti Ghadiyaram, Keanu Nichols, Nicholas Ikechukwu.

Figure 1
Figure 1. From background noise to structured supervision. (a) Prior continual panoptic methods treat background regions as non-informative [21, 23, 29, 30], allowing future-class evidence to be absorbed into existing decision regions in feature space over time. (b) Our framework instead converts background activations into future-aware structural cues that organize scene composition before new labels arrive, reduci…
Figure 2
Figure 2. Overview of FuTCR: a query-based continual panoptic model produces dense region features and mask predictions. Our future-targeted module leverages unlabeled regions together with ground-truth labeled regions to perform unlabeled region discovery (Sec. 3.1), region-level contrastive learning (Sec. 3.2), and known-class repulsion (Sec. 3.3), encouraging structured representations for future categories. The final…
Figure 3
Figure 3. Qualitative comparison of segmentation produced by FuTCR and SimCIS [23].
Figure 5
Figure 5. Future-aware error dynamics. FuTCR reduces the fraction of future-class pixels that are misclassified as base classes across incremental steps compared to SimCIS [23], indicating less future–old class confusion at the base step. (Panels plot SemSeg mIoU and Panoptic PQ against incremental step for Ours (FuTCR) vs. SimCIS, broken out by old, new, and all classes.)
Figure 6
Figure 6. Left/middle: FuTCR consistently outperforms SimCIS [23] in mIoU and PQ on ADE20K 100–5, with especially pronounced gains on newly introduced classes and improved performance on previously learned classes. Right: cross-step prototype similarity for old classes, where FuTCR permits moderate drift (down to ≈ 0.89) instead of the baseline's near-rigid prototypes (> 0.97), and this adaptive evolution coincides …
Figure 7
Figure 7. Stability–plasticity trajectory of FuTCR versus SimCIS on ADE20K 100–5 (steps 2–11).
Figure 8
Figure 8. Additional qualitative comparisons between FuTCR and SimCIS [23].
read the original abstract

Continual Panoptic Segmentation (CPS) requires methods that can quickly adapt to new categories over time. The nature of this dense prediction task means that training images may contain a mix of labeled and unlabeled objects. As nothing is known about these unlabeled objects a priori, existing methods often simply group any unlabeled pixel into a single "background" class during training. In effect, during training, they repeatedly tell the model that all the different background categories are the same (even when they aren't). This makes learning to identify different background categories as they are added challenging since these new categories may require using information the model was previously told was unimportant and ignored. Thus, we propose a Future-Targeted Contrastive and Repulsive (FuTCR) framework that addresses this limitation by restructuring representations before new classes are introduced. FuTCR first discovers confident future-like regions by grouping model-predicted masks whose pixels are consistently classified as background but exhibit non-background logits. Next, FuTCR applies pixel-to-region contrast to build coherent prototypes from these unlabeled regions, while simultaneously repelling background features away from known-class prototypes to explicitly reserve representational space for future categories. Experiments across six CPS settings and a range of dataset sizes show FuTCR improves relative new-class panoptic quality over the state-of-the-art by up to 28%, while preserving or improving base-class performance with gains up to 4%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes FuTCR, a Future-Targeted Contrast and Repulsion framework for Continual Panoptic Segmentation. It first discovers future-like regions by grouping model-predicted masks classified as background yet exhibiting non-background logits, then applies pixel-to-region contrastive learning to form prototypes from these regions while repelling background features from known-class prototypes to reserve space for future categories. Experiments across six CPS settings and varying dataset sizes report relative improvements in new-class panoptic quality of up to 28% over state-of-the-art methods, with base-class performance preserved or improved by up to 4%.

Significance. If the gains are robust, the work would advance continual learning for dense prediction by proactively structuring representations around unknown future objects rather than collapsing them into a single background class. The approach could influence incremental segmentation pipelines in robotics and autonomous systems where new object categories emerge over time.

major comments (3)
  1. [§3.2] Region Discovery: The grouping of background-classified masks with elevated non-background logits is presented as reliably producing coherent future-like regions, yet no ablation, quantitative coherence metric, or visualization against ground-truth future objects is provided to rule out noise or over-grouping; this step is load-bearing for the subsequent contrastive and repulsive objectives and the reported 28% new-class gains.
  2. [§4] Experiments: The abstract and results claim consistent gains across six settings, but the manuscript omits exact baseline implementations, statistical significance tests, error bars or standard deviations over multiple runs, and isolated ablations of the region-grouping threshold and the contrast/repulsion terms; without these, the magnitude and reliability of the improvements cannot be verified.
  3. [§3.3] Repulsion Loss: The claim that repelling background features from known-class prototypes reserves space for future categories lacks supporting analysis of feature-space geometry, t-SNE visualizations, or a controlled study showing reduced interference with new-class learning; this mechanism is central to the base-class preservation result.
minor comments (3)
  1. [Abstract] Replace the vague 'up to 28%' and 'up to 4%' with the specific dataset/setting and baseline for each reported maximum.
  2. [§3.3] Notation: Explicitly define all symbols in the contrastive and repulsion loss equations in the main text rather than deferring to the supplement.
  3. [Figures] Add side-by-side comparisons of discovered future-like regions against ground-truth annotations in at least one figure to illustrate coherence.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough and constructive review. We address each major comment below and will incorporate the requested additions and clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: [§3.2] Region Discovery: The grouping of background-classified masks with elevated non-background logits is presented as reliably producing coherent future-like regions, yet no ablation, quantitative coherence metric, or visualization against ground-truth future objects is provided to rule out noise or over-grouping; this step is load-bearing for the subsequent contrastive and repulsive objectives and the reported 28% new-class gains.

    Authors: We agree that the region discovery step is central and would benefit from stronger empirical support. In the revision we will add an ablation on the non-background logit threshold used for grouping, a quantitative coherence metric (average IoU between discovered regions and ground-truth future-class pixels where available), and visualizations that overlay the grouped regions on future-object annotations to show they capture coherent structures rather than noise. revision: yes

  2. Referee: [§4] Experiments: The abstract and results claim consistent gains across six settings, but the manuscript omits exact baseline implementations, statistical significance tests, error bars or standard deviations over multiple runs, and isolated ablations of the region-grouping threshold and the contrast/repulsion terms; without these, the magnitude and reliability of the improvements cannot be verified.

    Authors: We acknowledge these omissions limit verifiability. In the revised manuscript we will (i) provide precise reproduction details for all baselines, (ii) report mean and standard deviation over three random seeds together with paired t-test significance results, and (iii) present isolated ablations that separately disable the region-grouping threshold, the contrast term, and the repulsion term to quantify each component's contribution (a sketch of the seed-paired protocol follows these responses). revision: yes

  3. Referee: [§3.3] Repulsion Loss: The claim that repelling background features from known-class prototypes reserves space for future categories lacks supporting analysis of feature-space geometry, t-SNE visualizations, or a controlled study showing reduced interference with new-class learning; this mechanism is central to the base-class preservation result.

    Authors: We agree that direct evidence for the geometric effect of the repulsion loss would strengthen the central claim. In the revision we will include t-SNE plots of feature embeddings before and after repulsion, quantitative measurements of the distance between background features and known-class prototypes (a distance-measurement sketch follows these responses), and a controlled ablation that isolates the repulsion term to demonstrate its role in preserving base-class performance while facilitating new-class learning. revision: yes
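
Two of the promised checks are small enough to sketch here. First, the seed-paired reporting protocol from response 2: means and standard deviations over seeds plus a paired t-test. This is an editorial illustration, not the authors' code; `compare_over_seeds` and its inputs are assumptions, and the per-seed PQ scores would come from actual runs.

```python
# Seed-paired comparison: mean, standard deviation, and a paired t-test.
# scipy.stats.ttest_rel pairs the two runs by seed.
import numpy as np
from scipy import stats

def compare_over_seeds(ours_pq, baseline_pq):
    """ours_pq, baseline_pq: per-seed PQ scores, paired by seed."""
    ours, base = np.asarray(ours_pq, float), np.asarray(baseline_pq, float)
    t, p = stats.ttest_rel(ours, base)
    return {"ours": (ours.mean(), ours.std(ddof=1)),
            "baseline": (base.mean(), base.std(ddof=1)),
            "t": float(t), "p": float(p)}
```

Second, the geometric measurement from response 3: mean cosine distance from background features to the nearest known-class prototype, computed before and after training with the repulsion term. Again a sketch under assumed tensor shapes, not the authors' analysis code.

```python
# Mean cosine distance from background features to the nearest known-class
# prototype; the repulsion term should drive this up. Shapes are assumed.
import torch
import torch.nn.functional as F

def bg_to_prototype_distance(bg_feats, known_protos):
    """bg_feats: (N, D) background pixel features; known_protos: (K, D)."""
    bg = F.normalize(bg_feats, dim=1)
    protos = F.normalize(known_protos, dim=1)
    nearest_sim = (bg @ protos.T).max(dim=1).values   # closest prototype
    return (1.0 - nearest_sim).mean().item()
```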

Circularity Check

0 steps flagged

No significant circularity; method is self-contained

full rationale

The paper defines FuTCR as a new framework that first groups model-predicted background masks with non-background logits to form future-like regions, then applies standard pixel-to-region contrastive and repulsive losses on those regions. No equations, parameters, or self-citations reduce the reported performance gains to quantities defined by construction within the same paper. The central steps rely on externally standard contrastive objectives applied to newly introduced region definitions, with no load-bearing self-citation chains or fitted-input renamings. The derivation chain is independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on the domain assumption that current-model non-background logits within background pixels can serve as reliable signals for future categories; the region-grouping threshold is the one free parameter, and its value is not stated in the abstract.

free parameters (1)
  • region grouping threshold
    Used to select masks with consistent background classification yet non-background logits; value not stated.
axioms (1)
  • domain assumption: Model logits on background pixels can indicate latent future classes
    Invoked when selecting confident future-like regions from background predictions.

pith-pipeline@v0.9.0 · 5563 in / 1180 out tokens · 40551 ms · 2026-05-13T06:17:22.265371+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages

  1. [1]

    Ferret: An efficient online continual learning framework under varying memory constraints

    Yuhao Zhou, Yuxin Tian, Jindi Lv, Mingjia Shi, Yuanxi Li, Qing Ye, Shuhao Zhang, and Jiancheng Lv. Ferret: An efficient online continual learning framework under varying memory constraints. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4850–4861, 2025

  2. [2]

    Do your best and get enough rest for continual learning

    Hankyul Kang, Gregor Seifer, Donghyun Lee, and Jongbin Ryu. Do your best and get enough rest for continual learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 10077–10086, 2025

  3. [3]

    End-to-end incremental learning

    Francisco M Castro, Manuel J Marín-Jiménez, Nicolás Guil, Cordelia Schmid, and Karteek Alahari. End-to-end incremental learning. In Proceedings of the European conference on computer vision (ECCV), pages 233–248, 2018

  4. [4]

    iCaRL: Incremental classifier and representation learning

    Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H Lampert. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 2001–2010, 2017

  5. [5]

    Riemannian walk for incremental learning: Understanding forgetting and intransigence

    Arslan Chaudhry, Puneet K Dokania, Thalaiyasingam Ajanthan, and Philip HS Torr. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Proceedings of the European conference on computer vision (ECCV), pages 532–547, 2018

  6. [6]

    Continual learning with deep generative replay

    Hanul Shin, Jung Kwon Lee, Jaehong Kim, and Jiwon Kim. Continual learning with deep generative replay. Advances in neural information processing systems, 30, 2017

  7. [7]

    Der: Dynamically expandable representation for class incremental learning

    Shipeng Yan, Jiangwei Xie, and Xuming He. Der: Dynamically expandable representation for class incremental learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3014–3023, 2021

  8. [8]

    Few-shot class-incremental learning

    Xiaoyu Tao, Xiaopeng Hong, Xinyuan Chang, Songlin Dong, Xing Wei, and Yihong Gong. Few-shot class-incremental learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12183–12192, 2020

  9. [9]

    Vector quantization prompting for continual learning

    Li Jiao, Qiuxia Lai, Yu Li, and Qiang Xu. Vector quantization prompting for continual learning. Advances in Neural Information Processing Systems, 37:34056–34076, 2024

  10. [10]

    Ordisco: Effective and efficient usage of incremental unlabeled data for semi-supervised continual learning

    Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, and Jun Zhu. Ordisco: Effective and efficient usage of incremental unlabeled data for semi-supervised continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5383–5392, 2021

  11. [11]

    PAC-bayes bounds for cumulative loss in continual learning

    Lior Friedman and Ron Meir. PAC-bayes bounds for cumulative loss in continual learning. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=hWw269fPov

  12. [12]

    Dynamic siamese expansion framework for improving robustness in online continual learning

    Fei Ye, Yulong Zhao, Qihe Liu, Junlin Chen, Adrian G. Bors, Jingling Sun, Rongyao Hu, and Shijie Zhou. Dynamic siamese expansion framework for improving robustness in online continual learning. In Advances in Neural Information Processing Systems, 2025

  13. [13]

    Exploiting task relationships in continual learning via transferability-aware task embeddings

    Yanru Wu, Jianning Wang, Xiangyu Chen, Enming Zhang, Yang Tan, Hanbing Liu, and Yang Li. Exploiting task relationships in continual learning via transferability-aware task embeddings. In Advances in Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=V8FnYzDX35

  14. [14]

    Anacp: Toward upper-bound continual learning via analytic contrastive projection

    Saleh Momeni, Changnan Xiao, and Bing Liu. Anacp: Toward upper-bound continual learning via analytic contrastive projection. In Advances in Neural Information Processing Systems,

  15. [15]

    URL https://openreview.net/forum?id=qQbvLU34F1

  16. [16]

    Hippotune: A hippocampal associative loop–inspired fine-tuning method for continual learning

    chenyanxi, Xiuxing Li, Han Yuyang, Zhuo Wang, Qing Li, Ziyu Li, Xiang Li, Chen Wei, and Xia Wu. Hippotune: A hippocampal associative loop–inspired fine-tuning method for continual learning. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=MtDiLnnYgm

  17. [17]

    ADAPT: Attentive self-distillation and dual-decoder prediction fusion for continual panoptic segmentation

    Ze Yang, Shichao Dong, Ruibo Li, Nan Song, and Guosheng Lin. ADAPT: Attentive self-distillation and dual-decoder prediction fusion for continual panoptic segmentation. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=HF1UmIVv6a

  18. [18]

    Beyond background shift: Rethinking instance replay in continual semantic segmentation

    Hongmei Yin, Tingliang Feng, Fan Lyu, Fanhua Shang, Hongying Liu, Wei Feng, and Liang Wan. Beyond background shift: Rethinking instance replay in continual semantic segmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 9839–9848, 2025

  19. [19]

    Falcon: Fairness learning via contrastive attention approach to continual semantic scene understanding

    Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj, Jackson Cothren, and Khoa Luu. Falcon: Fairness learning via contrastive attention approach to continual semantic scene understanding. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 15065–15075, 2025

  20. [20]

    Continual gaussian mixture distribution modeling for class incremental semantic segmentation

    Guilin Zhu, Runmin Wang, Yuanjie Shao, Weidong Yang, Nong Sang, and Changxin Gao. Continual gaussian mixture distribution modeling for class incremental semantic segmentation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=dtYKDOBkc7

  21. [21]

    Parameter release and knowledge reuse for class-incremental semantic segmentation, 2025

    Xinyue Zhang, Xu Zou, Liqun Chen, Jiahuan Zhou, Guodong Wang, Sheng Zhong, and Luxin Yan. Parameter release and knowledge reuse for class-incremental semantic segmentation, 2025. URL https://openreview.net/forum?id=9qbKOaF8YJ

  22. [22]

    Combo: Conflict mitigation via branched optimization for class incremental segmentation

    Kai Fang, Anqi Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, and Yunchao Wei. Combo: Conflict mitigation via branched optimization for class incremental segmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 25667–25676, 2025

  23. [23]

    Continual semantic segmentation with automatic memory sample selection

    Lanyun Zhu, Tianrun Chen, Jianxiong Yin, Simon See, and Jun Liu. Continual semantic segmentation with automatic memory sample selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3082–3092, 2023

  24. [24]

    Rethinking query-based transformer for continual image segmentation

    Yuchen Zhu, Cheng Shi, Dingyou Wang, Jiajin Tang, Zhengxuan Wei, Yu Wu, Guanbin Li, and Sibei Yang. Rethinking query-based transformer for continual image segmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4595–4606, 2025

  25. [25]

    Contrastive grouping with transformer for referring image segmentation

    Jiajin Tang, Ge Zheng, Cheng Shi, and Sibei Yang. Contrastive grouping with transformer for referring image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 23570–23580, 2023

  26. [26]

    The devil is in the object boundary: Towards annotation-free instance segmentation using foundation models

    Cheng Shi and Sibei Yang. The devil is in the object boundary: Towards annotation-free instance segmentation using foundation models. arXiv preprint arXiv:2404.11957, 2024

  27. [27]

    Coinseg: Contrast inter- and intra-class representations for incremental segmentation

    Zekang Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, and Yunchao Wei. Coinseg: Contrast inter- and intra-class representations for incremental segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 843–853, 2023

  28. [28]

    Towards continual universal segmentation

    Zihan Lin, Zilei Wang, and Xu Wang. Towards continual universal segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 29417–29427, June 2025

  29. [29]

    Eclipse: Efficient continual learning in panoptic segmentation with visual prompt tuning

    Beomyoung Kim, Joonsang Yu, and Sung Ju Hwang. Eclipse: Efficient continual learning in panoptic segmentation with visual prompt tuning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3346–3356, 2024

  30. [30]

    Comformer: Continual learning in semantic and panoptic segmentation

    Fabio Cermelli, Matthieu Cord, and Arthur Douillard. Comformer: Continual learning in semantic and panoptic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3010–3020, 2023

  31. [31]

    Strike a balance in continual panoptic segmentation

    Jinpeng Chen, Runmin Cong, Yuxuan Luo, Horace Ho Shing Ip, and Sam Kwong. Strike a balance in continual panoptic segmentation. In European Conference on Computer Vision, pages 126–142. Springer, 2024

  32. [32]

    Fixmatch: Simplifying semi-supervised learning with consistency and confidence

    Kihyuk Sohn, David Berthelot, Nicholas Carlini, Zizhao Zhang, Han Zhang, Colin A Raffel, Ekin Dogus Cubuk, Alexey Kurakin, and Chun-Liang Li. Fixmatch: Simplifying semi-supervised learning with consistency and confidence. Advances in neural information processing systems, 33:596–608, 2020

  33. [33]

    A simple framework for contrastive learning of visual representations

    Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020

  34. [34]

    Momentum contrast for unsupervised visual representation learning

    Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020

  35. [35]

    Classmix: Segmentation-based data augmentation for semi-supervised learning

    Viktor Olsson, Wilhelm Tranheden, Juliano Pinto, and Lennart Svensson. Classmix: Segmentation-based data augmentation for semi-supervised learning. In Proceedings of the IEEE/CVF winter conference on applications of computer vision, pages 1369–1378, 2021

  36. [36]

    Preparing the future for continual semantic segmentation

    Zihan Lin, Zilei Wang, and Yixin Zhang. Preparing the future for continual semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11910–11920, 2023

  37. [37]

    Continual panoptic perception: Towards multi-modal incremental interpretation of remote sensing images

    Bo Yuan, Danpei Zhao, Zhuoran Liu, Wentao Li, and Tian Li. Continual panoptic perception: Towards multi-modal incremental interpretation of remote sensing images. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 2117–2126, 2024

  38. [38]

    Ssul: Semantic segmentation with unknown label for exemplar-based class-incremental learning

    Sungmin Cha, YoungJoon Yoo, Taesup Moon, et al. Ssul: Semantic segmentation with unknown label for exemplar-based class-incremental learning. Advances in neural information processing systems, 34:10919–10930, 2021

  39. [39]

    Vista-clip: Visual incremental self-tuned adaptation for efficient continual panoptic segmentation

    D Manjunath, Shrikar Madhu, Aniruddh Sikdar, and Suresh Sundaram. Vista-clip: Visual incremental self-tuned adaptation for efficient continual panoptic segmentation. In 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 6557–

  40. [40]

    Catastrophic forgetting in connectionist networks

    Robert M French. Catastrophic forgetting in connectionist networks. Trends in cognitive sciences, 3(4):128–135, 1999

  41. [41]

    Catastrophic forgetting, rehearsal and pseudorehearsal

    Anthony Robins. Catastrophic forgetting, rehearsal and pseudorehearsal. Connection Science, 7(2):123–146, 1995

  42. [42]

    Lifelong learning algorithms

    Sebastian Thrun. Lifelong learning algorithms. In Learning to learn, pages 181–209. Springer, 1998

  43. [43]

    A survey on continual semantic segmentation: Theory, challenge, method and application

    Bo Yuan and Danpei Zhao. A survey on continual semantic segmentation: Theory, challenge, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024

  44. [44]

    Mitigating background shift in class-incremental semantic segmentation

    Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, and Jae-Pil Heo. Mitigating background shift in class-incremental semantic segmentation. In European Conference on Computer Vision, pages 71–88. Springer, 2024

  45. [45]

    Modeling the background for incremental learning in semantic segmentation

    Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulo, Elisa Ricci, and Barbara Caputo. Modeling the background for incremental learning in semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9233–9242, 2020

  46. [46]

    Principles of Catastrophic Forgetting for Continual Semantic Segmentation in Automated Driving

    Tobias Michael Kalb. Principles of Catastrophic Forgetting for Continual Semantic Segmentation in Automated Driving. KIT Scientific Publishing, 2024

  47. [47]

    Continual semantic segmentation via structure preserving and projected feature alignment

    Zihan Lin, Zilei Wang, and Yixin Zhang. Continual semantic segmentation via structure preserving and projected feature alignment. In European Conference on Computer Vision, pages 345–361. Springer, 2022

  48. [48]

    Scale-hybrid group distillation with knowledge disentangling for continual semantic segmentation

    Zichen Song, Xiaoliang Zhang, and Zhaofeng Shi. Scale-hybrid group distillation with knowledge disentangling for continual semantic segmentation. Sensors, 23(18):7820, 2023

  49. [49]

    Bacs: Background aware continual semantic segmentation

    Mostafa ElAraby, Ali Harakeh, and Liam Paull. Bacs: Background aware continual semantic segmentation. arXiv preprint arXiv:2404.13148, 2024

  50. [50]

    Trace back and go ahead: Completing partial annotation for continual semantic segmentation

    Yuxuan Luo, Jinpeng Chen, Runmin Cong, Horace Ho Shing Ip, and Sam Kwong. Trace back and go ahead: Completing partial annotation for continual semantic segmentation. Pattern Recognition, 165:111613, 2025

  51. [51]

    Exemplar-based open-set panoptic segmentation network

    Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, and Bohyung Han. Exemplar-based open-set panoptic segmentation network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1175–1184, 2021

  52. [52]

    Revisiting open-set panoptic segmentation

    Yufei Yin, Hao Chen, Wengang Zhou, Jiajun Deng, Haiming Xu, and Houqiang Li. Revisiting open-set panoptic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 6747–6754, 2024

  53. [53]

    Region-aware metric learning for open world semantic segmentation via meta-channel aggregation

    Hexin Dong, Zifan Chen, Mingze Yuan, Yutong Xie, Jie Zhao, Fei Yu, Bin Dong, and Li Zhang. Region-aware metric learning for open world semantic segmentation via meta-channel aggregation. arXiv preprint arXiv:2205.08083, 2022

  54. [54]

    Dual decision improves open-set panoptic segmentation

    Hai-Ming Xu, Hao Chen, Lingqiao Liu, and Yufei Yin. Dual decision improves open-set panoptic segmentation. arXiv preprint arXiv:2207.02504, 2022

  55. [55]

    Panoptic segmentation

    Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, and Piotr Dollár. Panoptic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9404–9413, 2019

  56. [56]

    Contrastive attraction and contrastive repulsion for representation learning

    Huangjie Zheng, Xu Chen, Jiangchao Yao, Hongxia Yang, Chunyuan Li, Ya Zhang, Hao Zhang, Ivor Tsang, Jingren Zhou, and Mingyuan Zhou. Contrastive attraction and contrastive repulsion for representation learning. arXiv preprint arXiv:2105.03746, 2021

  57. [57]

    Attraction-repulsion spectrum in neighbor embeddings

    Jan Niklas Böhm, Philipp Berens, and Dmitry Kobak. Attraction-repulsion spectrum in neighbor embeddings. Journal of Machine Learning Research, 23(95):1–32, 2022

  58. [58]

    Segmentation with pairwise attraction and repulsion

    Stella X. Yu and Jianbo Shi. Segmentation with pairwise attraction and repulsion. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, volume 1, pages 52–58. IEEE, 2001

  59. [59]

    Scene parsing through ade20k dataset

    Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, and Antonio Torralba. Scene parsing through ade20k dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017

  60. [60]

    Masked-attention mask transformer for universal image segmentation

    Bowen Cheng, Ishan Misra, Alexander G Schwing, Alexander Kirillov, and Rohit Girdhar. Masked-attention mask transformer for universal image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1290–1299, 2022

  61. [61]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016

  62. [62]

    Aux. cls

    The two panels depict diverse scenes where FuTCR recovers more accurate panoptic masks, particularly on newly introduced classes. The auxiliary classification head adds a balance term: $\mathcal{L}_{\mathrm{aux}} = \frac{1}{|R_{\mathrm{fut}}|} \sum_{r} \mathrm{CE}(g_r, \ell_r) + \lambda_{\mathrm{bal}}\, \mathrm{KL}(\bar{p} \,\|\, u)$ (5), where $\bar{p}$ is the mean predicted distribution over clusters and $u$ is the uniform distribution. This head is intended to encourage diverse usage of latent slot...