pith. machine review for the scientific record. sign in

arxiv: 2605.11904 · v1 · submitted 2026-05-12 · 💻 cs.CV · cs.AI

Recognition: unknown

Beyond Point-wise Neural Collapse: A Topology-Aware Hierarchical Classifier for Class-Incremental Learning

Baile Xu, Dunwei Tu, Furao Shen, Huiyu Yi, Zhicheng Wang, Zhiming Xu

Authors on Pith no claims yet

Pith reviewed 2026-05-13 06:30 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords class-incremental learningneural collapsemanifold topologycatastrophic forgettinghierarchical clusteringfeature driftcontinual learning
0
0 comments X

The pith

A hierarchical topology-aware classifier outperforms point-based nearest-mean methods in class-incremental learning by modeling feature manifolds and tracking non-linear drift.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard nearest class mean classifiers in class-incremental learning rely on neural collapse theory that treats features as single points, yet non-linear drift in practice spreads classes into complex manifolds. The paper introduces HC-SOINN to build local-to-global topological representations of these manifolds and STAR to deform the structure via pointwise residual tracking so it follows the actual drift. Replacing the classifier in existing continual learning methods produces consistent gains across seven baselines, showing that preserving manifold geometry can limit forgetting better than point approximations. A reader would care because the approach directly tackles the mismatch between idealized collapse assumptions and real incremental data behavior.

Core claim

Class features in incremental settings form complex manifolds rather than collapsed points, so a point-wise nearest class mean classifier is suboptimal. HC-SOINN captures the manifold topology through hierarchical clustering in a local-to-global manner, while STAR uses fine-grained residual trajectories to actively deform that topology and maintain alignment under non-linear feature drift. Procrustes distance analysis and integration experiments confirm that the resulting structure remains stable and improves performance when swapped into state-of-the-art class-incremental pipelines.

What carries the argument

HC-SOINN (Hierarchical-Cluster SOINN), a self-organizing network that builds hierarchical local-to-global topology representations of class manifolds, paired with the STAR residual-based deformation mechanism that tracks pointwise trajectories to adapt the topology to drift.

If this is right

  • Replacing the classifier in any existing CIL method with HC-SOINN plus STAR yields measurable accuracy gains while preserving resistance to forgetting.
  • The topology representation remains resilient to manifold deformations as measured by Procrustes distance between successive feature distributions.
  • The local-to-global hierarchy allows the model to handle both fine-grained within-class structure and global separation across classes added over time.
  • STAR's pointwise residual tracking enables precise adaptation to non-linear drift without requiring full retraining of earlier classes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hierarchical tracking could be tested in non-classification continual settings such as incremental object detection where spatial manifolds also drift.
  • Residual deformation might combine with explicit drift detectors to trigger updates only when manifold misalignment exceeds a threshold.
  • If the topology model generalizes, it could reduce reliance on replay buffers by letting the classifier itself encode distribution history.

Load-bearing premise

Class features reliably form complex manifolds that hierarchical clustering can capture and that residual-based deformation can realign without creating new instabilities or extra forgetting.

What would settle it

A controlled CIL experiment on a dataset engineered for extreme non-linear manifold drift in which HC-SOINN plus STAR produces lower accuracy or higher forgetting than a standard nearest class mean classifier.

Figures

Figures reproduced from arXiv: 2605.11904 by Baile Xu, Dunwei Tu, Furao Shen, Huiyu Yi, Zhicheng Wang, Zhiming Xu.

Figure 1
Figure 1. Figure 1: Evolution of the Average Procrustes Distance (d (t) P ) on the initial classes across incremental tasks. The red dashed line (y = 0.1) indicates the empirical threshold for quasi-linear drift. training causes the feature space to deviate fundamentally from the initial structure. Consequently, class distributions evolve into complex manifolds rather than maintaining the compact structure predicted by Neural… view at source ↗
Figure 2
Figure 2. Figure 2: Overall pipeline of HC-SOINN and STAR. The method first constructs a topology-aware hierarchical classifier on fixed feature embeddings to model complex class manifolds that arise during class-incremental learning. It then employs the STAR mechanism to actively adapt this topology to non-linear feature drift via pointwise tracking, enabling dual-view inference without retraining. inter-cluster pairs: d(Ci … view at source ↗
Figure 3
Figure 3. Figure 3: t-SNE visualization of the feature distributions for the initial 10 classes. Small dots represent query samples, large circles denote HC-SOINN topological nodes, and ‘×’ marks represent NCM prototypes. (a) At Task 1, HC-SOINN nodes perfectly capture the initial class manifolds. (b) By Task 6, without active alignment, the standard HC-SOINN nodes fail to cover the drifted features (highlighted by the spatia… view at source ↗
Figure 4
Figure 4. Figure 4: Comprehensive hyperparameter sensitivity analysis on Split CIFAR-100 (integrated with CODA-Prompt). The results demonstrate that our framework maintains highly stable average accuracy (AAvg) and last-task accuracy (ALast) across a wide range of values. 20 [PITH_FULL_IMAGE:figures/full_fig_p020_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Comprehensive hyperparameter sensitivity analysis on Split ImageNet-R. Parameters (a)-(d) are evaluated with SimpleCIL, while (e) is evaluated with DualPrompt. Similar to CIFAR-100, the performance remains remarkably robust across diverse configurations. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_5.png] view at source ↗
read the original abstract

The Nearest Class Mean (NCM) classifier is widely favored in Class-Incremental Learning (CIL) for its superior resistance to catastrophic forgetting compared to Fully Connected layers. While Neural Collapse (NC) theory supports NCM's optimality by assuming features collapse into single points, non-linear feature drift and insufficient training in CIL often prevent this ideal state. Consequently, classes manifest as complex manifolds rather than collapsed points, rendering the single-point NCM suboptimal. To address this, we propose Hierarchical-Cluster SOINN (HC-SOINN), a novel classifier that captures the topological structure of these manifolds via a ``local-to-global'' representation. Furthermore, we introduce Structure-Topology Alignment via Residuals (STAR) method, which employs a fine-grained pointwise trajectory tracking mechanism to actively deform the learned topology, allowing it to adapt precisely to complex non-linear feature drift. Theoretical analysis and Procrustes distance experiments validate our framework's resilience to manifold deformations. We integrated HC-SOINN into seven state-of-the-art methods by replacing their original classifiers, achieving consistent improvements that highlight the effectiveness and robustness of our approach. Code is available at https://github.com/yhyet/HC_SOINN.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript argues that in class-incremental learning, non-linear feature drift prevents the ideal point-wise collapse assumed by Neural Collapse theory, so classes form complex manifolds and Nearest Class Mean classifiers become suboptimal. It introduces Hierarchical-Cluster SOINN (HC-SOINN) to capture manifold topology through a local-to-global hierarchical representation and the Structure-Topology Alignment via Residuals (STAR) method, which uses fine-grained pointwise trajectory tracking to deform the learned topology and adapt to drift. Theoretical analysis and Procrustes distance experiments are claimed to validate resilience to deformations, and replacing the classifier in seven existing CIL methods yields consistent empirical gains. Code is released at the provided GitHub link.

Significance. If the central claims hold, the work would offer a practical topology-aware alternative to point-wise classifiers in CIL, potentially improving robustness to non-linear drift while preserving the forgetting resistance that motivates NCM. The explicit code release supports reproducibility and allows direct integration testing.

major comments (3)
  1. [Abstract / §3] Abstract and §3 (STAR description): the claim that residual-based pointwise trajectory tracking 'actively deforms the learned topology' without introducing instabilities or accelerating forgetting is load-bearing for the no-forgetting guarantee, yet no explicit stability analysis, curvature-mismatch bound, or ablation on post-deformation forgetting metrics is supplied; the skeptic concern that residuals may overfit noise rather than true drift therefore remains unaddressed.
  2. [§4] §4 (theoretical analysis): the abstract asserts that 'theoretical analysis' validates resilience to manifold deformations, but no equations, proof sketches, or formal statements appear; without these it is impossible to verify whether the local-to-global structure is preserved under the STAR deformation operator.
  3. [Table 1 / §5.1] Table 1 / §5.1 (integration experiments): while consistent improvements are reported when HC-SOINN+STAR replaces the original classifier in seven SOTA methods, the paper does not isolate the contribution of the residual deformation step versus the hierarchical clustering alone, leaving open whether the topology-alignment mechanism is the source of the gains.
minor comments (2)
  1. [§3] Notation for the hierarchical clustering levels and residual vectors should be defined once in a dedicated notation table or subsection to avoid repeated inline definitions.
  2. [§5.2] The Procrustes distance plots in §5.2 would benefit from error bars or multiple random seeds to demonstrate statistical reliability of the reported alignment resilience.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will incorporate revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract / §3] Abstract and §3 (STAR description): the claim that residual-based pointwise trajectory tracking 'actively deforms the learned topology' without introducing instabilities or accelerating forgetting is load-bearing for the no-forgetting guarantee, yet no explicit stability analysis, curvature-mismatch bound, or ablation on post-deformation forgetting metrics is supplied; the skeptic concern that residuals may overfit noise rather than true drift therefore remains unaddressed.

    Authors: We agree that the current version lacks explicit stability analysis to support the deformation claim. In the revision we will expand §3 with a formal bound on residual-induced deformation error (under the assumption of bounded non-linear drift) and add an ablation reporting forgetting metrics before and after STAR application, directly addressing the concern that residuals may fit noise. revision: yes

  2. Referee: [§4] §4 (theoretical analysis): the abstract asserts that 'theoretical analysis' validates resilience to manifold deformations, but no equations, proof sketches, or formal statements appear; without these it is impossible to verify whether the local-to-global structure is preserved under the STAR deformation operator.

    Authors: We acknowledge the omission of explicit theoretical content in §4. We will add a new subsection containing the formal definition of the STAR deformation operator, the relevant equations, and a proof sketch demonstrating preservation of the local-to-global hierarchical structure under bounded residuals. revision: yes

  3. Referee: [Table 1 / §5.1] Table 1 / §5.1 (integration experiments): while consistent improvements are reported when HC-SOINN+STAR replaces the original classifier in seven SOTA methods, the paper does not isolate the contribution of the residual deformation step versus the hierarchical clustering alone, leaving open whether the topology-alignment mechanism is the source of the gains.

    Authors: We agree that component isolation is necessary. We will add an ablation study in §5.1 that compares HC-SOINN (hierarchical clustering only) against the full HC-SOINN+STAR on the same benchmarks, thereby quantifying the incremental benefit of the residual-based topology alignment. revision: yes

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated. The framework implicitly relies on the existence of recoverable topological structure in feature manifolds and on the effectiveness of residual-based deformation, but these are not formalized here.

pith-pipeline@v0.9.0 · 5530 in / 1091 out tokens · 52947 ms · 2026-05-13T06:30:17.658485+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

  1. [1]

    Neural Networks , volume=

    A comprehensive study of class incremental learning algorithms for visual tasks , author=. Neural Networks , volume=. 2021 , publisher=

  2. [2]

    arXiv preprint arXiv:2401.16386 , year=

    Continual learning with pre-trained models: A survey , author=. arXiv preprint arXiv:2401.16386 , year=

  3. [3]

    , title =

    Rebuffi, Sylvestre-Alvise and Kolesnikov, Alexander and Sperl, Georg and Lampert, Christoph H. , title =. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , month =

  4. [4]

    Proceedings of the National Academy of Sciences , volume=

    Prevalence of neural collapse during the terminal phase of deep learning training , author=. Proceedings of the National Academy of Sciences , volume=. 2020 , publisher=

  5. [5]

    Neural Networks , pages=

    Embedding Space Allocation with Angle-Norm Joint Classifiers for few-shot class-incremental learning , author=. Neural Networks , pages=. 2025 , publisher=

  6. [6]

    2024 International Joint Conference on Neural Networks (IJCNN) , pages=

    Few-Shot Class-Incremental Learning with Class Centers and Contrastive Learning for Incremental Vehicle Recognition , author=. 2024 International Joint Conference on Neural Networks (IJCNN) , pages=. 2024 , organization=

  7. [7]

    International Journal of Computer Vision , volume=

    Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need , author=. International Journal of Computer Vision , volume=. 2025 , publisher=

  8. [8]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

    Expandable subspace ensemble for pre-trained model-based class-incremental learning , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

  9. [9]

    Neural Networks , volume=

    An enhanced self-organizing incremental neural network for online unsupervised learning , author=. Neural Networks , volume=. 2007 , publisher=

  10. [10]

    2025 , publisher=

    Pilot: A pre-trained model-based continual learning toolbox , author=. 2025 , publisher=

  11. [11]

    Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

    Kac: Kolmogorov-arnold classifier for continual learning , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

  12. [12]

    Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

    CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

  13. [13]

    Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

    Self-expansion of pre-trained models with mixture of adapters for continual learning , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

  14. [14]

    Neural networks , volume=

    An incremental network for on-line unsupervised classification and topology learning , author=. Neural networks , volume=. 2006 , publisher=

  15. [15]

    European Conference on Computer Vision , pages=

    Dualprompt: Complementary prompting for rehearsal-free continual learning , author=. European Conference on Computer Vision , pages=. 2022 , organization=

  16. [16]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

    Coda-prompt: Continual decomposed attention-based prompting for rehearsal-free continual learning , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

  17. [17]

    arXiv preprint arXiv:2302.03004 , year=

    Neural collapse inspired feature-classifier alignment for few-shot class incremental learning , author=. arXiv preprint arXiv:2302.03004 , year=

  18. [18]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Semantic drift compensation for class-incremental learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  19. [19]

    International conference on machine learning , pages=

    Infinite mixture prototypes for few-shot learning , author=. International conference on machine learning , pages=. 2019 , organization=

  20. [20]

    European Conference on Computer Vision , pages=

    Exemplar-free continual representation learning via learnable drift compensation , author=. European Conference on Computer Vision , pages=. 2024 , organization=

  21. [21]

    Machine learning , volume=

    Local procrustes for manifold embedding: a measure of embedding quality and embedding algorithms , author=. Machine learning , volume=. 2009 , publisher=

  22. [22]

    Proceedings of the 33rd ACM International Conference on Multimedia , pages=

    Multiple Queries with Multiple Keys: A Precise Prompt Matching Paradigm for Prompt-based Continual Learning , author=. Proceedings of the 33rd ACM International Conference on Multimedia , pages=

  23. [23]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Few-shot class-incremental learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  24. [24]

    Neural networks , volume=

    A fast nearest neighbor classifier based on self-organizing incremental neural network , author=. Neural networks , volume=. 2008 , publisher=

  25. [25]

    Psychology of learning and motivation , volume=

    Catastrophic interference in connectionist networks: The sequential learning problem , author=. Psychology of learning and motivation , volume=. 1989 , publisher=

  26. [26]

    2009 , publisher=

    Learning multiple layers of features from tiny images , author=. 2009 , publisher=

  27. [27]

    Proceedings of the IEEE/CVF international conference on computer vision , pages=

    The many faces of robustness: A critical analysis of out-of-distribution generalization , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=

  28. [28]

    2011 , publisher=

    The caltech-ucsd birds-200-2011 dataset , author=. 2011 , publisher=

  29. [29]

    neural-gas

    A "neural-gas" network learns topologies , author=. 1991 , publisher=

  30. [30]

    Proceedings of the 12th annual conference on Computer graphics and interactive techniques , pages=

    Animating rotation with quaternion curves , author=. Proceedings of the 12th annual conference on Computer graphics and interactive techniques , pages=

  31. [31]

    Psychometrika , volume=

    Hierarchical clustering schemes , author=. Psychometrika , volume=. 1967 , publisher=

  32. [32]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Der: Dynamically expandable representation for class incremental learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  33. [33]

    European conference on computer vision , pages=

    Foster: Feature boosting and compression for class-incremental learning , author=. European conference on computer vision , pages=. 2022 , organization=

  34. [34]

    arXiv preprint arXiv:2205.13218 (2022)

    A model or 603 exemplars: Towards memory-efficient class-incremental learning , author=. arXiv preprint arXiv:2205.13218 , year=

  35. [35]

    arXiv preprint arXiv:1908.01091 , year=

    Toward understanding catastrophic forgetting in continual learning , author=. arXiv preprint arXiv:1908.01091 , year=

  36. [36]

    2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) , pages=

    Advancing Ultrasound Medical Continuous Learning with Task-Specific Generalization and Adaptability , author=. 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) , pages=. 2024 , organization=