Continual Learning of Domain-Invariant Representations

Pascal Janetzky; Stefan Feuerriegel; Tobias Schlagenhauf

arXiv: 2605.15775 · v1 · pith:OE3OFVEJnew · submitted 2026-05-15 · 💻 cs.LG

Pith reviewed 2026-05-20 21:12 UTC · model grok-4.3

classification 💻 cs.LG

keywords continual learning · domain-invariant representations · invariant structures · shortcut learning · replay-based training · out-of-domain generalization

The pith

Continual learning methods that combine replay with sequential invariance alignment learn and preserve domain-invariant representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that continual learning can be extended to capture domain-invariant structures by pairing replay-based training with a tailored sequential invariance alignment step. This matters to a sympathetic reader because standard continual learning optimizes for in-domain accuracy and therefore picks up spurious domain-specific cues that hurt performance once the model is deployed on new domains. The authors evaluate the resulting methods under a protocol that measures generalization to previously unseen target domains and report consistent gains over existing continual learning baselines across six datasets drawn from vision, medicine, manufacturing, and ecology.
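The deployment-oriented protocol described above can be illustrated with a toy harness: source domains are consumed one at a time, and accuracy is measured only on a target domain that was never seen during training. This is a minimal sketch, not the paper's evaluation code. The nearest-centroid learner, the function names, and the one-dimensional data are all hypothetical and chosen only to keep the example self-contained.

```python
def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to the scalar feature x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def run_protocol(source_domains, target_domain):
    """Train on source domains in arrival order, then score on the unseen target."""
    sums, counts = {}, {}
    for domain in source_domains:          # domains arrive sequentially
        for x, y in domain:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    correct = sum(
        nearest_centroid_predict(centroids, x) == y for x, y in target_domain
    )
    return correct / len(target_domain)


# Hypothetical toy data: each sample is (feature, label).
sources = [
    [(0.0, 0), (1.0, 1)],   # first source domain
    [(0.2, 0), (1.2, 1)],   # second source domain, slightly shifted
]
# The target domain is held out entirely; one class-0 sample is shifted
# far enough that the toy learner misclassifies it.
target = [(0.4, 0), (1.4, 1), (0.9, 1), (0.7, 0)]

target_accuracy = run_protocol(sources, target)
print(f"target-domain accuracy: {target_accuracy:.2f}")
```

The point of the design is that `target` never enters the training loop, so the reported number reflects out-of-domain behavior rather than retention on previously seen domains.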

Core claim

A broad class of continual learning methods can sequentially learn representations that capture invariant structures across domains by combining replay-based training with tailored sequential invariance alignment; these invariants are motivated by the idea that they preserve underlying causal mechanisms and thereby reduce overfitting to domain-specific shortcuts, yielding improved generalization to unseen target domains after deployment.

What carries the argument

Replay-based training paired with a tailored sequential invariance alignment that learns and preserves invariant structures over successive domains.
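To make the mechanism more concrete, here is a minimal sketch of one way replayed domains could enter an invariance penalty. It uses a risk-variance penalty in the spirit of V-REx (average per-domain risk plus a weighted variance of those risks). This is an illustrative stand-in and not the authors' sequential alignment step; the function name, the weight `lam`, and the example risk values are assumptions made for illustration.

```python
from statistics import mean, pvariance


def replay_invariance_objective(current_risk, replayed_risks, lam=1.0):
    """Average risk over all domains plus lam times the variance of the risks.

    current_risk: loss on the domain currently being learned.
    replayed_risks: losses on earlier domains, computed from replay-buffer samples.
    lam: hypothetical weight of the invariance penalty.
    """
    risks = [current_risk] + list(replayed_risks)
    return mean(risks) + lam * pvariance(risks)


# Equal per-domain risks incur no penalty.
equal = replay_invariance_objective(0.5, [0.5, 0.5])

# Same average risk, but uneven across domains, so the penalty is positive.
skewed = replay_invariance_objective(0.2, [0.5, 0.8])

print(equal, skewed)
```

With the same average risk, the uneven case scores worse, which is the behavior a risk-based invariance term is meant to encourage: the model is pushed toward solutions that work equally well on the current domain and on the replayed ones.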

If this is right

The methods outperform existing continual learning baselines on generalization to unseen target domains.
Naive sequential extensions of existing domain-invariant representation learning techniques yield only limited benefits.
The approach applies across vision, medicine, manufacturing, and ecology tasks.
It mitigates shortcut learning by focusing on structures that are stable across domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same replay-plus-alignment pattern could be tested in settings where domain order is not fixed in advance.
It suggests that explicitly protecting causal mechanisms during continual updates may improve robustness when environments change gradually.
The deployment-oriented evaluation protocol could be applied to other continual learning problems that currently measure only in-domain accuracy.

Load-bearing premise

Invariant structures across domains often preserve the underlying causal mechanisms and thereby reduce overfitting to domain-specific cues.

What would settle it

A controlled experiment on a dataset in which domain shifts do not preserve causal mechanisms and the proposed methods fail to outperform standard replay baselines on unseen target domains.

Figures

Figures reproduced from arXiv: 2605.15775 by Pascal Janetzky, Stefan Feuerriegel, Tobias Schlagenhauf.

**Figure 1.** Deployment-centric setup of CL. (a): Training domains (shown in different colors) arrive sequentially; upon deployment, the model is evaluated on an arbitrary target domain. (b): Standard CL methods are prone to learning spurious, domain-specific cues (e.g., the color) and thus fail to classify data from unseen domains at deployment time. (c): Our methods learn domain-invariant representation (e.g., the sh…

**Figure 3.** Reduced buffer sizes. Average results for 50 % (left) and 25 % (right) memory capacity, providing less domain information. Our methods can nonetheless learn domain-invariant representations and outperform strong replay-baselines.

**Figure 4.** Improvement from ⋆-CL over Naïve-CL. Our proposed methods outperform the naïve extensions across all underlying methodologies for computing domain-invariant representations. Same datasets as in …

**Figure 5.** Average per-step runtimes over time. We plot the average per-step runtime for methods COPE, STAR, SARL, ⋆-CL-Fishr and ⋆-CL-CORAL across datasets RotatedMNIST (left), CIFAR10C (middle), and TinyImageNetC (right). Axes: global training step (x), step time in seconds (y).

**Figure 6.** Average per-step runtimes over time. We plot the average per-step runtime for methods COPE, STAR, SARL, ⋆-CL-Fishr and ⋆-CL-CORAL across datasets RotatedMNIST (left), CIFAR10C (middle), and TinyImageNetC (right).

**Figure 7.** Average per-step runtimes over time (extended datasets). We plot the average per-step runtime for methods COPE, STAR, SARL, ⋆-CL-Fishr and ⋆-CL-CORAL across extended datasets (15 source domains) RotatedMNISTExtended (left), WM811KExtended (middle), and Camelyon17Extended (right).

**Figure 8.** Hyperparameter sensitivity plots (I). We ablate the effect of varying λ and β parameters for ⋆-CL-VREX (left) and ⋆-CL-Fishr (right).

**Figure 9.** Hyperparameter sensitivity plots (II). We ablate the effect of varying λ and β parameters for ⋆-CL-CORAL (left) and ⋆-CL-MMD (right).

**Figure 10.** Feature space visualizations (RotatedMNIST). We visualize the feature space of Finetune (left), COPE (middle), and ⋆-CL-CORAL (right) using UMAP (McInnes et al., 2018). Colored numbers denote cluster centroids for the respective class.

**Figure 11.** Feature space visualizations (Camelyon17). We visualize the feature space of Finetune (left) and ⋆-CL-CORAL (right) using UMAP (McInnes et al., 2018). Colored numbers denote cluster centroids for the respective class.

Original abstract

Continual learning (CL) aims to train models sequentially over multiple domains without forgetting previously learned knowledge. However, existing CL methods optimize for in-domain performance and are therefore prone to learning spurious, domain-specific cues ("shortcut learning"), which limits generalization to unseen domains after deployment. In this paper, we address this limitation through continual learning of domain-invariant representations. We introduce a broad class of CL methods that sequentially learn representations capturing invariant structures across domains. Our methods are motivated by the observation that such invariant structures often preserve the underlying causal mechanisms, which can reduce the risk of overfitting to domain-specific cues and thus offer better out-of-domain generalization. Our proposed CL methods combine replay-based training with a tailored sequential invariance alignment to learn, and preserve, invariant structures over time. We evaluate our methods under a deployment-oriented protocol that measures performance on unseen target domains. Across six benchmark and real-world datasets spanning vision, medicine, manufacturing, and ecology, our methods consistently outperform existing CL baselines in terms of generalization to unseen target domains. As an ablation, we further show that naïve extensions of sequential training with existing domain-invariant representation learning (DIRL) methods provide only limited benefits. To the best of our knowledge, this is the first work to develop domain-invariant representation methods for CL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper adds sequential invariance alignment to replay-based continual learning and reports better out-of-domain generalization on six datasets, but the causal-mechanism story stays untested.

The core contribution is a set of continual learning methods that combine replay with a tailored sequential alignment step to keep representations invariant across domains seen so far. The authors show that simply extending existing domain-invariant representation learning techniques to the continual setting gives only modest gains, which suggests the sequential preservation step is doing real work. They evaluate under a protocol that measures performance on completely unseen target domains after training finishes, and they include datasets from medicine, manufacturing, and ecology in addition to standard vision benchmarks. That deployment-oriented framing is useful and the ablation is a clear plus. The results are presented as consistent outperformance, which is the kind of evidence that matters for practical robustness claims.

The main limitation is that the motivation rests on the idea that invariant structures capture causal mechanisms and therefore reduce shortcut learning, yet the experiments only report accuracy lifts on target domains. There are no additional checks, such as probing for known causal factors or comparing against non-causal invariants, to confirm that the performance edge comes from the intended property rather than extra regularization or other side effects. The abstract also leaves out statistical tests, exact baseline details, and hyperparameter search procedures, so the strength of the empirical support is hard to judge from what is shown.

This work sits at the intersection of continual learning and domain generalization. Readers already working on either topic will find the framing and the negative result on naive extensions worth their time. The problem it targets is real for any setting where models must keep working after deployment under shifting conditions.

I would send it to peer review. The experimental protocol and the ablation give enough substance to justify referee attention, even if the causal link needs tighter verification in revision.

Referee Report

2 major / 2 minor

Summary. The paper proposes a class of continual learning (CL) methods that combine replay-based training with sequential invariance alignment to learn and preserve domain-invariant representations over time. Motivated by the idea that such invariants often capture causal mechanisms and reduce shortcut learning, the methods are evaluated under a deployment-oriented protocol measuring generalization to unseen target domains. Across six benchmark and real-world datasets in vision, medicine, manufacturing, and ecology, the proposed methods are reported to consistently outperform existing CL baselines; an ablation shows that naive sequential extensions of prior domain-invariant representation learning (DIRL) methods yield only limited gains. This is positioned as the first work developing DIRL methods specifically for CL.

Significance. If the empirical claims hold under fuller scrutiny, the work is significant for bridging continual learning with domain-invariant representation techniques, shifting focus from in-domain retention to out-of-domain robustness in sequential settings. The deployment-oriented evaluation protocol is a positive step toward practical relevance in applications with evolving domains.

major comments (2)

[Motivation / Abstract] The central motivation (abstract and likely §1) states that invariant structures 'often preserve the underlying causal mechanisms' to explain reduced overfitting and better OOD generalization, yet the experiments report only aggregate performance gains on target domains without any verification step, such as measuring representation alignment against known causal factors or comparing against non-causal invariant baselines. This assumption is load-bearing for interpreting why the sequential alignment helps beyond standard regularization.
[Experiments] §5 (Experiments): The claim of consistent outperformance across six datasets under the deployment-oriented protocol is presented without reported statistical tests, number of random seeds, full baseline specifications, or confirmation that hyperparameter choices were not post-hoc. Given the reader's note on missing experimental details, this undermines verification of the generalization advantage.

minor comments (2)

[Abstract] The ablation description in the abstract refers to 'naive extensions of sequential training with existing DIRL methods' but does not name the specific DIRL methods or detail how the naive extension was implemented.
[Method] Notation for 'invariant structures' versus 'domain-invariant representations' should be unified or explicitly distinguished in the method section to improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on our manuscript. We address each major comment point by point below and indicate the revisions we will incorporate to improve clarity and reproducibility.

Referee: [Motivation / Abstract] The central motivation (abstract and likely §1) states that invariant structures 'often preserve the underlying causal mechanisms' to explain reduced overfitting and better OOD generalization, yet the experiments report only aggregate performance gains on target domains without any verification step, such as measuring representation alignment against known causal factors or comparing against non-causal invariant baselines. This assumption is load-bearing for interpreting why the sequential alignment helps beyond standard regularization.

Authors: We appreciate the referee highlighting the role of this motivational hypothesis. The phrasing draws from established literature on domain-invariant representations and causal mechanisms but is presented as an intuition rather than an empirically verified claim within this work. Our experiments focus on measuring generalization performance under the deployment-oriented protocol rather than direct causal analysis. We will revise the abstract and introduction to explicitly frame the causal connection as a motivating hypothesis from prior work, clarify the scope of our contributions, and add a short discussion of limitations and future directions for causal verification. We do not plan to add new experiments comparing against non-causal baselines, as that would substantially expand the scope beyond the current focus on continual learning methods.
revision: partial
Referee: [Experiments] §5 (Experiments): The claim of consistent outperformance across six datasets under the deployment-oriented protocol is presented without reported statistical tests, number of random seeds, full baseline specifications, or confirmation that hyperparameter choices were not post-hoc. Given the reader's note on missing experimental details, this undermines verification of the generalization advantage.

Authors: We agree that additional details are required to support the empirical claims and enable full verification. In the revised manuscript we will: report results over 5 random seeds with mean and standard deviation; include statistical significance tests (e.g., paired t-tests or Wilcoxon signed-rank tests with p-values) comparing our methods to baselines; provide complete hyperparameter specifications and training details for all baselines; and explicitly describe the hyperparameter selection protocol, confirming that tuning was performed on validation splits without reference to target-domain test performance. These changes will be incorporated into §5 and the supplementary material.
revision: yes
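The seed-level comparison promised in this response can be sketched as a paired t-test, where each seed contributes one accuracy for the proposed method and one for the baseline. This is a minimal illustration under stated assumptions, not the authors' analysis: the accuracy values below are made-up placeholders, the function name is hypothetical, and the test is computed by hand with the standard library. The critical value 2.776 is the two-sided 5 % threshold of Student's t distribution with 4 degrees of freedom (5 seeds).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(ours, baseline):
    """t statistic of the per-seed differences (ours minus baseline)."""
    if len(ours) != len(baseline):
        raise ValueError("both methods need one result per seed")
    diffs = [a - b for a, b in zip(ours, baseline)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Placeholder target-domain accuracies for 5 seeds (NOT results from the paper).
ours = [0.74, 0.76, 0.75, 0.77, 0.73]
baseline = [0.70, 0.73, 0.71, 0.72, 0.70]

t_stat = paired_t_statistic(ours, baseline)

# Two-sided critical value at alpha = 0.05 with df = 5 - 1 = 4.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4

print(f"t = {t_stat:.2f}, significant at 5%: {significant}")
```

Pairing by seed removes seed-to-seed variance that both methods share. With only five seeds the normality assumption is hard to check, which is one reason the Wilcoxon signed-rank test mentioned in the response is a reasonable alternative.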

Circularity Check

0 steps flagged

No significant circularity in derivation or claims

The paper introduces a new combination of replay-based continual learning with sequential invariance alignment, motivated by the observation that invariant structures often preserve causal mechanisms. This motivation is presented as an empirical assumption rather than a derived result from equations. Performance claims are supported by evaluation on six external benchmark and real-world datasets, with ablations against naive extensions of existing DIRL methods. No load-bearing steps reduce by construction to fitted parameters, self-citations, or imported uniqueness theorems; the central method and generalization results remain independent of the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that invariant structures preserve causal mechanisms and on the empirical effectiveness of sequential invariance alignment combined with replay.

axioms (1)

Domain assumption: Invariant structures often preserve the underlying causal mechanisms.
This is invoked to motivate why domain-invariant representations should reduce shortcut learning and improve out-of-domain generalization.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "Our proposed CL methods combine replay-based training with a tailored sequential invariance alignment to learn, and preserve, invariant structures over time."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "We consider multiple notions of domain invariance, namely, (i) risk-based, (ii) gradient-based, and (iii) feature-based domain invariances."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

121 extracted references · 121 canonical work pages · 9 internal anchors

[1]

Invariant Risk Minimization

Invariant risk minimization , author=. arXiv preprint arXiv:1907.02893 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1907
[2]

2013 , organization=

Domain generalization via invariant feature representation , author=. 2013 , organization=

work page 2013
[3]

Psychology of learning and motivation , volume=

Catastrophic interference in connectionist networks: The sequential learning problem , author=. Psychology of learning and motivation , volume=. 1989 , publisher=

work page 1989
[4]

arXiv preprint arXiv:2009.00329 , year=

Learning explanations that are hard to vary , author=. arXiv preprint arXiv:2009.00329 , year=

work page arXiv 2009
[5]

IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=

A comprehensive survey of continual learning: Theory, method and application , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2024 , publisher=

work page 2024
[6]

Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

Deep residual learning for image recognition , author=. Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

work page
[7]

Advances in Neural Information Processing Systems , volume=

Dark experience for general continual learning: a strong, simple baseline , author=. Advances in Neural Information Processing Systems , volume=

work page
[8]

Advances in Neural Information Processing Systems , volume=

Loss decoupling for task-agnostic continual learning , author=. Advances in Neural Information Processing Systems , volume=

work page
[9]

IEEE Sensors Journal , volume=

Contrastive generative replay method of remaining useful life prediction for rolling bearings , author=. IEEE Sensors Journal , volume=. 2023 , publisher=

work page 2023
[10]

European Conference on Computer Vision , pages=

An incremental unified framework for small defect inspection , author=. European Conference on Computer Vision , pages=

work page
[11]

Nature Communications , volume=

A clinical deep learning framework for continually learning from cardiac signals across diseases, time, modalities, and institutions , author=. Nature Communications , volume=. 2021 , publisher=

work page 2021
[12]

2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) , pages=

Multi-label continual learning for the medical domain: A novel benchmark , author=. 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) , pages=. 2025 , organization=

work page 2025
[13]

2021 , organization=

Federated continual learning with weighted inter-client transfer , author=. 2021 , organization=

work page 2021
[14]

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization , author=. arXiv preprint arXiv:1911.08731 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1911
[15]

Distilling the Knowledge in a Neural Network

Distilling the knowledge in a neural network , author=. arXiv preprint arXiv:1503.02531 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[16]

arXiv preprint arXiv:2104.05025 , year=

New insights on reducing abrupt representation change in online continual learning , author=. arXiv preprint arXiv:2104.05025 , year=

work page arXiv
[17]

The Thirteenth International Conference on Learning Representations (ICLR) , year=

Semantic Aware Representation Learning for Lifelong Learning , author=. The Thirteenth International Conference on Learning Representations (ICLR) , year=

work page
[18]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Sparse coding in a dual memory system for lifelong learning , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

work page
[19]

Continual prototype evolution: Learning online from non-stationary data streams , author=

work page
[20]

Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

icarl: Incremental classifier and representation learning , author=. Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

work page
[21]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Bring evanescent representations to life in lifelong class incremental learning , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

work page
[22]

European Conference on Computer Vision , pages=

Memory-efficient incremental learning through feature adaptation , author=. European Conference on Computer Vision , pages=

work page
[23]

AAAI Bridge Program on Continual Causality , pages=

Towards causal replay for knowledge rehearsal in continual learning , author=. AAAI Bridge Program on Continual Causality , pages=. 2023 , organization=

work page 2023
[24]

On Tiny Episodic Memories in Continual Learning

On tiny episodic memories in continual learning , author=. arXiv preprint arXiv:1902.10486 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1902
[25]

arXiv preprint arXiv:2402.03917 , year=

Elastic feature consolidation for cold start exemplar-free incremental learning , author=. arXiv preprint arXiv:2402.03917 , year=

work page arXiv
[26]

Lifelong Learning with Dynamically Expandable Networks

Lifelong learning with dynamically expandable networks , author=. arXiv preprint arXiv:1708.01547 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[27]

Advances in Neural Information Processing Systems , volume=

Learning from failure: De-biasing classifier from biased classifier , author=. Advances in Neural Information Processing Systems , volume=

work page
[28]

Do image classifiers generalize across time? , author=

work page
[29]

Nature Machine Intelligence , volume=

Shortcut learning in deep neural networks , author=. Nature Machine Intelligence , volume=. 2020 , publisher=

work page 2020
[30]

arXiv preprint arXiv:2310.16228 , year=

On the foundations of shortcut learning , author=. arXiv preprint arXiv:2310.16228 , year=

work page arXiv
[31]

Mechanical Systems and Signal Processing , volume=

Domain-invariant feature exploration for intelligent fault diagnosis under unseen and time-varying working conditions , author=. Mechanical Systems and Signal Processing , volume=. 2025 , publisher=

work page 2025
[32]

Reliability Engineering & System Safety , volume=

Remaining useful lifetime prediction via deep domain adaptation , author=. Reliability Engineering & System Safety , volume=. 2020 , publisher=

work page 2020
[33]

Conference on Robot Learning , pages=

DIRL: Domain-invariant representation learning for sim-to-real transfer , author=. Conference on Robot Learning , pages=. 2021 , organization=

work page 2021
[34]

Science , volume =

Cynthia Dwork and Vitaly Feldman and Moritz Hardt and Toniann Pitassi and Omer Reingold and Aaron Roth , title =. Science , volume =. 2015 , doi =. https://www.science.org/doi/pdf/10.1126/science.aaa9375 , abstract =

work page doi:10.1126/science.aaa9375 2015
[35]

ICLR , year=

Efficient Lifelong Learning with A-GEM , author=. ICLR , year=

work page
[36]

Advances in Neural Information Processing Systems , volume=

Gradient episodic memory for continual learning , author=. Advances in Neural Information Processing Systems , volume=

work page
[37]

2016 , organization=

Train faster, generalize better: Stability of stochastic gradient descent , author=. 2016 , organization=

work page 2016
[38]

The Twelfth International Conference on Learning Representations (ICLR) , year=

Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning , author=. The Twelfth International Conference on Learning Representations (ICLR) , year=

work page
[39]

Advances in Neural Information Processing Systems , volume=

Online continual learning with maximal interfered retrieval , author=. Advances in Neural Information Processing Systems , volume=

work page
[40]

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

Umap: Uniform manifold approximation and projection for dimension reduction , author=. arXiv preprint arXiv:1802.03426 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[41]

Advances in Neural Information Processing Systems , volume=

Gradient based sample selection for online continual learning , author=. Advances in Neural Information Processing Systems , volume=

work page
[42]

and Chen, Jui-Long , journal=

Wu, Ming-Ju and Jang, Jyh-Shing R. and Chen, Jui-Long , journal=. Wafer Map Failure Pattern Recognition and Similarity Ranking for Large-Scale Data Sets , year=

work page
[43]

, title =

Jang, Jyh-Shing R. , title =. 2015 , howpublished = "

work page 2015
[44]

Proceedings of the 3rd International Workshop on Software Engineering and AI for Data Quality in Cyber-Physical Systems/Internet of Things , pages=

A nearest neighbor-based concept drift detection strategy for reliable tool condition monitoring , author=. Proceedings of the 3rd International Workshop on Software Engineering and AI for Data Quality in Cyber-Physical Systems/Internet of Things , pages=

work page
[45]

arXiv preprint arXiv:1902.09432 , year=

Scalable and order-robust continual learning with additive parameter decomposition , author=. arXiv preprint arXiv:1902.09432 , year=

work page arXiv 1902
[46]

International Conference on Learning Representations (ICLR) , year =

Measuring and Regularizing Networks in Function Space , author =. International Conference on Learning Representations (ICLR) , year =

work page
[47]

arXiv preprint arXiv:2309.02195 , year=

Sparse function-space representation of neural networks , author=. arXiv preprint arXiv:2309.02195 , year=

work page arXiv
[48]

1998 , howpublished =

Blackard, Jock , title =. 1998 , howpublished =

work page 1998
[49]

2017 , organization=

Continual learning through synaptic intelligence , author=. 2017 , organization=

work page 2017
[50]

2021 , organization=

Wilds: A benchmark of in-the-wild distribution shifts , author=. 2021 , organization=

work page 2021
[51]

From detection of individual metastases to classification of lymph node status at the patient level: the

Bandi, Peter and Geessink, Oscar and Manson, Quirine and Van Dijk, Marcory and Balkenhol, Maschenka and Hermsen, Meyke and Bejnordi, Babak Ehteshami and Lee, Byungjae and Paeng, Kyunghyun and Zhong, Aoxiao and others , journal=. From detection of individual metastases to classification of lymph node status at the patient level: the. 2019 , publisher=

work page 2019
[52]

2020 , publisher=

Clinical applications of continual learning machine learning , author=. 2020 , publisher=

work page 2020
[53]

Nature Machine Intelligence , volume=

Three types of incremental learning , author=. Nature Machine Intelligence , volume=. 2022 , publisher=

work page 2022
[54]

2022 , organization=

Forget-free continual learning with winning subnetworks , author=. 2022 , organization=

work page 2022
[55]

Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

Packnet: Adding multiple tasks to a single network by iterative pruning , author=. Proceedings of the IEEE Conference on Computer Vision and Pattern recognition , pages=

work page
[56]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Not just selection, but exploration: Online class-incremental continual learning via dual view consistency , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

work page
[57] Class-incremental continual learning into the extended DER-verse. 2022.

[58] Domain decorrelation with potential energy ranking. Proceedings of the AAAI Conference on Artificial Intelligence.

[59] DomainDrop: Suppressing domain-sensitive channels for domain generalization.

[60] Jock A. Blackard and Denis J. Dean. Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables.

[61] In Search of Lost Domain Generalization. International Conference on Learning Representations (ICLR).

[62] Uniformly distributed feature representations for fair and robust learning. Transactions on Machine Learning Research.

[63] Domain generalization with adversarial feature learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[64] Self-Normalized Resets for Plasticity in Continual Learning. The Thirteenth International Conference on Learning Representations (ICLR).

[65] CORe50: A new dataset and benchmark for continuous object recognition. Conference on Robot Learning, 2017.

[66] Revisiting batch normalization for practical domain adaptation. arXiv preprint arXiv:1603.04779.

[67] Test-time training with self-supervision for generalization under distribution shifts. 2020.

[68] Tent: Fully Test-Time Adaptation by Entropy Minimization. International Conference on Learning Representations (ICLR).

[69] Sójka, Damian and Cygert, Sebastian and Twardowski, Bart. AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation. 2023.

[70] Continual test-time domain adaptation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[71] Delving into the continuous domain adaptation. Proceedings of the 30th ACM International Conference on Multimedia.

[72] Continuously indexed domain adaptation. arXiv preprint arXiv:2007.01807.

[73] Tiny ImageNet visual recognition challenge. CS 231N.

[74] Domain generalization for object recognition with multi-task autoencoders. Proceedings of the IEEE International Conference on Computer Vision.

[75] Alex Krizhevsky.

[76] Benchmarking neural network robustness to common corruptions and perturbations. arXiv preprint arXiv:1903.12261.

[77] Learning to balance specificity and invariance for in and out of domain generalization. European Conference on Computer Vision, 2020.

[78] The many faces of robustness: A critical analysis of out-of-distribution generalization. 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[79] Eskandar, Masih and Imtiaz, Tooba and Hill, Davin and Wang, Zifeng and Dy, Jennifer.

[80] A unified approach to domain incremental learning with memory: Theory and algorithm. Advances in Neural Information Processing Systems.
