
arxiv: 2606.26457 · v1 · pith:P3VZYAZSnew · submitted 2026-06-24 · 📊 stat.ML · cs.LG

A probabilistic framework for online test-time adaptation

Pith reviewed 2026-06-26 00:23 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords online test-time adaptation · probabilistic framework · state-space model · distributional shift · parameter evolution

The pith

A state-space model unifies parameter learning and prediction for online test-time adaptation under distributional shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a probabilistic framework for adapting a model trained on labeled data to unlabeled test data whose distribution may have shifted. A state-space architecture models how parameters evolve over time, and parameter learning, prior tuning, and prediction in the online setting are all characterized from that single model. This matters because in many real-world applications data arrive sequentially while the underlying distribution changes, so models must adapt without labels.

Core claim

The framework is based on a state-space modelling architecture from which parameter learning, parameter time evolution, prior tuning, and prediction can be characterized for online test-time adaptation under potential distributional shifts.

What carries the argument

state-space modelling architecture that tracks parameter dynamics over time

If this is right

  • Parameters can be learned and updated sequentially as new unlabeled data arrives.
  • Prior distributions can be tuned based on the state evolution.
  • Predictions account for the uncertainty in parameter changes due to shifts.
  • Adaptation becomes a filtering problem in the state-space model.
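
The last bullet can be made concrete with a generic filtering recursion. The sketch below is a textbook-style form, not the paper's own model, which is not reproduced on this page. The symbols $\theta_t$ (parameters at step $t$), $x_t$ (the unlabeled test input or batch at step $t$), the loss $\ell$, and the learning rate $\beta_t$ are notation assumed here for illustration, and the loss-based update is an assumed stand-in for a likelihood because no labels are observed.

```latex
\begin{align}
% Predict: push the previous posterior through the assumed transition density.
p(\theta_t \mid x_{1:t-1}) &= \int p(\theta_t \mid \theta_{t-1})\, p(\theta_{t-1} \mid x_{1:t-1})\, d\theta_{t-1} \\
% Update: reweight the prediction with an assumed label-free loss on x_t.
p(\theta_t \mid x_{1:t}) &\propto \exp\{-\beta_t\, \ell(\theta_t; x_t)\}\; p(\theta_t \mid x_{1:t-1})
\end{align}
```

Alternating the two steps as each batch arrives is what makes adaptation a filtering problem: the transition density controls how far parameters may drift between steps, and the update pulls them toward values that suit the new data.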

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Such a model could integrate with existing Bayesian online learning techniques for more robust adaptation.
  • Extensions might include handling multiple possible shift types within the state transitions.
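
One way to picture distinct shift types is through the transition itself. The sketch below is a hypothetical scalar illustration, not taken from the paper: it assumes a linear-Gaussian transition $\theta_t = a\,\theta_{t-1} + w_t$ with $w_t \sim N(0, q)$ and tracks how a Gaussian prior's mean and variance evolve before any data arrive. The function name `propagate_prior` and the parameters `a` and `q` are invented for this example.

```python
from typing import List, Tuple


def propagate_prior(
    mean: float, var: float, a: float, q: float, steps: int
) -> List[Tuple[float, float]]:
    """Propagate a scalar Gaussian prior through theta_t = a * theta_{t-1} + w_t.

    Assumption (not from the paper): w_t ~ N(0, q), independent of theta_{t-1}.
    Returns the (mean, variance) pairs for steps 0..steps.
    """
    if q < 0 or var < 0:
        raise ValueError("variances must be non-negative")
    path = [(mean, var)]
    for _ in range(steps):
        mean = a * mean           # E[theta_t] = a * E[theta_{t-1}]
        var = a * a * var + q     # Var[theta_t] = a^2 * Var[theta_{t-1}] + q
        path.append((mean, var))
    return path


if __name__ == "__main__":
    # Three hypothetical regimes: no shift, gradual drift, pull toward zero.
    regimes = {
        "static (a=1, q=0)": (1.0, 0.0),
        "random walk (a=1, q=0.1)": (1.0, 0.1),
        "mean-reverting (a=0.5, q=0)": (0.5, 0.0),
    }
    for name, (a, q) in regimes.items():
        print(name, propagate_prior(1.0, 0.5, a=a, q=q, steps=3)[-1])
```

In this toy setting a static transition leaves the prior untouched, a random walk keeps the mean but grows the variance by `q` per step, and a contracting transition shrinks both toward zero.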

Load-bearing premise

That the dynamics of model parameters during adaptation can be adequately represented by a state-space model.

What would settle it

A comparison where the state-space predictions fail to match observed adaptation performance on datasets with known distributional shifts.

Figures

Figures reproduced from arXiv: 2606.26457 by Daniel Corrales, David Ríos Insua.

Figure 1: TTA probabilistic model. Grey nodes represent observed variables. White nodes, …

Figure 2: Structural relationships between source parameters

Figure 3: Prior parameter dynamics induced by four choices of transition matrix

Figure 4: Classification boundary evolution

Figure 5: Classification boundary evolution

Figure 6: Classification boundary evolution

Excerpt of paper text extracted alongside Figure 6: "… loss curvature information of the current step and discards the curvatures of previous steps. We define the parameters, following the previously used notation, as $\mu_t = \mu_{t|t-1} - \beta_t \Sigma_t g_t$ and $\Sigma_t^{-1} = W_r^{-1} + \beta_t H_t$. The covariance matrix does not use previous step information but uses noise-inflated curvature information of the current step. This approach may be seen as a curva…"
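
The update quoted in the Figure 6 excerpt can be sketched numerically. This is a minimal illustration under loud assumptions: all matrices are taken to be diagonal so that inverses are elementwise, W_r is read as a given positive-definite matrix whose definition is not included in the excerpt, g_t and H_t are read as the gradient and Hessian of the current-step loss at the predicted mean, and the function name `curvature_step` and the quadratic example loss are hypothetical.

```python
from typing import List, Sequence, Tuple


def curvature_step(
    mu_pred: Sequence[float],
    w_r_diag: Sequence[float],
    grad: Sequence[float],
    hess_diag: Sequence[float],
    beta: float,
) -> Tuple[List[float], List[float]]:
    """One step of the quoted update, restricted to diagonal matrices (assumption).

    Sigma_t^{-1} = W_r^{-1} + beta_t * H_t
    mu_t         = mu_{t|t-1} - beta_t * Sigma_t * g_t
    """
    sigma_diag = []
    for w, h in zip(w_r_diag, hess_diag):
        precision = 1.0 / w + beta * h
        if precision <= 0.0:
            raise ValueError("precision must be positive for a valid covariance")
        sigma_diag.append(1.0 / precision)
    mu = [m - beta * s * g for m, s, g in zip(mu_pred, sigma_diag, grad)]
    return mu, sigma_diag


if __name__ == "__main__":
    # Hypothetical quadratic loss 0.5 * (theta - 2)^2, evaluated at mu_pred = 0:
    # gradient = theta - 2 = -2, Hessian = 1.
    mu, sigma = curvature_step(
        mu_pred=[0.0], w_r_diag=[1.0], grad=[-2.0], hess_diag=[1.0], beta=1.0
    )
    print(mu, sigma)
```

With this quadratic loss and `beta = 1`, the step coincides with the exact Gaussian posterior for a unit-variance prior centred at 0 and a unit-precision observation at 2, which is a convenient sanity check. Note that the covariance depends only on W_r and the current curvature, matching the excerpt's remark that previous-step information is discarded.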
Original abstract

This paper presents a probabilistic framework for online test-time adaptation problems. In them, a model is trained on labeled data but must adapt to unlabeled data at test time under the assumption that training and test distributions potentially differ, that is, there might have been a distributional shift. The framework is based on a state-space modelling architecture from which parameter learning, parameter time evolution, prior tuning, and prediction can be characterized.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper presents a probabilistic framework for online test-time adaptation problems. A model is trained on labeled data but must adapt to unlabeled data at test time under potential distributional shift. The framework is based on a state-space modelling architecture from which parameter learning, parameter time evolution, prior tuning, and prediction can be characterized.

Significance. If rigorously developed with explicit derivations and validated empirically, such a framework could provide a unified probabilistic treatment of online TTA, enabling principled handling of distributional shift via state-space dynamics. The abstract alone supplies no such development, so significance cannot be assessed.

major comments (1)
  1. [Abstract] No equations, state-space model definition, learning rules, or experimental results are supplied, so the central claim that the architecture 'characterizes' parameter learning, time evolution, prior tuning, and prediction cannot be evaluated for soundness or novelty.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their review. We address the single major comment below.

Point-by-point responses
  1. Referee: [Abstract] No equations, state-space model definition, learning rules, or experimental results are supplied, so the central claim that the architecture 'characterizes' parameter learning, time evolution, prior tuning, and prediction cannot be evaluated for soundness or novelty.

    Authors: We agree that the provided manuscript consists solely of the abstract, which contains no equations, state-space model definition, learning rules, or experimental results. Consequently, the central claim cannot be evaluated for soundness or novelty from the given text.

    Revision: no

standing simulated objections not resolved
  • Only the abstract is available, so we cannot supply the state-space model, derivations, or results needed to allow evaluation of the framework.

Circularity Check

0 steps flagged

No circularity detectable; abstract-only text provides no derivation chain

Full rationale

Only the abstract is available. It states that a state-space modelling architecture characterizes parameter learning, time evolution, prior tuning, and prediction, but it supplies no equations, self-citations, fitted inputs, or ansatzes. There are therefore no load-bearing steps to inspect for reduction to inputs by construction, self-definition, or self-citation chains. With no derivation chain to examine, the check defaults to an honest non-finding: no evidence of circularity is present.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are specified or can be extracted.

pith-pipeline@v0.9.1-grok · 5554 in / 1116 out tokens · 30763 ms · 2026-06-26T00:23:33.596382+00:00 · methodology

