
arxiv: 2509.23183 · v3 · pith:DGAYPAVEnew · submitted 2025-09-27 · 💻 cs.LG · cs.NI

ZeroSiam: An Efficient Asymmetry for Test-Time Entropy Optimization without Collapse

Pith reviewed 2026-05-21 21:31 UTC · model grok-4.3

classification 💻 cs.LG cs.NI
keywords test-time adaptation · entropy minimization · collapse prevention · Siamese architecture · asymmetric divergence · vision adaptation · LLM reasoning

The pith

An asymmetric Siamese architecture with a learnable predictor and stop-gradient prevents collapse in test-time entropy minimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Pure entropy minimization at test time can push models toward collapsed outputs like constant one-hot predictions that minimize the objective without real adaptation. ZeroSiam counters this by building an efficient asymmetric Siamese structure that aligns divergences between branches. The asymmetry comes from adding a learnable predictor on one side and a stop-gradient before the classifier on the other. This setup not only blocks trivial collapse but also regularizes biased learning signals, which improves results even on runs that would not have collapsed anyway. Experiments across vision tasks and large language model reasoning show stable gains with almost no added cost.
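To make the collapse shortcut concrete, here is a minimal sketch in plain Python (standard library only). It shows that the entropy of a softmax prediction drops when the logits are merely scaled up, and reaches zero for a one-hot output, even though nothing about the input has been learned. The example logits and scale factors are made up for illustration and do not come from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Illustrative logits (an assumption, not data from the paper).
logits = [2.0, 1.0, 0.5]
scales = (1.0, 4.0, 16.0)

# Scaling the logits inflates their norm without changing the argmax,
# yet the entropy objective keeps decreasing.
entropies = [entropy(softmax([s * v for v in logits])) for s in scales]
for s, h in zip(scales, entropies):
    print(f"scale={s:5.1f}  entropy={h:.6f}")

# A constant one-hot prediction attains the minimum possible entropy.
print("one-hot entropy:", entropy([1.0, 0.0, 0.0]))
```

Because the argmax never changes across the three scales, the falling entropy here reflects rising confidence only, which is the kind of trivial solution the paragraph above describes.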

Core claim

ZeroSiam prevents collapse through asymmetric divergence alignment, efficiently achieved by a learnable predictor and a stop-gradient operator before the classifier. We provide empirical and theoretical evidence that ZeroSiam not only prevents collapse, but also regularizes biased learning signals, enhancing performance even when no collapse occurs. Despite its simplicity, extensive results show that ZeroSiam performs more stably over prior methods using negligible overhead, demonstrating efficacy on both vision adaptation and large language model reasoning tasks across challenging test scenarios and diverse models, including particularly collapse-prone tiny models.

What carries the argument

ZeroSiam, an asymmetric Siamese architecture that creates divergence alignment using a learnable predictor paired with a stop-gradient operator before the classifier.
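This summary does not give the paper's exact loss, so the block below is only one hypothetical way to write an objective with the stated ingredients. All symbols are assumptions introduced for illustration: f is the encoder, h the learnable predictor, g the classifier, sg the stop-gradient operator, D some divergence, and α ≥ 0 a weight.

```latex
% Hypothetical notation, not taken from the paper:
% f = encoder, h = predictor, g = classifier, sg = stop-gradient,
% D = a divergence between distributions, \alpha \ge 0 = a weight.
\begin{aligned}
z &= f(x), \\
p &= \operatorname{softmax}\bigl(g(h(z))\bigr), \qquad
q = \operatorname{softmax}\bigl(g(\operatorname{sg}(z))\bigr), \\
\mathcal{L}_{\mathrm{ent}}(p) &= -\sum_{c} p_c \log p_c, \\
\mathcal{L}(x) &= \mathcal{L}_{\mathrm{ent}}(p) + \alpha \, D\bigl(p \,\|\, q\bigr).
\end{aligned}
```

Under this reading, a predictor frozen at the identity gives p = q, so the divergence term vanishes and only plain entropy minimization remains. That matches the Figure 4 text, which says that with a predictor learning rate of zero ZeroSiam degenerates to Tent. Whether the paper's loss takes this form should be checked against the paper itself.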

If this is right

  • Test-time adaptation becomes more stable on tiny models that previously collapsed under entropy minimization.
  • Performance gains appear on both vision adaptation and large language model reasoning with negligible added compute.
  • Biased learning signals get regularized even in regimes where collapse would not have occurred.
  • The method works with negligible overhead compared with earlier regularization approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same asymmetry pattern could be inserted into other test-time objectives such as pseudo-labeling or consistency losses to check for similar collapse resistance.
  • If the predictor-plus-stop-gradient pair proves robust, future work might replace more complex regularizers with this lightweight asymmetry.
  • Extending the approach to continual test-time adaptation over long sequences of shifting data could test whether the regularization effect persists.

Load-bearing premise

The combination of a learnable predictor plus stop-gradient will reliably produce useful asymmetry that prevents collapse and regularizes signals across many models, datasets, and regimes without creating new failure modes or needing heavy tuning.

What would settle it

Run test-time entropy minimization on a collapse-prone tiny vision model with and without the stop-gradient operator; check whether constant-class outputs appear only in the version that removes the stop-gradient.
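The check described above needs an operational definition of "constant-class outputs". A minimal sketch, assuming collapse is flagged when a single class accounts for nearly all predictions in a batch or stream: the function names, the 0.99 threshold, and the two label lists are hypothetical and serve only to illustrate the comparison.

```python
from collections import Counter

def dominant_class_fraction(predicted_labels):
    """Fraction of predictions that fall into the most frequent class."""
    if not predicted_labels:
        raise ValueError("need at least one prediction")
    counts = Counter(predicted_labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(predicted_labels)

def looks_collapsed(predicted_labels, threshold=0.99):
    """Flag a run as collapsed when one class dominates the predictions.

    The 0.99 threshold is an arbitrary illustrative choice.
    """
    return dominant_class_fraction(predicted_labels) >= threshold

# Made-up predicted labels for the two ablation arms (not real results).
with_stop_gradient = [0, 2, 1, 2, 0, 1, 1, 2]
without_stop_gradient = [1, 1, 1, 1, 1, 1, 1, 1]

print("with stop-gradient collapsed:   ", looks_collapsed(with_stop_gradient))
print("without stop-gradient collapsed:", looks_collapsed(without_stop_gradient))
```

In an actual experiment the two lists would be the argmax predictions logged during adaptation for each arm. On a class-balanced test stream a dominant-class fraction near 1.0 indicates collapse, while under label shift the threshold would need to account for the true class prior.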

Figures

Figures reproduced from arXiv: 2509.23183 by Deyu Chen, Guohao Chen, Jiahao Yang, Mingkui Tan, Pengcheng Wu, Shuaicheng Niu, Zhiqi Shen, Zitian Zhang.

Figure 1: Comparisons on architectures. (a) Alignment-oriented SSL methods (BYOL (…
Figure 2: Empirical evidence of ZeroSiam's stabilization effects. (a) records the Frobenius distance…
Figure 3: Resistance to learning from noise. Models pre-adapt on N pure Gaussian noise, then run TTA on ImageNet-C (level 5).
Accompanying text: Resistance to Learning from Noise. In dynamic real-world applications, models may frequently encounter test data that are severely corrupted and non-semantic, such as extreme occluded frames and pure sensor noise where no valid label exists. Minimizing entropy on these data can then be misle…
Figure 4: Sensitivity to learning rates. Results are reported on ImageNet-C (level 5) with ViT-Base under label shifts w.r.t. accuracy.
Accompanying text: Sensitivity to Learning Rates. We further examine the sensitivity of ZeroSiam to learning rates. When the predictor learning rate is set to zero, the predictor becomes a frozen identity and ZeroSiam degenerates to Tent. From…
Figure 1: Our IMAGENET-C dataset consists of 15 types of algorithmically generated corruptions from noise, blur, weather, and digital categories. Each type of corruption has five levels of severity, resulting in 75 distinct corruptions. See different severity levels in Appendix B.
Accompanying text: …face of minor input changes. Now in order to approximate C, E and these robustness measures, we designed a set of corruptions and perturb…

Original abstract

Test-time entropy minimization helps adapt a model to novel environments and incentivize its reasoning capability, unleashing the model's potential during inference by allowing it to evolve and improve in real-time using its own predictions, achieving promising performance. However, pure entropy minimization can favor non-generalizable shortcuts, such as inflating the logit norm and driving all predictions to a dominant class to reduce entropy, risking collapsed solutions (e.g., constant one-hot outputs) that trivially minimize the objective without meaningful learning. In this paper, we reveal asymmetry as a key mechanism for collapse prevention and introduce ZeroSiam--an efficient asymmetric Siamese architecture tailored for test-time entropy minimization. ZeroSiam prevents collapse through asymmetric divergence alignment, efficiently achieved by a learnable predictor and a stop-gradient operator before the classifier. We provide empirical and theoretical evidence that ZeroSiam not only prevents collapse, but also regularizes biased learning signals, enhancing performance even when no collapse occurs. Despite its simplicity, extensive results show that ZeroSiam performs more stably over prior methods using negligible overhead, demonstrating efficacy on both vision adaptation and large language model reasoning tasks across challenging test scenarios and diverse models, including particularly collapse-prone tiny models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that the stop-gradient and predictor create a stable asymmetric signal.

pith-pipeline@v0.9.0 · 5764 in / 1095 out tokens · 69677 ms · 2026-05-21T21:31:53.399270+00:00 · methodology



