Recognition: 2 theorem links
The Agent Use of Agent Beings: Agent Cybernetics Is the Missing Science of Foundation Agents
Pith reviewed 2026-05-12 04:59 UTC · model grok-4.3
The pith
Cybernetics supplies the first principles needed to build reliable, long-running foundation agents.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By mapping six canonical laws of classical cybernetics onto six agent design principles and synthesizing those principles into the three engineering desiderata of reliability, lifelong running, and self-improvement, the authors establish Agent Cybernetics as the missing theoretical scaffold for foundation agents that perceive, reason, and act across long horizons.
What carries the argument
The direct mapping of six classical cybernetics laws to six modern agent design principles, which is then synthesized into the three desiderata that define the Agent Cybernetics framework.
If this is right
- Code generation agents remain on-task across many steps when control and communication laws are applied to error handling.
- Computer-use agents achieve lifelong running by adapting to changing interfaces through feedback mechanisms drawn from cybernetics.
- Automated research agents can pursue safe self-improvement when capability growth is constrained by the three desiderata.
- Common failure modes such as drifting off-task or exceeding representational capacity are diagnosed and mitigated using the mapped principles.
- Engineering shifts from assembling primitives by trial and error to designing agents from the three explicit desiderata.
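The feedback reading that runs through these bullets can be made concrete. The sketch below is our toy illustration, not the paper's architecture: `FeedbackAgent`, the scalar drift model, and the 0.5 gain are all invented for the example. It shows the basic cybernetic claim in miniature: a step process with systematic drift stays near its goal when an error signal is fed back into each step.

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackAgent:
    """Toy negative-feedback loop: act, measure deviation from the goal,
    and feed a corrective signal into the next step. `policy` stands in
    for an LLM/tool call; it is a stub, not anything from the paper."""
    goal: float
    gain: float = 0.5            # how strongly feedback corrects drift
    state: float = 0.0
    history: list = field(default_factory=list)

    def policy(self, correction: float) -> float:
        # Stub for an agent step: follow the corrected direction, plus a
        # fixed bias that models systematic per-step drift off-task.
        return correction + 0.3

    def step(self) -> float:
        error = self.goal - self.state           # feedback signal
        action = self.policy(self.gain * error)  # controller + plant
        self.state += action
        self.history.append(self.state)
        return abs(error)

agent = FeedbackAgent(goal=10.0)
for _ in range(50):
    agent.step()
# With feedback the state settles at goal + drift/gain = 10.6;
# with gain = 0 the same loop would drift away linearly.
print(round(agent.state, 2))  # → 10.6
```

The fixed point sits slightly off the goal (steady-state error of drift/gain), which is itself a classical control observation: proportional feedback bounds drift but does not eliminate bias.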
Where Pith is reading between the lines
- The framework could inspire new evaluation benchmarks that test agents on cybernetic properties like stability and adaptation rather than task success alone.
- Links to broader control theory may allow agents to borrow stability guarantees from engineering domains outside AI.
- Applying the same mapping to non-LLM agents could reveal whether the approach is specific to language models or general to autonomous systems.
Load-bearing premise
The six laws of classical cybernetics can be mapped directly onto LLM-based agents in a useful way without further derivation or empirical checks.
What would settle it
Build two versions of a long-horizon agent for the same task, one following the Agent Cybernetics principles and one following standard engineering practice, then measure which version stays on-task longer and exhibits safer self-improvement.
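That head-to-head test can be sketched as a simulation harness. Everything here is assumed for illustration, not taken from the paper: `run_agent`, its Gaussian drift model, and the on-task threshold are invented stand-ins for real long-horizon agent runs.

```python
import random

def run_agent(use_feedback: bool, steps: int = 100, seed: int = 0) -> int:
    """Toy stand-in for a long-horizon agent run. Returns how many steps
    the agent stayed 'on-task' (|drift| below a threshold). The feedback
    variant applies a proportional correction each step; the baseline
    accumulates drift as a biased random walk."""
    rng = random.Random(seed)
    drift, on_task = 0.0, 0
    for _ in range(steps):
        drift += rng.gauss(0.05, 0.2)   # per-step noise with a drift bias
        if use_feedback:
            drift -= 0.5 * drift        # proportional correction
        if abs(drift) < 1.0:
            on_task += 1
    return on_task

# Average on-task steps for each design over several seeds.
scores = {
    variant: sum(run_agent(variant, seed=s) for s in range(20)) / 20
    for variant in (True, False)
}
print(scores[True] > scores[False])  # → True
```

In a real version of the experiment, the drift variable would be replaced by a measured task-alignment score and `run_agent` by the two agent implementations; the comparison logic stays the same.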
The original abstract
LLM-based foundation agents that perceive, reason, and act across thousands of reasoning steps are rapidly becoming the dominant paradigm for deploying artificial intelligence in open-ended, long-horizon complex tasks. Despite this significance, the field remains overwhelmingly engineering-driven. Engineering practice has converged on useful primitives (tool loops, memory banks, harnesses, reflection steps), yet these are assembled by empirical trial and error rather than from first principles. Fundamental questions remain open: under what conditions does a long-running agent remain on-task? How should an agent respond when its environment exceeds its representational capacity? What architectural properties are necessary for safe self-improvement? We argue that cybernetics, the mid-twentieth-century science of control and communication in complex systems, provides the missing theoretical scaffold for foundation agents. By mapping six canonical laws of classical cybernetics onto six agent design principles, and synthesizing those principles into three engineering desiderata (reliability, lifelong running, and self-Improvement), we arrive at a framework termed Agent Cybernetics. Three application domains, code generation, computer use and automated research, exemplify the analytical framework of agent cybernetics by identifying failure modes and concrete engineering recommendations. We hope that agent cybernetics opens a new research venue and establishes the scientific foundation that foundation agents need for principled, reliable real-world deployment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that cybernetics, the mid-twentieth-century science of control and communication, supplies the missing theoretical scaffold for LLM-based foundation agents. It does so by mapping six canonical laws of classical cybernetics onto six agent design principles, synthesizing those principles into three engineering desiderata (reliability, lifelong running, and self-improvement), and applying the resulting 'Agent Cybernetics' framework to identify failure modes and concrete recommendations in the domains of code generation, computer use, and automated research.
Significance. If the mapping can be shown to yield non-trivial, testable guidance that improves upon existing agent architectures, the framework could help organize engineering practice around long-horizon reliability and safe self-improvement. As presented, however, the contribution remains an interpretive synthesis without derivation, empirical tests, or direct comparison to prior agent models, limiting its immediate impact on the field.
major comments (3)
- [Framework introduction and mapping section] The central mapping of the six classical cybernetics laws to six agent design principles is asserted without step-by-step derivation, adaptation for discrete token-based stochastic reasoning, finite context windows, or sampling noise. This absence is load-bearing because the claim that cybernetics is 'the missing science' rests on the mapping being useful and non-circular.
- [Application domains section] The three application domains (code generation, computer use, automated research) identify failure modes conceptually but supply no measurements, baselines, or ablation studies showing that the proposed principles resolve the stated open questions on on-task persistence or safe self-improvement.
- [Related work and synthesis section] No explicit comparison is made to existing agent paradigms (ReAct-style loops, memory-augmented systems, hierarchical planners) to demonstrate that the synthesized desiderata are novel or superior rather than a relabeling of known engineering practices.
minor comments (2)
- [Abstract] The abstract contains an inconsistent capitalization ('self-Improvement').
- [References] Ensure original sources for the six canonical cybernetics laws are cited with precise references rather than relying on secondary summaries.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for identifying areas where the presentation of the Agent Cybernetics framework can be strengthened. We address each major comment below, clarifying the conceptual scope of the work while outlining targeted revisions to improve rigor and clarity.
Point-by-point responses
- Referee: [Framework introduction and mapping section] The central mapping of the six classical cybernetics laws to six agent design principles is asserted without step-by-step derivation, adaptation for discrete token-based stochastic reasoning, finite context windows, or sampling noise. This absence is load-bearing because the claim that cybernetics is 'the missing science' rests on the mapping being useful and non-circular.
  Authors: We acknowledge that the mappings are presented as an interpretive synthesis rather than a formal derivation from first principles. The intent is to adapt classical cybernetic laws (originally for continuous control systems) to the discrete, stochastic, and context-limited setting of LLM agents by identifying functional analogies in control, feedback, and stability. In the revised manuscript we will expand the framework section with an explicit justification subsection for each mapping. This will include step-by-step reasoning on how each law translates to token-based reasoning, finite context, and sampling noise, using concrete agent failure examples to show non-circularity. The expanded discussion will make the utility of the mapping more transparent while preserving the paper's conceptual character. Revision: partial.
- Referee: [Application domains section] The three application domains (code generation, computer use, automated research) identify failure modes conceptually but supply no measurements, baselines, or ablation studies showing that the proposed principles resolve the stated open questions on on-task persistence or safe self-improvement.
  Authors: The manuscript is a theoretical synthesis whose primary contribution is the framework itself; the application domains serve to illustrate how the desiderata surface concrete failure modes and recommendations rather than to validate them empirically. We agree that the absence of measurements limits immediate impact. In revision we will add a forward-looking subsection to the applications section that outlines testable predictions and experimental designs (e.g., persistence metrics in long-horizon code tasks or safety checks in self-improvement loops) that future work could use to evaluate the principles. This keeps the current scope intact while addressing the referee's concern about empirical grounding. Revision: partial.
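The "persistence metrics" invoked in this response are left unspecified. One plausible instantiation, purely our illustration (`on_task_persistence` and the drift threshold are invented here, not the authors'), is the longest unbroken run of on-task steps, normalized by trajectory length:

```python
def on_task_persistence(trajectory, is_on_task):
    """Longest unbroken run of steps satisfying `is_on_task`,
    normalized by trajectory length. 1.0 means the agent never drifted."""
    best = cur = 0
    for step in trajectory:
        cur = cur + 1 if is_on_task(step) else 0
        best = max(best, cur)
    return best / len(trajectory) if trajectory else 0.0

# Steps tagged with a drift score; on-task means drift below 0.5.
traj = [0.1, 0.2, 0.8, 0.3, 0.2, 0.1, 0.9, 0.4]
print(on_task_persistence(traj, lambda d: d < 0.5))  # → 0.375
```

A run-length metric like this penalizes intermittent drift more than a simple on-task fraction would, which matches the review's emphasis on sustained long-horizon behavior rather than average behavior.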
- Referee: [Related work and synthesis section] No explicit comparison is made to existing agent paradigms (ReAct-style loops, memory-augmented systems, hierarchical planners) to demonstrate that the synthesized desiderata are novel or superior rather than a relabeling of known engineering practices.
  Authors: We will insert a new subsection (likely in the synthesis or related-work portion) that directly contrasts Agent Cybernetics with ReAct-style loops, memory-augmented architectures, and hierarchical planners. The comparison will show that while these paradigms supply useful primitives, they lack an explicit organizing theory for long-horizon reliability, lifelong operation under representational limits, and safe self-improvement. By mapping the three desiderata onto these existing practices, we will argue that Agent Cybernetics supplies a higher-level scaffold that integrates rather than duplicates them. This addition will clarify the framework's novelty without changing the paper's non-empirical nature. Revision: yes.
Circularity Check
Agent Cybernetics is defined by the authors' mapping and synthesis, making the 'missing science' claim self-referential by construction.
specific steps (1)
- self-definitional [Abstract]
"By mapping six canonical laws of classical cybernetics onto six agent design principles, and synthesizing those principles into three engineering desiderata (reliability, lifelong running, and self-Improvement), we arrive at a framework termed Agent Cybernetics."
The framework 'Agent Cybernetics' is explicitly defined as the output of the authors' mapping and synthesis operation; the assertion that cybernetics thereby supplies the missing scaffold for foundation agents is therefore equivalent to the construction itself rather than derived from independent evidence or non-trivial transformation of the laws.
full rationale
The paper's derivation chain consists solely of asserting a direct mapping from six classical cybernetics laws to six agent principles, followed by synthesis into three desiderata, after which the result is named 'Agent Cybernetics' and declared the missing theoretical scaffold. No equations, adaptation steps for token-based or stochastic agents, or contrasts with prior frameworks appear in the provided text. This reduces the central claim to the authors' definitional act rather than an independent derivation or external benchmark, satisfying the self-definitional pattern.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the six canonical laws of classical cybernetics apply to foundation agents.
invented entities (1)
- Agent Cybernetics framework (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear: the relation between the paper passage and the cited Recognition theorem is ambiguous)
  Passage: "By mapping six canonical laws of classical cybernetics onto six agent design principles, and synthesizing those principles into three engineering desiderata (reliability, lifelong running, and self-Improvement)"
- IndisputableMonolith/Foundation/ArrowOfTime.lean · z_monotone_absolute (echoes: same mathematical shape or conceptual pattern as the Recognition theorem, but not a direct formal dependency)
  Passage: "Context Entropy Minimization: H(output) ≥ H(E) - C_channel"
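The quoted bound says an agent's output entropy can fall below the environment's entropy by at most the channel capacity: information the channel cannot carry is simply lost. A toy numeric check, with the uniform distributions and the state-merging channel assumed purely for illustration:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Environment: 8 equally likely states -> H(E) = 3 bits.
H_E = entropy([1/8] * 8)

# A lossy 'context channel' that merges pairs of states,
# i.e. it destroys at most C = 1 bit per symbol.
C = 1.0
H_out = entropy([1/4] * 4)   # 4 merged outcomes, still uniform

# The quoted bound: H(output) >= H(E) - C_channel.
print(H_E, H_out, H_out >= H_E - C)  # → 3.0 2.0 True
```

The deterministic merging channel saturates the bound exactly; a noisier channel with the same capacity could only do worse, never better.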
What do these tags mean?
- matches: the paper's claim is directly supported by a theorem in the formal canon.
- supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: the paper appears to rely on the theorem as machinery.
- contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos Garea, Matthieu Geist, and Olivier Bachem. On-policy distillation of language models: Learning from self-generated mistakes. In ICLR, 2024.
- [2] William Ross Ashby. Design for a Brain: The Origin of Adaptive Behaviour. Springer Science & Business Media, 2013.
- [3] William Ross Ashby. An Introduction to Cybernetics. Chapman & Hall, 1956.
- [4] Rogerio Bonatti, Dan Zhao, Francesco Bonacci, Dillon Dupont, Sara Abdali, Yinheng Li, Yadong Lu, Justin Wagle, Kazuhito Koishida, Arthur Bucker, et al. Windows Agent Arena: Evaluating multi-modal OS agents at scale. arXiv preprint arXiv:2409.08264, 2024.
- [5]
- [6] Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav. Mem0: Building production-ready AI agents with scalable long-term memory. arXiv preprint arXiv:2504.19413, 2025.
- [7] François Chollet. On the measure of intelligence. arXiv preprint arXiv:1911.01547, 2019.
- [8] Xiang Deng, Jeff Da, Edwin Pan, Yannis Yiming He, Charles Ide, Kanak Garg, Niklas Lauffer, Andrew Park, Nitin Pasari, Chetan Rane, et al. SWE-Bench Pro: Can AI agents solve long-horizon software engineering tasks? arXiv preprint arXiv:2509.16941, 2025.
- [9] Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, and Igor Mordatch. Improving factuality and reasoning in language models through multiagent debate. In ICML, 2024.
- [10] Shiyang Feng, Runmin Ma, Xiangchao Yan, Yue Fan, Yusong Hu, Songtao Huang, Shuaiyu Zhang, Zongsheng Cao, Tianshuo Peng, Jiakang Yuan, et al. InternAgent-1.5: A unified agentic framework for long-horizon autonomous scientific discovery. arXiv preprint arXiv:2602.08990, 2026.
- [11] Danpeng Gao, Shuaihua Lu, Chunlei Zhang, Ning Wang, Zexin Yu, Xianglang Sun, Rebecca Martin, Francesco Vanin, Liangchen Qian, Nicholas Long, et al. Autonomous closed-loop framework for reproducible perovskite solar cells. Nature, pages 1–3, 2026.
- [12] Gonzalo Gonzalez-Pumariega, Saaket Agashe, Jiachen Yang, Ang Li, and Xin Eric Wang. On the reliability of computer use agents. arXiv preprint arXiv:2604.17849, 2026.
- [13] Jiawei Gu, Xuhui Jiang, Zhichao Shi, Hexiang Tan, Xuehao Zhai, Chengjin Xu, Wei Li, Yinghan Shen, Shengjie Ma, Honghao Liu, et al. A survey on LLM-as-a-judge. The Innovation, 2024.
- [14] Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, and Timothy Lillicrap. Mastering diverse control tasks through world models. Nature, 640(8059):647–653, 2025.
- [15] Thomas Hubert, Rishi Mehta, Laurent Sartran, Miklós Z Horváth, Goran Žužić, Eric Wieser, Aja Huang, Julian Schrittwieser, Yannick Schroecker, Hussain Masoom, et al. Olympiad-level formal mathematical reasoning with reinforcement learning. Nature, pages 1–3, 2025.
- [16] Physical Intelligence, Bo Ai, Ali Amin, Raichelle Aniceto, Ashwin Balakrishna, Greg Balke, Kevin Black, George Bokinsky, Shihao Cao, Thomas Charbonnier, Vedant Choudhary, Foster Collins, Ken Conley, Grace Connors, James Darpinian, Karan Dhabalia, Maitrayee Dhaka, Jared DiCarlo, Danny Driess, Michael Equi, Adnan Esmail, Yunhao Fang, Chelsea Finn, Cather... (entry truncated in source), 2026.
- [17] Pengcheng Jiang, Jiacheng Lin, Zhiyi Shi, Zifeng Wang, Luxi He, Yichen Wu, Ming Zhong, Peiyang Song, Qizheng Zhang, Heng Wang, et al. Adaptation of agentic AI. arXiv preprint arXiv:2512.16301, 2025.
- [18] Carlos E Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik R Narasimhan. SWE-bench: Can language models resolve real-world GitHub issues? In ICLR, 2024.
- [19] Thomas Kuntz, Agatha Duzan, Hao Zhao, Francesco Croce, J Zico Kolter, Nicolas Flammarion, and Maksym Andriushchenko. OS-Harm: A benchmark for measuring safety of computer use agents. In NeurIPS Datasets and Benchmarks Track, 2025.
- [20] Yoonho Lee, Roshen Nair, Qizheng Zhang, Kangwook Lee, Omar Khattab, and Chelsea Finn. Meta-Harness: End-to-end optimization of model harnesses. arXiv preprint arXiv:2603.28052, 2026.
- [21] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In NeurIPS, pages 9459–9474, 2020.
- [22] Xiangyi Li, Wenbo Chen, Yimin Liu, Shenghan Zheng, Xiaokun Chen, Yifeng He, Yubo Li, Bingran You, Haotian Shen, Jiankai Sun, et al. SkillsBench: Benchmarking how well agent skills work across diverse tasks. arXiv preprint arXiv:2602.12670, 2026.
- [23] Xinghua Lou, Miguel Lázaro-Gredilla, Antoine Dedieu, Carter Wendelken, Wolfgang Lehrach, and Kevin P Murphy. AutoHarness: Improving LLM agents by automatically synthesizing a code harness. arXiv preprint arXiv:2603.03329, 2026.
- [24] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. In NeurIPS, pages 27730–27744, 2022.
- [25] Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, et al. Tool learning with foundation models. ACM Computing Surveys, 57(4):1–40, 2024.
- [26] Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D Manning, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, pages 53728–53741, 2023.
- [27] Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839):604–609, 2020.
- [28] Claude Elwood Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423, 1948.
- [29] Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Reflexion: Language agents with verbal reinforcement learning. In NeurIPS, pages 8634–8652, 2023.
- [30] Theodore Sumers, Shunyu Yao, Karthik R Narasimhan, and Thomas L Griffiths. Cognitive architectures for language agents. Transactions on Machine Learning Research, 2023.
- [31] Richard S Sutton and Andrew G Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, 1998.
- [32] Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, YuJie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, Bo An, Shuicheng Yan, and Zongqing Lu. Crad... (title truncated in source), 2025.
- [33] Naftali Tishby, Fernando C Pereira, and William Bialek. The information bottleneck method. arXiv preprint physics/0004057, 2000.
- [34] Naftali Tishby and Noga Zaslavsky. Deep learning and the information bottleneck principle. In 2015 IEEE Information Theory Workshop (ITW), pages 1–5. IEEE, 2015.
- [35] Trieu H Trinh, Yuhuai Wu, Quoc V Le, He He, and Thang Luong. Solving olympiad geometry without human demonstrations. Nature, 625(7995):476–482, 2024.
- [36] Hsue-Shen Tsien. Engineering Cybernetics. McGraw-Hill, New York, 1954.
- [37] Alexander Matt Turner and Prasad Tadepalli. Parametrically retargetable decision-makers tend to seek power. In NeurIPS, pages 31391–31401, 2022.
- [38] Heinz von Foerster. Cybernetics of cybernetics. In Understanding Understanding: Essays on Cybernetics and Cognition, pages 283–286. Springer, 2003.
- [39] Norbert Wiener. The Human Use of Human Beings: Cybernetics and Society. Grand Central Publishing, 1988.
- [40] Norbert Wiener. Cybernetics: or Control and Communication in the Animal and the Machine. MIT Press, 2019.
- [41] Michael Wooldridge. An Introduction to Multiagent Systems. John Wiley & Sons, 2009.
- [42] Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, et al. OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments. In NeurIPS, pages 52040–52094, 2024.
- [43] Zhenghai Xue, Longtao Zheng, Qian Liu, Yingru Li, Xiaosen Zheng, Zejun Ma, and Bo An. SimpleTIR: End-to-end reinforcement learning for multi-turn tool-integrated reasoning. arXiv preprint arXiv:2509.02479, 2025.
- [44] Yutao Yang, Junsong Li, Qianjun Pan, Bihao Zhan, Yuxuan Cai, Lin Du, Jie Zhou, Kai Chen, Qin Chen, Xin Li, et al. AutoSkill: Experience-driven lifelong learning via skill self-evolution. arXiv preprint arXiv:2603.01145, 2026.
- [45] Tianzhu Ye, Li Dong, Xun Wu, Shaohan Huang, and Furu Wei. On-policy context distillation for language models. arXiv preprint arXiv:2602.12275, 2026.
- [46]
- [47]
- [48] Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, and Tatiana Shavrina. Hyperagents. arXiv preprint arXiv:2603.19461, 2026.
- [49] Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, Lei Wang, Weixu Zhang, Wenyue Hua, Haolun Wu, Zhihan Guo, Yufei Wang, Niklas Muennighoff, et al. A survey on test-time scaling in large language models: What, how, where, and how well? arXiv preprint arXiv:2503.24235, 2025.
- [50] Zhen Zhang, Zhichu Ren, Chia-Wei Hsu, Weibin Chen, Zhang-Wei Hong, Chi-Feng Lee, Aubrey Penn, Hongbin Xu, Daniel J Zheng, Shuhan Miao, et al. A multimodal robotic platform for multi-element electrocatalyst discovery. Nature, 647(8089):390–396, 2025.
- [51] Chenyu Zhou, Huacan Chai, Wenteng Chen, Zihan Guo, Rong Shan, Yuanyi Song, Tianyi Xu, Yingxuan Yang, Aofan Yu, Weiming Zhang, et al. Externalization in LLM agents: A unified review of memory, skills, protocols and harness engineering. arXiv preprint arXiv:2604.08224, 2026.
- [52] Brianna Zitkovich, Tianhe Yu, Sichun Xu, Peng Xu, Ted Xiao, Fei Xia, Jialin Wu, Paul Wohlhart, Stefan Welker, Ayzaan Wahid, et al. RT-2: Vision-language-action models transfer web knowledge to robotic control. In CoRL, pages 2165–2183, 2023.