Recognition: 2 theorem links
Cognitive Architectures for Language Agents
Pith reviewed 2026-05-16 19:30 UTC · model grok-4.3
The pith
CoALA structures language agents with modular memory components, a structured action space, and a generalized decision-making process drawn from cognitive science.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. The authors use CoALA to retrospectively survey and organize a large body of recent work, and to prospectively identify actionable directions toward more capable agents.
What carries the argument
CoALA, the proposed architecture that equips language agents with modular memory components, a structured action space for internal and external interactions, and a generalized decision-making process.
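The modular layout this description refers to can be sketched as plain data. The module and action names below follow the paper's taxonomy (working, episodic, semantic, and procedural memory; grounding, retrieval, reasoning, and learning actions); the Python types and field contents are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass, field

# Illustrative layout of CoALA's memory modules and action space.
# Module names follow the paper's taxonomy; the types are assumptions.

@dataclass
class MemoryModules:
    working: dict = field(default_factory=dict)     # current cycle's variables
    episodic: list = field(default_factory=list)    # past experiences/trajectories
    semantic: dict = field(default_factory=dict)    # facts about the world
    procedural: dict = field(default_factory=dict)  # skills: prompts, code, weights

# The structured action space splits into external and internal actions.
EXTERNAL_ACTIONS = {"grounding"}                        # act on the environment
INTERNAL_ACTIONS = {"retrieval", "reasoning", "learning"}  # act on memory

mem = MemoryModules()
mem.semantic["capital_of_france"] = "Paris"  # a learning action writes here
```

The point of the split is that reads and writes to each long-term store are themselves actions the decision procedure can choose, rather than fixed steps in a prompt chain.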
Load-bearing premise
That principles from cognitive science and symbolic AI transfer directly to LLM-based agents without losing the flexibility and broad capabilities that make the models effective.
What would settle it
A controlled experiment in which agents built without modular memory or structured actions match or exceed the performance and generalization of CoALA-based agents on multi-step reasoning and tool-use benchmarks.
read the original abstract
Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA). CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. We use CoALA to retrospectively survey and organize a large body of recent work, and prospectively identify actionable directions towards more capable agents. Taken together, CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Cognitive Architectures for Language Agents (CoALA), a descriptive framework for LLM-based agents that incorporates modular memory components, a structured action space for interactions with internal memory and external environments, and a generalized decision-making process. Drawing from cognitive science and symbolic AI, CoALA is applied retrospectively to organize a survey of recent language agent work and prospectively to suggest directions for more capable agents, with the goal of contextualizing current systems within broader AI history.
Significance. If the framework holds as an organizational tool, it offers a useful bridge between empirical language agent successes and historical cognitive/symbolic principles, potentially aiding systematic planning of future developments without introducing fitted parameters or self-referential derivations. The conceptual mapping of existing systems strengthens its value as a retrospective lens, though its prospective utility remains to be tested empirically.
minor comments (2)
- In the survey section, the mapping of specific agents (e.g., those using external tools) to CoALA modules could include a table summarizing coverage to make the retrospective organization more explicit and verifiable.
- The description of the generalized decision-making process in §3 would benefit from a short pseudocode example to clarify how it differs from standard prompt chaining without relying solely on prose.
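A minimal runnable sketch of the kind of pseudocode the second comment asks for, following the decision cycle the paper describes (a planning stage that proposes, evaluates, and selects among candidate actions, then an execution stage). All class, method, and variable names here are illustrative stand-ins, not from any codebase associated with the paper.

```python
class ToyAgent:
    """Minimal sketch of CoALA's generalized decision cycle (names illustrative)."""

    def __init__(self):
        self.working_memory = []    # short-lived, per-cycle state
        self.long_term_memory = {}  # stands in for episodic/semantic stores

    # --- internal actions used during the planning stage ---
    def propose(self):
        """Reasoning action: enumerate candidate actions from working memory."""
        last = self.working_memory[-1]
        return [("retrieval", last), ("grounding", last.upper())]

    def evaluate(self, action):
        """Score a candidate; here, prefer retrieval when memory has a hit."""
        kind, arg = action
        return 1.0 if kind == "retrieval" and arg in self.long_term_memory else 0.5

    # --- one plan-then-execute cycle ---
    def decision_cycle(self, observation, environment):
        self.working_memory.append(observation)
        candidates = self.propose()
        action = max(candidates, key=self.evaluate)  # select
        kind, arg = action
        if kind == "grounding":                      # external action
            result = environment(arg)
        else:                                        # retrieval action
            result = self.long_term_memory.get(arg, "miss")
        self.working_memory.append(result)
        return kind, result

agent = ToyAgent()
agent.long_term_memory["ping"] = "pong"
kind, result = agent.decision_cycle("ping", environment=lambda a: f"env({a})")
# retrieval wins here because "ping" has a long-term memory hit
```

Unlike standard prompt chaining, the sequence of steps is not fixed in advance: each cycle re-selects among internal (memory) and external (environment) actions based on the current working-memory state.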
Simulated Author's Rebuttal
We thank the referee for their positive review and recommendation to accept the manuscript. Their summary accurately captures the goals, structure, and contributions of CoALA as a framework for organizing language agents.
Circularity Check
No significant circularity: CoALA is a conceptual framework proposal
full rationale
The paper proposes CoALA as a descriptive architecture drawing on external cognitive science and symbolic AI literature to organize language agents with modular memory, structured actions, and decision-making. No equations, fitted parameters, or predictions are defined; the central claims are the retrospective organization of prior work and prospective suggestions, with no self-referential reductions or load-bearing self-citations that collapse the framework into its own inputs. The framework is grounded in external benchmarks from cognitive science rather than in its own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Cognitive science and symbolic AI supply reusable principles for agent design
invented entities (1)
- CoALA architecture (no independent evidence)
Lean theorems connected to this paper
- LogicAsFunctionalEquation.laws_of_logic_imply_dalembert_hypotheses (tagged: unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "We draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA)."
- HierarchyEmergence.hierarchy_emergence_forces_phi (tagged: unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 18 Pith papers
- Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
  Prompt injection attacks can self-replicate across LLM agents in multi-agent systems, enabling data theft, misinformation, and system disruption while propagating silently.
- The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
  An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.
- More Is Not Always Better: Cross-Component Interference in LLM Agent Scaffolding
  Full factorial testing of five LLM agent components reveals that the complete 'All-In' combination is consistently outperformed by smaller subsets due to cross-component interference, with optimal subsets being task- ...
- OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory
  OCR-Memory encodes agent trajectories as images with visual anchors and retrieves verbatim text via locate-and-transcribe, yielding gains on long-horizon benchmarks under strict context limits.
- The Missing Knowledge Layer in Cognitive Architectures for AI Agents
  Cognitive architectures for AI agents require a distinct Knowledge layer with indefinite supersession persistence, separate from Memory decay, Wisdom evidence-gating, and Intelligence ephemerality.
- ClawVM: Harness-Managed Virtual Memory for Stateful Tool-Using LLM Agents
  ClawVM introduces a harness-managed virtual memory system for LLM agents that ensures deterministic residency and durability of state under token budgets by using typed pages and validated writeback.
- ROZA Graphs: Self-Improving Near-Deterministic RAG through Evidence-Centric Feedback
  ROZA graphs enable self-improving RAG by storing evidence-specific reasoning chains, yielding up to 10.6pp accuracy gains and 46% lower cost through graph traversal feedback.
- MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration
  MatClaw is a code-first LLM agent that autonomously executes end-to-end materials workflows by generating and running Python scripts on remote clusters, achieving reliable code generation via memory architecture and R...
- $\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
  τ-bench shows state-of-the-art agents like GPT-4o succeed on under 50% of tool-using, rule-following tasks and are inconsistent across repeated trials.
- How to Interpret Agent Behavior
  ACT*ONOMY is a Grounded-Theory-derived hierarchical taxonomy and open repository that enables systematic comparison and characterization of autonomous agent behavior across trajectories.
- Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents
  Memanto delivers 89.8% and 87.1% accuracy on LongMemEval and LoCoMo benchmarks using typed semantic memory and information-theoretic retrieval, outperforming hybrid graph and vector systems with a single query and zer...
- OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
  OS-Atlas, trained on the largest open-source cross-platform GUI grounding corpus of 13 million elements, outperforms prior open-source models on six benchmarks across mobile, desktop, and web platforms.
- A Roadmap to Pluralistic Alignment
  The paper formalizes three types of pluralistic AI models and three benchmark classes, arguing that current alignment techniques may reduce rather than increase distributional pluralism.
- Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
  Grep retrieval generally outperforms vector retrieval in agentic search tasks, with performance varying strongly by agent harness and tool-calling style.
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges
  The paper surveys LLM-based multi-agent systems, covering simulated domains, agent profiling and communication, mechanisms for capacity growth, and common benchmarks.
- The Rise and Potential of Large Language Model Based Agents: A Survey
  The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
  The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
  This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.
discussion (0)