An Introduction to Sequential Monte Carlo Methods (2001)
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents
  Agentic Monte Carlo enables RL-style optimization of black-box LLM agents by sampling from the optimal policy posterior using Sequential Monte Carlo.
- TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation
  TreeCoder improves LLM code generation accuracy by representing decoding as an optimizable tree search over programs with first-class constraints for syntax, style, and execution, outperforming baselines on MBPP and SQL-Spider.