Large language models as commonsense knowledge for large-scale task planning
5 Pith papers cite this work.
citing papers explorer
- Grounded Iterative Language Planning: How Parameterized World Models Reduce Hallucination Propagation in LLM Agents
  GILP combines a small parameterized world model with LLM agent reasoning via a consistency gate, reducing hallucinated-state rate from 0.176 to 0.035 and raising success from 0.668 to 0.838 on graph planning benchmarks.
- Efficient Causal Graph Discovery Using Large Language Models
  BFS-based LLM framework reduces causal graph discovery queries from quadratic to linear while incorporating observational data and reporting state-of-the-art results on real graphs.
- Your Model Diversity, Not Method, Determines Reasoning Strategy
  The optimal reasoning strategy for LLMs depends on the model's diversity profile rather than the exploration method itself.
- Towards Robust Surgical Automation via Digital Twin Representations from Foundation Models
  Digital twin representations from vision foundation models enable LLM-based planning for robust peg transfer and gauze retrieval on the dVRK surgical platform with claimed generalizability.
- Understanding the planning of LLM agents: A survey
  A survey that provides a taxonomy of methods for improving planning in LLM-based agents across task decomposition, plan selection, external modules, reflection, and memory.