Nau and Paolo Traverso, title =
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Process Reward Models Meet Planning: Generating Precise and Scalable Datasets for Step-Level Rewards
  PDDL planning problems are used to generate about one million precise reasoning steps for training Process Reward Models. Adding this data to existing datasets improves LLM performance on both mathematical and non-mathematical reasoning benchmarks.
- Counterfactual Reasoning in Automated Planning
  A survey categorizes existing work on counterfactual reasoning in automated planning by changed elements, timing of reasoning, reasons for changes, and methods used.