arxiv: 1907.08352 · v1 · pith:UMYV34BHnew · submitted 2019-07-19 · 💻 cs.AI

Representation Learning for Classical Planning from Partially Observed Traces

Pith reviewed 2026-05-24 19:41 UTC · model grok-4.3

classification 💻 cs.AI
keywords: classical planning, domain model learning, graph neural networks, representation learning, partially observed traces, heuristic learning, model-based planning, vectorized models

The pith

A graph neural network learns vectorized domain models from partial traces that solve more real planning problems than declarative models from ARMS.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes LP-GNN, a framework that learns planning domain models in vectorized form from partially observed traces by embedding propositions and actions into a graph processed by a neural network. This embedding is used to uncover latent relationships that become domain-specific heuristics, combining model-free learning with model-based planning. The approach avoids the need for complete declarative specifications in languages like STRIPS or PDDL, which are sensitive to inaccuracies. A sympathetic reader would care because manually writing full domain models remains a major bottleneck for applying classical planning in real-world settings. On five classical domains the learned models prove more effective at solving actual planning instances than those produced by the ARMS learner.

Core claim

The authors claim that embedding propositions and actions in a graph within the LP-GNN framework allows the exploration of latent relationships to form domain-specific heuristics; the resulting vectorized domain models integrate model-free learning and model-based planning and are much more effective on solving real planning problems than the declarative models output by ARMS.

What carries the argument

LP-GNN, a graph neural network that embeds propositions and actions in a graph to derive domain-specific heuristics from partial traces.
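To make the graph-embedding idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a trace is a list of (observed propositions, action) steps, links each action to the propositions observed immediately before and after it, and runs one round of unweighted mean-neighbour aggregation. The trace contents, the one-hot initial features, and the names `build_graph` and `aggregate` are all hypothetical; the summary above does not specify the paper's actual graph construction or update rule.

```python
from collections import defaultdict

# Hypothetical partially observed trace: (observed propositions, action) steps.
# Propositions that were not observed are simply absent from each set.
trace = [
    ({"clear(a)", "ontable(a)"}, "pickup(a)"),
    ({"holding(a)"}, "stack(a,b)"),
    ({"on(a,b)"}, None),  # final state, no action follows
]

def build_graph(trace):
    """Link each action to the propositions observed just before and just after it."""
    edges = defaultdict(set)
    for (before, action), (after, _) in zip(trace, trace[1:]):
        for prop in before | after:
            edges[action].add(prop)
            edges[prop].add(action)
    return edges

def aggregate(edges, features):
    """One round of mean-neighbour aggregation (no learned weights, illustration only)."""
    updated = {}
    for node, neighbours in edges.items():
        dim = len(features[node])
        mean = [sum(features[n][i] for n in neighbours) / len(neighbours)
                for i in range(dim)]
        # Average the node's own feature with the mean of its neighbours.
        updated[node] = [(own + m) / 2 for own, m in zip(features[node], mean)]
    return updated

graph = build_graph(trace)
nodes = sorted(graph)
# One-hot initial features: an arbitrary choice made for this sketch.
features = {n: [1.0 if i == j else 0.0 for j in range(len(nodes))]
            for i, n in enumerate(nodes)}
embeddings = aggregate(graph, features)
```

A trained GNN would replace the fixed averaging with learned transformations; the sketch only shows how propositions and actions can share one graph so that information flows between them.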

If this is right

  • Domain models no longer need to be expressed in exact declarative languages to be usable by planners.
  • Planning tasks become solvable from incomplete observation traces without full model specification.
  • Heuristics arise automatically from the graph structure rather than from hand-crafted rules.
  • The same learned representation can be applied across multiple planning instances in a domain.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph-embedding idea could be tested on planning domains with greater partial observability to measure robustness limits.
  • The learned vector representations might transfer to related sequential decision tasks such as automated verification or scheduling.
  • Combining the LP-GNN heuristics with modern heuristic-search planners not used in the original experiments could produce further gains.

Load-bearing premise

Embedding propositions and actions in a graph lets the neural network discover relationships that produce heuristics a planner can actually use.

What would settle it

Run the learned LP-GNN models and the ARMS models on the same set of planning instances from the five test domains and count the problems each solves; if LP-GNN solves the same number of instances or fewer, the effectiveness claim is falsified.
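The test described here amounts to a paired count over a shared instance set. The sketch below assumes each learner's outcome is recorded as a mapping from instance name to a solved flag; the instance names and outcomes are made-up placeholders, not results from the paper, and `count_solved` and `effectiveness_claim_holds` are hypothetical helper names.

```python
def count_solved(results):
    """Number of instances marked as solved."""
    return sum(1 for solved in results.values() if solved)

def effectiveness_claim_holds(lp_gnn, arms):
    """True only if LP-GNN solves strictly more of the shared instances than ARMS."""
    if lp_gnn.keys() != arms.keys():
        raise ValueError("both learners must be run on the same instances")
    return count_solved(lp_gnn) > count_solved(arms)

# Placeholder outcomes for illustration only (not data from the paper).
lp_gnn_results = {"p01": True, "p02": True, "p03": False}
arms_results = {"p01": True, "p02": False, "p03": False}

claim_survives = effectiveness_claim_holds(lp_gnn_results, arms_results)
```

A raw count ignores run-to-run variance; a fuller check would repeat the comparison per domain and per observation percentage.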

Figures

Figures reproduced from arXiv: 1907.08352 by Hai Wan, Hankui Hankz Zhuo, Jinxia Lin, Yanan Liu, Zhanhao Xiao.

Figure 1: An Overview of LP-GNN. … a training set for an action selection network. The action selection network is trained to return the actions executed in the former state in every pair, which are considered as appropriate actions towards the latter state. The heuristic function is obtained via computing the distances to the appropriate actions. Then, the heuristic function learned helps to choose a suitable action t…
Figure 2: Comparisons on instances solved with various observation percentages. LP-GNN is our approach and LP-GNN-SVM and LP-GNN-RF are our approaches with replacing action selection MLP by SVM and Random Forest. Instances solved are the testing instances which are solved under the original domain model by the plans computed according to the learned domain model. … by other states. So, we only focus on the true prop…
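The Figure 1 caption says the heuristic is obtained by computing distances to the appropriate actions. One way to read that, offered as an assumption and not as the paper's definition, is to score each applicable action by the Euclidean distance between its embedding and the vector predicted by the action selection network, then pick the closest. The embeddings, the predicted vector, and the function names below are all hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_action(predicted, action_embeddings, applicable):
    """Pick the applicable action whose embedding lies closest to the prediction."""
    return min(applicable,
               key=lambda name: distance(action_embeddings[name], predicted))

# Made-up 2-D embeddings, for illustration only.
action_embeddings = {
    "pickup(a)": [1.0, 0.0],
    "putdown(a)": [0.0, 1.0],
    "stack(a,b)": [0.7, 0.7],
}
# Stand-in for the output of a trained action selection network.
predicted = [0.9, 0.2]

chosen = select_action(predicted, action_embeddings, ["pickup(a)", "stack(a,b)"])
```

Restricting the search to `applicable` reflects the model-based half of the approach: the learned scores only rank actions that are candidates in the current state.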
Original abstract

Specifying a complete domain model is time-consuming, which has been a bottleneck of AI planning technique application in many real-world scenarios. Most classical domain-model learning approaches output a domain model in the form of the declarative planning language, such as STRIPS or PDDL, and solve new planning instances by invoking an existing planner. However, planning in such a representation is sensitive to the accuracy of the learned domain model which probably cannot be used to solve real planning problems. In this paper, to represent domain models in a vectorization representation way, we propose a novel framework based on graph neural network (GNN) integrating model-free learning and model-based planning, called LP-GNN. By embedding propositions and actions in a graph, the latent relationship between them is explored to form a domain-specific heuristics. We evaluate our approach on five classical planning domains, comparing with the classical domain-model learner ARMS. The experimental results show that the domain models learned by our approach are much more effective on solving real planning problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes LP-GNN, a graph neural network framework for learning vectorized domain models from partially observed planning traces. Propositions and actions are embedded in a graph to explore latent relationships that form domain-specific heuristics. The approach integrates model-free learning with model-based planning and is evaluated on five classical planning domains, where it is claimed to produce domain models that are much more effective for solving real planning problems than those learned by ARMS.

Significance. If the experimental claims hold, the work could advance domain model learning by providing a vectorized representation that is potentially more robust to incomplete traces than declarative models. The GNN-based integration of representation learning and planning offers a novel direction for handling real-world scenarios where full domain models are hard to specify.

major comments (2)
  1. [Abstract] The central claim that 'the domain models learned by our approach are much more effective on solving real planning problems' is asserted without reported metrics, statistical tests, or error analysis, and without naming the five domains or the planner used. These details are load-bearing for evaluating the comparison to ARMS.
  2. [Abstract] The core mechanism (that embedding propositions and actions in a graph lets the GNN discover latent relationships forming usable domain-specific heuristics) is stated without justification that standard GNN aggregation preserves or enforces planning semantics such as consistent preconditions/effects or valid state transitions from the traces. This assumption underpins the claim that the vectorized model can be integrated into model-based planning.
minor comments (1)
  1. [Abstract] The abstract refers to 'five classical planning domains' without naming them or indicating the performance metrics (e.g., success rate, plan quality) used in the comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments on the abstract. Both points identify areas where the abstract can be strengthened to better support the paper's claims. We will revise the abstract in the next version and provide more context below.

Point-by-point responses
  1. Referee: [Abstract] The central claim that 'the domain models learned by our approach are much more effective on solving real planning problems' is asserted without any reported metrics, statistical tests, error analysis, or even the names of the five domains and the specific planner used, which is load-bearing for evaluating the comparison to ARMS.

    Authors: We agree that the abstract is too high-level and does not include the concrete details needed to evaluate the central claim. The body of the manuscript reports results across five domains with comparisons to ARMS using a standard planner, including success rates and plan quality metrics. In the revision we will expand the abstract to name the domains, specify the planner, and summarize the key quantitative improvements along with the evaluation protocol.
    Revision: yes

  2. Referee: [Abstract] The core mechanism (that embedding propositions and actions in a graph lets the GNN discover latent relationships forming usable domain-specific heuristics) is stated without justification that standard GNN aggregation preserves or enforces planning semantics such as consistent preconditions/effects or valid state transitions from the traces; this assumption underpins the claim that the vectorized model can be integrated into model-based planning.

    Authors: The abstract is constrained by length, but the manuscript justifies the mechanism in the model section by describing how the graph is constructed from the traces (with nodes and edges encoding propositions, actions, and observed transitions) and how the training objective encourages the learned embeddings to produce heuristics that respect preconditions and effects. We will add a short clarifying clause to the abstract to indicate that the GNN is trained end-to-end to produce representations compatible with model-based planning.
    Revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper presents LP-GNN as a GNN-based embedding of propositions and actions from traces to produce vectorized domain models for planning. No equations, self-citations, or fitted parameters are shown reducing the central claim (superior effectiveness on real problems vs ARMS) to a definition or input by construction. The approach applies standard GNN techniques without renaming known results or smuggling ansatzes; the integration of model-free learning with model-based planning remains an independent empirical claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests primarily on the domain assumption that GNN graph embeddings of propositions and actions will yield usable planning heuristics; no free parameters, invented physical entities, or additional axioms are described in the abstract.

axioms (1)
  • domain assumption: Embedding propositions and actions in a graph allows the GNN to explore latent relationships that form effective domain-specific heuristics.
    Directly stated in the abstract as the mechanism enabling the framework.
invented entities (1)
  • LP-GNN (no independent evidence)
    purpose: Framework that learns vectorized domain models, integrating model-free learning and model-based planning.
    New method introduced by the paper; no independent evidence provided.

pith-pipeline@v0.9.0 · 5711 in / 1377 out tokens · 61927 ms · 2026-05-24T19:41:20.518601+00:00 · methodology
