pith. machine review for the scientific record.

arxiv: 2605.02681 · v1 · submitted 2026-05-04 · 💻 cs.CE · cs.AI · cs.GT · econ.TH

Recognition: unknown

The Design and Composition of Structural Causal Decision Processes


Pith reviewed 2026-05-08 02:22 UTC · model grok-4.3

classification 💻 cs.CE cs.AI cs.GT econ.TH
keywords causal modeling scdms structural scdps useful decision models

The pith

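Structural Causal Decision Models (SCDMs) and Structural Causal Decision Processes (SCDPs) are composable causal decision models. SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation, which lets them model memory formation endogenously; they also support variable discounting.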

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This work introduces new ways to model how agents decide in systems where causes and effects are tracked explicitly. SCDMs build on existing causal models by letting decisions be constrained by their causal antecedents and by leaving some root variables open, with no probability distribution or structural equation attached. These models can be composed. SCDPs extend this by making the model recur over time with a discount variable applied to future payoffs. Unlike partially observable Markov decision processes (POMDPs), a common class of AI models that assumes agents form beliefs rationally, these new models can include memory formation itself as part of the causal structure. This fits situations where agents have limited cognitive resources, such as in computing systems, and allows the discounting of future rewards to vary. The authors propose using the framework for policy simulation in the digital economy, mechanism design for information systems, and digital twin modeling of cyberinfrastructure.

Core claim

SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings.
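
To make the contrast in the core claim concrete, here is a minimal Python sketch. It is not the paper's construction: the function names, the `retention` parameter, and the linear mixing rule are all hypothetical, chosen only to show how a memory rule can be written as an ordinary structural equation instead of being fixed to a Bayesian update.

```python
def bayes_update(belief, lik_high, lik_low):
    """Posterior P(state = high) after one observation (the rational benchmark)."""
    numerator = belief * lik_high
    return numerator / (numerator + (1.0 - belief) * lik_low)


def memory_update(memory, lik_high, lik_low, retention, default=0.5):
    """Hypothetical structural equation for memory formation (an assumption).

    retention = 1.0 reproduces the Bayesian posterior; retention = 0.0
    discards the observation and the stored value, falling back to `default`.
    """
    posterior = bayes_update(memory, lik_high, lik_low)
    return retention * posterior + (1.0 - retention) * default


if __name__ == "__main__":
    belief = memory = 0.5
    for _ in range(3):  # three identical signals that favour the high state
        belief = bayes_update(belief, 0.8, 0.3)
        memory = memory_update(memory, 0.8, 0.3, retention=0.6)
    print(f"Bayesian belief: {belief:.3f}, limited memory: {memory:.3f}")
```

In this sketch `retention` is passed in as a plain number; in a model of the kind the claim describes, it would presumably be a decision variable with its own cost, so that how much to remember is itself chosen by the agent.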

Load-bearing premise

That SCDMs have a well-defined and computationally useful property of composability, and that SCDPs can be constructed as recurring SCDMs with a discount variable while maintaining the claimed expressiveness advantages over POMDPs.

Figures

Figures reproduced from arXiv: 2605.02681 by Alan Lujan, Sebastian Benthall.

Figure 1: We extend the taxonomy of causal models from Hammond et al…
Figure 2: Influence diagram for the two-period consumption problem. Circles denote state variables, rectangles…
Figure 3: A model M is composed of M1 ◦ M2. Indirect paths are represented by dotted edges. If all paths from the bridge nodes Y to reward nodes in the second component U2 have a member of Pa𝐷2 ∪ D2 on it, then Y is d-separated from U2 given those nodes. Under that condition the composition is orthomodular or, equivalently, sequential. An indirect path from 𝑌 to 𝑈2 which is not interrupted by Pa𝐷2 ∪ D2 (shown in red…
Figure 4: Influence diagram for the two-period consumption problem with habit formation. The dashed arrow…
Figure 5: The composed consumption and portfolio allocation dynamic problem.
Figure 6: An SCDP with latent state and shocks. The agent observes…
Figure 7: An SCDP in which the agent consumes at 𝑑, updates their beliefs about the world at 𝑞, and chooses how much to remember at 𝑟. The agent experiences joy with consumption 𝑢 and consternation at remembering carefully 𝑣. Their resources vary over time due to transitory shocks 𝜖𝑏 as well as a latent stochastic income process which is not directly observed.
Figure 8: Influence diagram for the consumption-saving problem with stochastic discount factors. The agent…
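
The path condition in the Figure 3 caption can be sketched as a small graph check. The Python below is an illustration under stated assumptions, not the paper's algorithm: the function name `is_sequential` and the edge-list encoding are hypothetical, and the check follows directed paths only, which is a simplification of full d-separation (which also accounts for colliders and paths that run against edge direction).

```python
from collections import deque


def is_sequential(edges, bridges, rewards, blockers):
    """Return True if every directed path from a bridge node to a reward
    node passes through at least one blocker node.

    edges    : iterable of (parent, child) pairs
    bridges  : the bridge nodes Y
    rewards  : the reward nodes U2 of the second component
    blockers : the nodes Pa_D2 together with D2
    """
    blockers = set(blockers)
    rewards = set(rewards)
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    # Breadth-first search that never steps onto a blocker node.
    frontier = deque(node for node in bridges if node not in blockers)
    seen = set(frontier)
    while frontier:
        node = frontier.popleft()
        if node in rewards:
            return False  # found an uninterrupted path to a reward node
        for child in children.get(node, []):
            if child not in blockers and child not in seen:
                seen.add(child)
                frontier.append(child)
    return True


if __name__ == "__main__":
    chain = [("Y", "S2"), ("S2", "D2"), ("D2", "U2")]
    print(is_sequential(chain, ["Y"], ["U2"], ["S2", "D2"]))
    print(is_sequential(chain + [("Y", "U2")], ["Y"], ["U2"], ["S2", "D2"]))
```

The second call adds a direct edge from the bridge node to the reward node, which corresponds to the uninterrupted path that the caption says is shown in red.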
Original abstract

We present two new classes of causal models of decision-making agents. Our approach is motivated by the needs of modeling the economics of computing systems. These systems are composed of subsystems and can exhibit endogenous limits on cognitive resources and value discounting. Structural Causal Decision Models (SCDMs) expand on Structural Causal Influence Models. Like SCIMs, they explicitly represent the causal relationships between model variables and the payoffs of agent decisions. Additionally, agent decisions can be constrained by their causal antecedents, and SCDMs can have open root variables for which no probability distribution or structural equation is given. We show that SCDMs have a well-defined and computationally useful property of composability. Building on SCDMs, we then define a Structural Causal Decision Process (SCDP) as a recurring SCDM with a discount variable. SCDPs benefit from the useful composition properties of SCDMs. Moreover, SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings. SCDPs are also capable of modeling variable discounting, a tool used widely in social scientific modeling. We pose that SCDPs are a useful framework for policy simulation for the digital economy, mechanism design for information systems, and digital twin modeling of cyberinfrastructure.
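
As a rough illustration of what a discount variable changes, the Python sketch below computes a discounted payoff where the discount factor may differ from period to period. The function name and the indexing convention (`discounts[t]` applies between period `t` and period `t + 1`) are assumptions made for this example; the abstract does not specify them.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], discounts: Sequence[float]) -> float:
    """Sum rewards, weighting period t by the product of all discount
    factors that precede it (assumed convention, see lead-in)."""
    if len(discounts) < len(rewards) - 1:
        raise ValueError("need one discount factor between each pair of periods")
    total, weight = 0.0, 1.0
    for t, reward in enumerate(rewards):
        total += weight * reward
        if t < len(discounts):
            weight *= discounts[t]  # carry the weight forward to period t + 1
    return total


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    print(discounted_return(rewards, [0.5, 0.5]))  # constant discounting
    print(discounted_return(rewards, [0.5, 0.2]))  # discount factor varies
```

With a constant factor this reduces to ordinary geometric discounting; letting the factors differ, or depend on other model variables, is the kind of variable discounting the abstract refers to.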

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Structural Causal Decision Models (SCDMs) as extensions of Structural Causal Influence Models (SCIMs), allowing open root variables and constraining agent decisions by causal antecedents. It asserts that SCDMs have a well-defined and computationally useful composability property. Building on SCDMs, it defines Structural Causal Decision Processes (SCDPs) as recurring SCDMs with a discount variable. The central claims are that SCDPs inherit composability, are strictly more expressive than POMDPs by not assuming rational belief formation and enabling endogenous modeling of memory-formation processes, and are suitable for modeling resource-rational agents with variable discounting in applications such as digital economy policy simulation, mechanism design, and digital twin modeling of cyberinfrastructure.

Significance. If the composability property and strict expressiveness over POMDPs are formally established with supporting derivations, this framework could offer a meaningful advance for modeling decision processes in systems with endogenous cognitive limits and non-rational belief updates. It addresses limitations of standard POMDPs for resource-rational agents in dynamic settings and adds flexibility via variable discounting, with potential utility for policy simulation and mechanism design in computing systems.

major comments (2)
  1. Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.
  2. Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.
minor comments (1)
  1. Abstract: The motivation section references 'the economics of computing systems' and 'endogenous limits on cognitive resources' but provides no specific examples of such systems or limits to ground the modeling needs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on the manuscript. We address each major comment point by point below, clarifying the content of the full paper and indicating revisions that will be made to improve the abstract's precision.

Point-by-point responses
  1. Referee: Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.

    Authors: The full manuscript provides the formal definition of the SCDM composition operation, establishes its properties (including associativity and preservation of causal structure under composition) via a theorem, and includes a concrete example of composing SCDMs to model a recurring process with computational benefits for modular analysis. To address the referee's concern that the abstract presents this at too high a level, we will revise the abstract to explicitly reference the definition, theorem, and example. revision: yes

  2. Referee: Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.

    Authors: The full manuscript derives the strict expressiveness result by constructing a specific structural equation for endogenous memory formation that does not rely on rational Bayesian updates (which POMDPs require), provides a formal proof of the expressiveness gap, and gives a concrete example of an SCDP that represents a non-rational memory process in a dynamic setting. We will revise the abstract to briefly indicate this construction and refer readers to the relevant derivation and example in the body of the paper. revision: yes

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on standard assumptions from causal modeling literature plus the new definitions; no free parameters or invented physical entities are introduced.

axioms (2)
  • domain assumption: Causal relationships between model variables and agent payoffs can be explicitly represented in decision models.
    Invoked in the expansion of SCIMs to SCDMs.
  • domain assumption: Agent decisions can be constrained by causal antecedents.
    Stated as an additional feature of SCDMs.
invented entities (2)
  • SCDM (no independent evidence)
    purpose: New class of causal decision models with composability and open roots.
    Defined in the paper as an expansion of SCIMs.
  • SCDP (no independent evidence)
    purpose: Recurring SCDM with discount variable for dynamic modeling.
    Defined in the paper as building on SCDMs.

pith-pipeline@v0.9.0 · 5543 in / 1480 out tokens · 65864 ms · 2026-05-08T02:22:46.973033+00:00 · methodology


Reference graph

Works this paper leans on

57 extracted references · 9 canonical work pages

  1. [1]

    Akash Agrawal, Joel Dyer, Aldo Glielmo, and Michael J Wooldridge. 2025. Robust policy design in agent-based simulators using adversarial reinforcement learning. In The First MARW: Multi-Agent AI in the Real World Workshop at AAAI 2025

  2. [2]

    John R Anderson. 1991. Is human cognition adaptive? Behavioral and brain sciences 14, 3 (1991), 471–485

  3. [3]

    David I August, Sharad Malik, Li-Shiuan Peh, Vijay Pai, Manish Vachharajani, and Paul Willmann. 2005. Achieving structural and composable modeling of complex systems. International Journal of Parallel Programming 33, 2 (2005), 81–101

  4. [4]

    Robert Axelrod. 2006. Agent-based modeling as a bridge between disciplines. Handbook of computational economics 2 (2006), 1565–1584

  5. [5]

    Robert L Axtell and J Doyne Farmer. 2025. Agent-based modeling in economics and finance: Past, present, and future. Journal of Economic Literature 63, 1 (2025), 197–287

  6. [6]

    Osman Balci, James D Arthur, and William F Ormsby. 2011. Achieving reusability and composability with a simulation conceptual model. Journal of Simulation 5, 3 (2011), 157–165

  7. [7]

    Sebastian Benthall. 2019. Situated information flow theory. In Proceedings of the 6th Annual Symposium on Hot Topics in the Science of Security. 1–10

  8. [8]

    Lawrence Blume. 2015. Agent-based models for policy analysis. In Assessing the Use of Agent-Based Models for Tobacco Regulation. National Academies Press (US)

  9. [9]

    András Borsos, Adrian Carro, Aldo Glielmo, Marc Hinterschweiger, Jagoda Kaszowska-Mojsa, and Arzu Uluc. 2025. Agent-Based Modelling at Central Banks: Recent Developments and New Challenges. (2025)

  10. [10]

    Craig Boutilier, Richard Dearden, and Moisés Goldszmidt. 2000. Stochastic dynamic programming with factored representations. Artificial intelligence 121, 1-2 (2000), 49–107

  11. [11]

    Dan Cao. 2020. Recursive equilibrium in Krusell and Smith (1998). Journal of Economic Theory 186 (2020), 104978

  12. [12]

    Micah Carroll, Alan Chan, Henry Ashton, and David Krueger. 2023. Characterizing manipulation from AI systems. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization. 1–13

  13. [13]

    Nick Chater and Mike Oaksford. 1999. Ten years of the rational analysis of cognition. Trends in cognitive sciences 3, 2 (1999), 57–65

  14. [14]

    Elliot Creager, David Madras, Toniann Pitassi, and Richard Zemel. 2020. Causal Modeling for Fairness in Dynamical Systems. arXiv:1909.09141 [cs.LG] https://arxiv.org/abs/1909.09141

  15. [15]

    Thomas G Dietterich. 2000. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of artificial intelligence research 13 (2000), 227–303

  16. [16]

    Joshua M Epstein and Robert Axtell. 1996. Growing artificial societies: social science from the bottom up. Brookings Institution Press

  17. [17]

    Tom Everitt, Ryan Carey, Eric D Langlois, Pedro A Ortega, and Shane Legg. 2021. Agent incentives: A causal perspective. In Proceedings of the AAAI conference on artificial intelligence, Vol. 35. 11487–11495

  18. [18]

    Tom Everitt, Marcus Hutter, Ramana Kumar, and Victoria Krakovna. 2021. Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective. Synthese 198, Suppl 27 (2021), 6435–6467

  19. [19]

    Irving Fisher. 1930. The Theory of Interest. MacMillan, New York

  20. [20]

    James Fox, Tom Everitt, Ryan Carey, Eric D Langlois, Alessandro Abate, and Michael J Wooldridge. 2021. PyCID: A Python Library for Causal Influence Diagrams. In SciPy. 65–73

  21. [21]

    Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, and Michael Wooldridge. 2021. Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice. arXiv preprint arXiv:2102.05008 (2021)

  22. [22]

    Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, and Michael Wooldridge. 2023. Reasoning about causality in games. Artificial Intelligence 320 (2023), 103919

  23. [23]

    Christopher Harris and David Laibson. 2001. Dynamic choices of hyperbolic consumers. Econometrica 69, 4 (2001), 935–957

  24. [24]

    Stephen Kasputis and Henry C Ng. 2000. Composable simulations. In 2000 Winter Simulation Conference Proceedings (Cat. No. 00CH37165), Vol. 2. IEEE, 1577–1584

  25. [25]

    Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, and Tom Everitt. 2023. Discovering agents. Artificial Intelligence 322 (2023), 103963

  26. [26]

    Daphne Koller and Brian Milch. 2003. Multi-agent influence diagrams for representing and solving games. Games and economic behavior 45, 1 (2003), 181–221

  27. [27]

    Vikram Krishnamurthy. 2016. Partially observed Markov decision processes. Cambridge university press

  28. [28]

    Per Krusell, Burhanettin Kuruşçu, and Anthony A Smith. 2002. Equilibrium welfare and government policy with quasi-geometric discounting. Journal of Economic Theory 105, 1 (2002), 42–72

  29. [29]

    Per Krusell and Anthony A Smith. 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106, 5 (1998), 867–896

  30. [30]

    Per Krusell and Anthony A Smith. 2003. Consumption–savings decisions with quasi–geometric discounting. Econometrica 71, 1 (2003), 365–375

  31. [31]

    David Laibson. 1997. Golden eggs and hyperbolic discounting. The Quarterly Journal of Economics 112, 2 (1997), 443–478

  32. [32]

    Sydney Levine, Matija Franklin, Tan Zhi-Xuan, Secil Yanik Guyot, Lionel Wong, Daniel Kilov, Yejin Choi, Joshua B Tenenbaum, Noah Goodman, Seth Lazar, et al. 2025. Resource Rational Contractualism Should Guide AI Alignment. arXiv preprint arXiv:2506.17434 (2025)

  33. [33]

    Falk Lieder and Thomas L Griffiths. 2020. Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources. Behavioral and brain sciences 43 (2020), e1

  34. [34]

    Alan Lujan. 2026. EGM 𝑛: The Sequential Endogenous Grid Method. (2026). Working Paper

  35. [35]

    Lilia Maliar, Serguei Maliar, and Pablo Winant. 2021. Deep learning for solving dynamic economic models. Journal of Monetary Economics 122 (2021), 76–101

  36. [36]

    Shie Mannor, Ishai Menache, Amit Hoze, and Uri Klein. 2004. Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the twenty-first international conference on Machine learning. 71

  37. [37]

    Vishwali Mhasawade and Rumi Chunara. 2021. Causal Multi-Level Fairness. arXiv:2010.07343 [cs.LG] https://arxiv.org/abs/2010.07343

  38. [38]

    Richard E Neapolitan et al. 2004. Learning bayesian networks. Vol. 38. Pearson Prentice Hall Upper Saddle River, NJ

  39. [39]

    Cyrus Neary and Ufuk Topcu. 2023. Compositional learning of dynamical system models using port-Hamiltonian neural networks. In Learning for Dynamics and Control Conference. PMLR, 679–691

  40. [40]

    Argentina Ortega, Samuel Parra, Sven Schneider, and Nico Hochgeschwender. 2024. Composable and executable scenarios for simulation-based testing of mobile robots. Frontiers in Robotics and AI 11 (2024), 1363281

  41. [41]

    Christiaan JJ Paredis, Antonio Diaz-Calderon, Rajarishi Sinha, and Pradeep K Khosla. 2001. Composable models for simulation-based design. Engineering with Computers 17, 2 (2001), 112–128

  42. [42]

    Judea Pearl. 1994. A probabilistic calculus of actions. In Uncertainty in artificial intelligence. Elsevier, 454–462

  43. [43]

    Judea Pearl. 2009. Causality. Cambridge university press

  44. [44]

    Jonathan Richens and Tom Everitt. 2024. Robust agents learn causal world models. arXiv preprint arXiv:2402.10877 (2024)

  45. [45]

    Atharva Sehgal, Arya Grayeli, Jennifer J Sun, and Swarat Chaudhuri. 2023. Neurosymbolic grounding for compositional world models. arXiv preprint arXiv:2310.12690 (2023)

  46. [46]

    Ross D Shachter. 1986. Evaluating influence diagrams. Operations research 34, 6 (1986), 871–882

  47. [47]

    John Stachurski and Junnan Zhang. 2021. Dynamic programming with state-dependent discounting. Journal of Economic Theory 192 (2021), 105190

  48. [48]

    Robert H Strotz. 1956. Myopia and inconsistency in dynamic utility maximization. The Review of Economic Studies 23, 3 (1956), 165–180

  49. [49]

    Chris Van Merwijk, Ryan Carey, and Tom Everitt. 2022. A complete criterion for value of information in soluble influence diagrams. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 10034–10041

  50. [50]

    Pieter van Schalkwyk and Dan Isaacs. 2023. Achieving scale through composable and lean digital twins. In The Digital Twin. Springer, 153–180

  51. [51]

    Thomas Verma and Judea Pearl. 1988. Influence Diagrams and D-separation. University of California (Los Angeles). Computer Science Department

  52. [52]

    Neal Wagner. 2024. Comparing the complexity and efficiency of composable modeling techniques for multi-scale and multi-domain complex system modeling and simulation applications: a probabilistic analysis. Systems 12, 3 (2024), 96

  53. [53]

    Sifan Wang, Shyam Sankaran, and Paris Perdikaris. 2022. Respecting causality is all you need for training physics-informed neural networks. arXiv preprint arXiv:2203.07404 (2022)

  54. [54]

    Xinyue Wang and Biwei Huang. 2025. Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning. arXiv preprint arXiv:2505.08361 (2025)

  55. [55]

    Hongxin Zhang, Zeyuan Wang, Qiushi Lyu, Zheyuan Zhang, Sunli Chen, Tianmin Shu, Behzad Dariush, Kwonjoon Lee, Yilun Du, and Chuang Gan. 2024. COMBO: compositional world models for embodied multi-agent cooperation. arXiv preprint arXiv:2404.10775 (2024)

  56. [56]

    Tan Zhi-Xuan, Micah Carroll, Matija Franklin, and Hal Ashton. 2025. Beyond Preferences in AI Alignment. Philosophical Studies 182, 7 (2025), 1813–1863

  57. [57]

    Feng Zhu, Yiping Yao, Jin Li, and Wenjie Tang. 2019. Reusability and composability analysis for an agent-based hierarchical modelling and simulation framework. Simulation Modelling Practice and Theory 90 (2019), 81–97

A Reference

A.1 Structural Causal Games

A Bayesian network is a graphical model of a joint distribution over random variables. Definition 15...