The Design and Composition of Structural Causal Decision Processes
Pith reviewed 2026-05-08 02:22 UTC · model grok-4.3
The pith
SCDMs are composable causal decision models, and the SCDPs built from them are claimed to be strictly more expressive than POMDPs because they allow endogenous memory formation and variable discounting without assuming rational belief formation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings.
Load-bearing premise
That SCDMs have a well-defined and computationally useful property of composability, and that SCDPs can be constructed as recurring SCDMs with a discount variable while maintaining the claimed expressiveness advantages over POMDPs.
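To make the phrase "recurring SCDMs with a discount variable" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's definition: the names `rollout` and `one_period`, the variables `wealth`, `utility`, and `discount`, and the consumption rule are all hypothetical. The only idea taken from the text is that the discount factor is itself a model variable, so it can change from period to period.

```python
from typing import Callable, Dict

State = Dict[str, float]


def rollout(step_fn: Callable[[State], State], state: State, periods: int) -> float:
    """Unroll a one-period model and accumulate discounted utility.

    Assumption: each call to `step_fn` returns the next state, including a
    "utility" entry (the period payoff) and a "discount" entry (the factor
    applied to every later payoff).
    """
    total = 0.0
    weight = 1.0  # product of the discount variables realised so far
    for _ in range(periods):
        state = step_fn(state)
        total += weight * state["utility"]
        weight *= state["discount"]
    return total


def one_period(state: State) -> State:
    """Hypothetical structural equations for a single period."""
    wealth = state["wealth"]
    consumption = 0.5 * wealth  # illustrative fixed policy
    return {
        "wealth": wealth - consumption,
        "utility": consumption,
        # Variable discounting: the factor depends on the incoming state.
        "discount": 0.9 if wealth >= 1.0 else 0.5,
    }


if __name__ == "__main__":
    print(rollout(one_period, {"wealth": 4.0}, periods=4))
```

With a constant `discount` entry this reduces to ordinary geometric discounting; letting the entry depend on the state is one simple way to picture the variable discounting the paper refers to.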
Original abstract
We present two new classes of causal models of decision-making agents. Our approach is motivated by the needs of modeling the economics of computing systems. These systems are composed of subsystems and can exhibit endogenous limits on cognitive resources and value discounting. Structural Causal Decision Models (SCDMs) expand on Structural Causal Influence Models. Like SCIMs, they explicitly represent the causal relationships between model variables and the payoffs of agent decisions. Additionally, agent decisions can be constrained by their causal antecedents, and SCDMs can have open root variables for which no probability distribution or structural equation is given. We show that SCDMs have a well-defined and computationally useful property of composability. Building on SCDMs, we then define a Structural Causal Decision Process (SCDP) as a recurring SCDM with a discount variable. SCDPs benefit from the useful composition properties of SCDMs. Moreover, SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings. SCDPs are also capable of modeling variable discounting, a tool used widely in social scientific modeling. We pose that SCDPs are a useful framework for policy simulation for the digital economy, mechanism design for information systems, and digital twin modeling of cyberinfrastructure.
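The abstract's notions of open root variables and composability can be pictured with a small Python sketch. This page does not reproduce the paper's formal composition operation, so everything below is an assumption made for illustration: the `Model` class, the `compose` and `evaluate` functions, and the `load`, `latency`, and `price` variables are hypothetical. The sketch treats composition as merging two sets of structural equations, so that an open root of one model is closed by an equation from the other.

```python
from typing import Callable, Dict, Set, Tuple

# variable name -> (parent names, function of the parent values in that order)
Equations = Dict[str, Tuple[Tuple[str, ...], Callable[..., float]]]


class Model:
    """Toy structural model; `open_roots` have no equation or distribution."""

    def __init__(self, equations: Equations, open_roots: Set[str]):
        self.equations = dict(equations)
        self.open_roots = set(open_roots)


def compose(a: Model, b: Model) -> Model:
    """Merge two models; open roots defined by the other model become closed."""
    clash = set(a.equations) & set(b.equations)
    if clash:
        raise ValueError(f"variables defined twice: {sorted(clash)}")
    equations = {**a.equations, **b.equations}
    open_roots = (a.open_roots | b.open_roots) - set(equations)
    return Model(equations, open_roots)


def evaluate(model: Model, inputs: Dict[str, float]) -> Dict[str, float]:
    """Compute every variable once values for all open roots are supplied."""
    missing = model.open_roots - set(inputs)
    if missing:
        raise ValueError(f"open roots without values: {sorted(missing)}")
    values = dict(inputs)
    pending = dict(model.equations)
    while pending:
        ready = [name for name, (parents, _) in pending.items()
                 if all(p in values for p in parents)]
        if not ready:
            raise ValueError("cyclic or underspecified model")
        for name in ready:
            parents, fn = pending.pop(name)
            values[name] = fn(*(values[p] for p in parents))
    return values


# Hypothetical subsystems: a network whose latency depends on an open "load",
# and a pricing rule whose "latency" input is left open until composition.
network = Model({"latency": (("load",), lambda load: 2.0 * load)}, {"load"})
pricing = Model({"price": (("latency",), lambda lat: 10.0 - lat)}, {"latency"})
composed = compose(network, pricing)

if __name__ == "__main__":
    print(sorted(composed.open_roots))
    print(evaluate(composed, {"load": 3.0}))
```

After composition only `load` remains open, because `latency` now has an equation supplied by the network model. Whether the paper's operation behaves this way in general is not established by this sketch.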
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Structural Causal Decision Models (SCDMs) as extensions of Structural Causal Influence Models (SCIMs), allowing open root variables and constraining agent decisions by causal antecedents. It asserts that SCDMs have a well-defined and computationally useful composability property. Building on SCDMs, it defines Structural Causal Decision Processes (SCDPs) as recurring SCDMs with a discount variable. The central claims are that SCDPs inherit composability, are strictly more expressive than POMDPs by not assuming rational belief formation and enabling endogenous modeling of memory-formation processes, and are suitable for modeling resource-rational agents with variable discounting in applications such as digital economy policy simulation, mechanism design, and digital twin modeling of cyberinfrastructure.
Significance. If the composability property and strict expressiveness over POMDPs are formally established with supporting derivations, this framework could offer a meaningful advance for modeling decision processes in systems with endogenous cognitive limits and non-rational belief updates. It addresses limitations of standard POMDPs for resource-rational agents in dynamic settings and adds flexibility via variable discounting, with potential utility for policy simulation and mechanism design in computing systems.
major comments (2)
- Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.
- Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.
minor comments (1)
- Abstract: The motivation section references 'the economics of computing systems' and 'endogenous limits on cognitive resources' but provides no specific examples of such systems or limits to ground the modeling needs.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on the manuscript. We address each major comment point by point below, clarifying the content of the full paper and indicating revisions that will be made to improve the abstract's precision.
Point-by-point responses
- Referee: Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.
  Authors: The full manuscript provides the formal definition of the SCDM composition operation, establishes its properties (including associativity and preservation of causal structure under composition) via theorem, and includes a concrete example of composing SCDMs to model a recurring process with computational benefits for modular analysis. To address the referee's concern that the abstract presents this at too high a level, we will revise the abstract to explicitly reference the definition, theorem, and example.
  Revision: yes
- Referee: Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.
  Authors: The full manuscript derives the strict expressiveness result by constructing a specific structural equation for endogenous memory formation that does not rely on rational Bayesian updates (which POMDPs require), provides a formal proof of the expressiveness gap, and gives a concrete example of an SCDP that represents a non-rational memory process in a dynamic setting. We will revise the abstract to briefly indicate this construction and refer readers to the relevant derivation and example in the body of the paper.
  Revision: yes
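The second response contrasts rational Bayesian updating with a memory process written as its own structural equation. The following Python sketch shows what such a contrast might look like in the simplest case. It is not the construction from the paper and does not demonstrate the expressiveness gap: the function names, the 0.8 observation accuracy, and the two-slot memory rule are all assumptions chosen for illustration.

```python
from typing import Iterable, Tuple


def bayes_update(belief: float, obs: int, accuracy: float = 0.8) -> float:
    """Posterior P(state = 1) after one noisy binary observation.

    Assumes a static hidden state and an observation that matches the state
    with probability `accuracy`.
    """
    like1 = accuracy if obs == 1 else 1.0 - accuracy
    like0 = 1.0 - accuracy if obs == 1 else accuracy
    return like1 * belief / (like1 * belief + like0 * (1.0 - belief))


def memory_update(memory: Tuple[int, ...], obs: int,
                  capacity: int = 2) -> Tuple[int, ...]:
    """Hypothetical memory equation: keep only the last `capacity` observations."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    return (memory + (obs,))[-capacity:]


def run(observations: Iterable[int]) -> Tuple[float, Tuple[int, ...]]:
    """Feed the same observation stream to both update rules."""
    belief = 0.5                   # uniform prior for the Bayesian agent
    memory: Tuple[int, ...] = ()   # empty memory for the bounded agent
    for obs in observations:
        belief = bayes_update(belief, obs)
        memory = memory_update(memory, obs)
    return belief, memory


if __name__ == "__main__":
    print(run([1, 1, 0]))
```

The Bayesian agent's belief is a sufficient statistic for the whole history, while the bounded agent retains only its last two observations. Treating `memory_update` as a modelled variable, rather than fixing it to Bayes' rule, is the kind of modelling freedom the authors' response describes.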
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Causal relationships between model variables and agent payoffs can be explicitly represented in decision models.
- domain assumption: Agent decisions can be constrained by causal antecedents.
invented entities (2)
- SCDM: no independent evidence
- SCDP: no independent evidence