arXiv: 2605.02681 · v1 · submitted 2026-05-04 · cs.CE · cs.AI · cs.GT · econ.TH


The Design and Composition of Structural Causal Decision Processes

Sebastian Benthall, Alan Lujan


Pith reviewed 2026-05-08 02:22 UTC · model grok-4.3

classification cs.CE · cs.AI · cs.GT · econ.TH

keywords causal · modeling · scdms · structural · scdps · useful · decision · models


The pith

SCDMs and SCDPs are composable causal decision models that are strictly more expressive than POMDPs by allowing endogenous memory formation and variable discounting without rational belief assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This work creates new ways to model how agents decide in systems where causes and effects are tracked explicitly. SCDMs build on existing causal models by letting decisions be limited by what came before them and by leaving some starting variables without full probability rules. These models can be put together in useful ways. SCDPs take this further by making the model repeat over time with a factor that discounts future payoffs. Unlike common AI models called POMDPs that assume agents always update beliefs correctly, these new models can include how memory itself forms as part of the causal structure. This fits situations where agents have limited thinking resources, such as in computer systems, and allows for changing how much future rewards are valued. The authors propose using this for testing economic policies in digital settings, designing rules for information systems, and building digital copies of cyber systems.

Core claim

SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings.
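To make the contrast concrete, the following minimal Python sketch places a Bayesian filter step, the belief update a POMDP agent is usually assumed to perform, next to a memory variable whose update is just another structural equation driven by a decision. It is an illustration under stated assumptions, not the paper's construction: the two-state hidden variable, the parameters `p_stay` and `p_correct`, the `effort` decision, and the linear blending rule in `memory_update` are all hypothetical.

```python
def bayes_update(belief, obs, p_stay=0.9, p_correct=0.8):
    """One Bayesian filter step for a binary hidden state.

    belief is P(state = 1) before the transition; obs is a noisy 0/1 signal.
    p_stay and p_correct are illustrative parameters, not from the paper.
    """
    # Predict: the hidden state keeps its value with probability p_stay.
    prior = belief * p_stay + (1.0 - belief) * (1.0 - p_stay)
    # Correct: weight each state by the likelihood of the observation.
    like1 = p_correct if obs == 1 else 1.0 - p_correct
    like0 = 1.0 - p_correct if obs == 1 else p_correct
    return like1 * prior / (like1 * prior + like0 * (1.0 - prior))


def memory_update(memory, obs, effort):
    """Hypothetical structural equation for memory formation.

    effort in [0, 1] is a decision variable: 0 ignores the new observation,
    1 overwrites memory with it. Nothing here is required to match Bayes' rule.
    """
    if not 0.0 <= effort <= 1.0:
        raise ValueError("effort must lie in [0, 1]")
    return (1.0 - effort) * memory + effort * obs


if __name__ == "__main__":
    belief, memory = 0.5, 0.5
    for obs in [1, 1, 0, 1]:
        belief = bayes_update(belief, obs)
        memory = memory_update(memory, obs, effort=0.25)
        print(f"obs={obs} belief={belief:.3f} memory={memory:.3f}")
```

In this toy version the agent's later decisions would condition on `memory` rather than on the Bayesian posterior, and the choice of `effort` could itself carry a cost, which is the sense in which memory formation sits inside the causal structure.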

Load-bearing premise

That SCDMs have a well-defined and computationally useful property of composability, and that SCDPs can be constructed as recurring SCDMs with a discount variable while maintaining the claimed expressiveness advantages over POMDPs.

Figures

Figures reproduced from arXiv: 2605.02681 by Alan Lujan, Sebastian Benthall.

**Figure 1.** We extend the taxonomy of causal models from Hammond et al

**Figure 2.** Influence diagram for the two-period consumption problem. Circles denote state variables, rectangles

**Figure 3.** A model $M$ is composed of $M_1 \circ M_2$. Indirect paths are represented by dotted edges. If all paths from the bridge nodes $Y$ to reward nodes in the second component $U_2$ have a member of $\mathrm{Pa}_{D_2} \cup D_2$ on it, then $Y$ is d-separated from $U_2$ given those nodes. Under that condition the composition is orthomodular or, equivalently, sequential. An indirect path from $Y$ to $U_2$ which is not interrupted by $\mathrm{Pa}_{D_2} \cup D_2$ (shown in red…

**Figure 4.** Influence diagram for the two-period consumption problem with habit formation. The dashed arrow

**Figure 5.** The composed consumption and portfolio allocation dynamic problem.

**Figure 6.** An SCDP with latent state and shocks. The agent observes

**Figure 7.** An SCDP in which the agent consumes at $d$, updates their beliefs about the world at $q$, and chooses how much to remember at $r$. The agent experiences joy with consumption $u$ and consternation at remembering carefully $v$. Their resources vary over time due to transitory shocks $\epsilon_b$ as well as a latent stochastic income process which is not directly observed.

**Figure 8.** Influence diagram for the consumption-saving problem with stochastic discount factors. The agent
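The interception condition in the Figure 3 caption can be illustrated with a small reachability check. This is a hedged sketch, not the paper's algorithm: it only tests directed paths (a full d-separation test must also account for colliders and paths that run against edge direction), and the example graph, the node names, and the function `unblocked_targets` are hypothetical.

```python
from collections import deque


def unblocked_targets(edges, sources, targets, blockers):
    """Return the targets reachable from sources along directed paths
    that never pass through a node in blockers."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    blockers = set(blockers)
    seen = {s for s in sources if s not in blockers}
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # A blocker cuts the path, so it is never expanded.
            if child not in blockers and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen & set(targets)


if __name__ == "__main__":
    # Bridge node Y feeds the second component only through X2 -> D2 -> U2.
    edges = [("Y", "X2"), ("X2", "D2"), ("D2", "U2")]
    blockers = {"X2", "D2"}  # stands in for Pa_D2 together with D2
    print(unblocked_targets(edges, ["Y"], ["U2"], blockers))

    # Adding a side channel Y -> Z -> U2 bypasses the decision and its parents.
    leaky = edges + [("Y", "Z"), ("Z", "U2")]
    print(unblocked_targets(leaky, ["Y"], ["U2"], blockers))
```

An empty result corresponds to the case the caption describes as sequential; a non-empty result flags the kind of uninterrupted indirect path the caption says is drawn in red.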


We present two new classes of causal models of decision-making agents. Our approach is motivated by the needs of modeling the economics of computing systems. These systems are composed of subsystems and can exhibit endogenous limits on cognitive resources and value discounting. Structural Causal Decision Models (SCDMs) expand on Structural Causal Influence Models. Like SCIMs, they explicitly represent the causal relationships between model variables and the payoffs of agent decisions. Additionally, agent decisions can be constrained by their causal antecedents, and SCDMs can have open root variables for which no probability distribution or structural equation is given. We show that SCDMs have a well-defined and computationally useful property of composability. Building on SCDMs, we then define a Structural Causal Decision Process (SCDP) as a recurring SCDM with a discount variable. SCDPs benefit from the useful composition properties of SCDMs. Moreover, SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource rational agents in dynamic settings. SCDPs are also capable of modeling variable discounting, a tool used widely in social scientific modeling. We pose that SCDPs are a useful framework for policy simulation for the digital economy, mechanism design for information systems, and digital twin modeling of cyberinfrastructure.
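As a rough illustration of what a per-period discount variable changes, the sketch below accumulates rewards with a weight equal to the running product of realised discount factors; a constant factor recovers ordinary geometric discounting. The timing convention (the factor realised in period t applies from period t+1 onward) and the function name `discounted_value` are assumptions made for this example, since the abstract does not specify them.

```python
def discounted_value(rewards, discounts):
    """Sum rewards, weighting each by the product of all earlier discount factors.

    rewards[t] is the payoff in period t; discounts[t] is the discount factor
    realised in period t (assumed here to apply from period t + 1 onward).
    """
    if len(rewards) != len(discounts):
        raise ValueError("rewards and discounts must have the same length")
    total, weight = 0.0, 1.0
    for reward, beta in zip(rewards, discounts):
        total += weight * reward
        weight *= beta  # carry this period's factor forward
    return total


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    # Constant factor: plain geometric discounting.
    print(discounted_value(rewards, [0.5, 0.5, 0.5]))
    # Varying factor: the second period is not discounted further.
    print(discounted_value(rewards, [0.5, 1.0, 0.5]))
```

In a state-dependent or stochastic setting the entries of `discounts` would themselves be outputs of structural equations or random draws rather than a fixed list.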

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague.

This paper defines SCDMs with composability and open roots, then SCDPs as recurring versions with endogenous memory and discounting, claiming strict expressiveness gains over POMDPs for resource-rational agents in composed systems.


Colleague, the main things to know are that SCDMs extend SCIMs by adding open root variables, causal constraints on decisions, and a claimed composability property, while SCDPs turn these into recurring processes with a discount variable and endogenous memory modeling. This setup is positioned for policy simulation in digital economies and digital twin work on cyberinfrastructure.

The definitions are new in their explicit handling of composability for subsystems and the avoidance of rational belief updates, which the paper uses to argue SCDPs can model memory formation directly and thus handle non-rational agents better than POMDPs. Variable discounting is also built in as a standard feature from social science modeling. The work connects the models to practical needs like mechanism design for information systems without obvious circularity in the constructions. It builds outward from SCIMs in a straightforward way.

The softer part is the support for the key claims. The abstract states that composability is computationally useful and that the expressiveness advantage holds, but without derivations, proofs, or worked examples visible here, those properties stay at the level of definition. The stress-test note correctly flags no internal inconsistency in the stated setup, yet the computational usefulness and strict superiority still need concrete verification in the full text to confirm they deliver in practice.

This paper is for researchers working on causal models applied to economic computing or dynamic agent systems who want alternatives to POMDPs that incorporate resource limits and subsystem composition. A reader focused on formal extensions of influence diagrams or memory in decision processes would get value from the new classes. It shows clear engagement with the relevant literature on causal decision models and POMDPs.

I would send it for peer review to get external checks on the formal properties and examples rather than desk reject, since the targeted application area and distinct model features make it worth referee time even if revisions are needed.

Referee Report

2 major / 1 minor

Summary. The paper introduces Structural Causal Decision Models (SCDMs) as extensions of Structural Causal Influence Models (SCIMs), allowing open root variables and constraining agent decisions by causal antecedents. It asserts that SCDMs have a well-defined and computationally useful composability property. Building on SCDMs, it defines Structural Causal Decision Processes (SCDPs) as recurring SCDMs with a discount variable. The central claims are that SCDPs inherit composability, are strictly more expressive than POMDPs by not assuming rational belief formation and enabling endogenous modeling of memory-formation processes, and are suitable for modeling resource-rational agents with variable discounting in applications such as digital economy policy simulation, mechanism design, and digital twin modeling of cyberinfrastructure.

Significance. If the composability property and strict expressiveness over POMDPs are formally established with supporting derivations, this framework could offer a meaningful advance for modeling decision processes in systems with endogenous cognitive limits and non-rational belief updates. It addresses limitations of standard POMDPs for resource-rational agents in dynamic settings and adds flexibility via variable discounting, with potential utility for policy simulation and mechanism design in computing systems.

major comments (2)

Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.
Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.

minor comments (1)

Abstract: The motivation section references 'the economics of computing systems' and 'endogenous limits on cognitive resources' but provides no specific examples of such systems or limits to ground the modeling needs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on the manuscript. We address each major comment point by point below, clarifying the content of the full paper and indicating revisions that will be made to improve the abstract's precision.


Referee: Abstract: The assertion that 'SCDMs have a well-defined and computationally useful property of composability' is stated at a high level without a formal definition of the composition operation, a theorem establishing its properties, or a concrete example demonstrating computational usefulness for recurring processes.

Authors: The full manuscript provides the formal definition of the SCDM composition operation, establishes its properties (including associativity and preservation of causal structure under composition) via theorem, and includes a concrete example of composing SCDMs to model a recurring process with computational benefits for modular analysis. To address the referee's concern that the abstract presents this at too high a level, we will revise the abstract to explicitly reference the definition, theorem, and example. revision: yes
Referee: Abstract: The claim that 'SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation' and that 'an SCDP can endogenously model the memory-formation process' is presented without any derivation, proof, or example showing a specific scenario or structural equation construction that a POMDP cannot represent but an SCDP can.

Authors: The full manuscript derives the strict expressiveness result by constructing a specific structural equation for endogenous memory formation that does not rely on rational Bayesian updates (which POMDPs require), provides a formal proof of the expressiveness gap, and gives a concrete example of an SCDP that represents a non-rational memory process in a dynamic setting. We will revise the abstract to briefly indicate this construction and refer readers to the relevant derivation and example in the body of the paper. revision: yes

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on standard assumptions from causal modeling literature plus the new definitions; no free parameters or invented physical entities are introduced.

axioms (2)

domain assumption: Causal relationships between model variables and agent payoffs can be explicitly represented in decision models.
Invoked in the expansion of SCIMs to SCDMs.
domain assumption: Agent decisions can be constrained by causal antecedents.
Stated as an additional feature of SCDMs.

invented entities (2)

SCDM (no independent evidence)
purpose: New class of causal decision models with composability and open roots.
Defined in the paper as an expansion of SCIMs.
SCDP (no independent evidence)
purpose: Recurring SCDM with discount variable for dynamic modeling.
Defined in the paper as building on SCDMs.



Reference graph

Works this paper leans on

57 extracted references · 9 canonical work pages

[1] Akash Agrawal, Joel Dyer, Aldo Glielmo, and Michael J Wooldridge. 2025. Robust policy design in agent-based simulators using adversarial reinforcement learning. In The First MARW: Multi-Agent AI in the Real World Workshop at AAAI 2025.

[2] John R Anderson. 1991. Is human cognition adaptive? Behavioral and brain sciences 14, 3 (1991), 471–485.

[3] David I August, Sharad Malik, Li-Shiuan Peh, Vijay Pai, Manish Vachharajani, and Paul Willmann. 2005. Achieving structural and composable modeling of complex systems. International Journal of Parallel Programming 33, 2 (2005), 81–101.

[4] Robert Axelrod. 2006. Agent-based modeling as a bridge between disciplines. Handbook of computational economics 2 (2006), 1565–1584.

[5] Robert L Axtell and J Doyne Farmer. 2025. Agent-based modeling in economics and finance: Past, present, and future. Journal of Economic Literature 63, 1 (2025), 197–287.

[6] Osman Balci, James D Arthur, and William F Ormsby. 2011. Achieving reusability and composability with a simulation conceptual model. Journal of Simulation 5, 3 (2011), 157–165.

[7] Sebastian Benthall. 2019. Situated information flow theory. In Proceedings of the 6th Annual Symposium on Hot Topics in the Science of Security. 1–10.

[8] Lawrence Blume. 2015. Agent-based models for policy analysis. In Assessing the Use of Agent-Based Models for Tobacco Regulation. National Academies Press (US).

[9] András Borsos, Adrian Carro, Aldo Glielmo, Marc Hinterschweiger, Jagoda Kaszowska-Mojsa, and Arzu Uluc. 2025. Agent-Based Modelling at Central Banks: Recent Developments and New Challenges. (2025).

[10] Craig Boutilier, Richard Dearden, and Moisés Goldszmidt. 2000. Stochastic dynamic programming with factored representations. Artificial intelligence 121, 1-2 (2000), 49–107.

[11] Dan Cao. 2020. Recursive equilibrium in Krusell and Smith (1998). Journal of Economic Theory 186 (2020), 104978.

[12] Micah Carroll, Alan Chan, Henry Ashton, and David Krueger. 2023. Characterizing manipulation from AI systems. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization. 1–13.

[13] Nick Chater and Mike Oaksford. 1999. Ten years of the rational analysis of cognition. Trends in cognitive sciences 3, 2 (1999), 57–65.

[14] Elliot Creager, David Madras, Toniann Pitassi, and Richard Zemel. 2020. Causal Modeling for Fairness in Dynamical Systems. arXiv:1909.09141 [cs.LG] https://arxiv.org/abs/1909.09141

[15] Thomas G Dietterich. 2000. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of artificial intelligence research 13 (2000), 227–303.

[16] Joshua M Epstein and Robert Axtell. 1996. Growing artificial societies: social science from the bottom up. Brookings Institution Press.

[17] Tom Everitt, Ryan Carey, Eric D Langlois, Pedro A Ortega, and Shane Legg. 2021. Agent incentives: A causal perspective. In Proceedings of the AAAI conference on artificial intelligence, Vol. 35. 11487–11495.

[18] Tom Everitt, Marcus Hutter, Ramana Kumar, and Victoria Krakovna. 2021. Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective. Synthese 198, Suppl 27 (2021), 6435–6467.

[19] Irving Fisher. 1930. The Theory of Interest. MacMillan, New York.

[20] James Fox, Tom Everitt, Ryan Carey, Eric D Langlois, Alessandro Abate, and Michael J Wooldridge. 2021. PyCID: A Python Library for Causal Influence Diagrams. In SciPy. 65–73.

[21] Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, and Michael Wooldridge. 2021. Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice. arXiv preprint arXiv:2102.05008 (2021).

[22] Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, and Michael Wooldridge. 2023. Reasoning about causality in games. Artificial Intelligence 320 (2023), 103919.

[23] Christopher Harris and David Laibson. 2001. Dynamic choices of hyperbolic consumers. Econometrica 69, 4 (2001), 935–957.

[24] Stephen Kasputis and Henry C Ng. 2000. Composable simulations. In 2000 Winter Simulation Conference Proceedings (Cat. No. 00CH37165), Vol. 2. IEEE, 1577–1584.

[25] Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, and Tom Everitt. 2023. Discovering agents. Artificial Intelligence 322 (2023), 103963.

[26] Daphne Koller and Brian Milch. 2003. Multi-agent influence diagrams for representing and solving games. Games and economic behavior 45, 1 (2003), 181–221.

[27] Vikram Krishnamurthy. 2016. Partially observed Markov decision processes. Cambridge university press.

[28] Per Krusell, Burhanettin Kuruşçu, and Anthony A Smith. 2002. Equilibrium welfare and government policy with quasi-geometric discounting. Journal of Economic Theory 105, 1 (2002), 42–72.

[29] Per Krusell and Anthony A Smith. 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106, 5 (1998), 867–896.

[30] Per Krusell and Anthony A Smith. 2003. Consumption–savings decisions with quasi–geometric discounting. Econometrica 71, 1 (2003), 365–375.

[31] David Laibson. 1997. Golden eggs and hyperbolic discounting. The Quarterly Journal of Economics 112, 2 (1997), 443–478.

[32] Sydney Levine, Matija Franklin, Tan Zhi-Xuan, Secil Yanik Guyot, Lionel Wong, Daniel Kilov, Yejin Choi, Joshua B Tenenbaum, Noah Goodman, Seth Lazar, et al. 2025. Resource Rational Contractualism Should Guide AI Alignment. arXiv preprint arXiv:2506.17434 (2025).

[33] Falk Lieder and Thomas L Griffiths. 2020. Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources. Behavioral and brain sciences 43 (2020), e1.

[34] Alan Lujan. 2026. EGM$^n$: The Sequential Endogenous Grid Method. (2026). Working Paper.

[35] Lilia Maliar, Serguei Maliar, and Pablo Winant. 2021. Deep learning for solving dynamic economic models. Journal of Monetary Economics 122 (2021), 76–101.

[36] Shie Mannor, Ishai Menache, Amit Hoze, and Uri Klein. 2004. Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the twenty-first international conference on Machine learning. 71.

[37] Vishwali Mhasawade and Rumi Chunara. 2021. Causal Multi-Level Fairness. arXiv:2010.07343 [cs.LG] https://arxiv.org/abs/2010.07343

[38] Richard E Neapolitan et al. 2004. Learning bayesian networks. Vol. 38. Pearson Prentice Hall Upper Saddle River, NJ.

[39] Cyrus Neary and Ufuk Topcu. 2023. Compositional learning of dynamical system models using port-Hamiltonian neural networks. In Learning for Dynamics and Control Conference. PMLR, 679–691.

[40] Argentina Ortega, Samuel Parra, Sven Schneider, and Nico Hochgeschwender. 2024. Composable and executable scenarios for simulation-based testing of mobile robots. Frontiers in Robotics and AI 11 (2024), 1363281.

[41] Christiaan JJ Paredis, Antonio Diaz-Calderon, Rajarishi Sinha, and Pradeep K Khosla. 2001. Composable models for simulation-based design. Engineering with Computers 17, 2 (2001), 112–128.

[42] Judea Pearl. 1994. A probabilistic calculus of actions. In Uncertainty in artificial intelligence. Elsevier, 454–462.

[43] Judea Pearl. 2009. Causality. Cambridge university press.

[44] Jonathan Richens and Tom Everitt. 2024. Robust agents learn causal world models. arXiv preprint arXiv:2402.10877 (2024).

[45] Atharva Sehgal, Arya Grayeli, Jennifer J Sun, and Swarat Chaudhuri. 2023. Neurosymbolic grounding for compositional world models. arXiv preprint arXiv:2310.12690 (2023).

[46] Ross D Shachter. 1986. Evaluating influence diagrams. Operations research 34, 6 (1986), 871–882.

[47] John Stachurski and Junnan Zhang. 2021. Dynamic programming with state-dependent discounting. Journal of Economic Theory 192 (2021), 105190.

[48] Robert H Strotz. 1956. Myopia and inconsistency in dynamic utility maximization. The Review of Economic Studies 23, 3 (1956), 165–180.

[49] Chris Van Merwijk, Ryan Carey, and Tom Everitt. 2022. A complete criterion for value of information in soluble influence diagrams. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 10034–10041.

[50] Pieter van Schalkwyk and Dan Isaacs. 2023. Achieving scale through composable and lean digital twins. In The Digital Twin. Springer, 153–180.

[51] Thomas Verma and Judea Pearl. 1988. Influence Diagrams and D-separation. University of California (Los Angeles). Computer Science Department.

[52] Neal Wagner. 2024. Comparing the complexity and efficiency of composable modeling techniques for multi-scale and multi-domain complex system modeling and simulation applications: a probabilistic analysis. Systems 12, 3 (2024), 96.

[53] Sifan Wang, Shyam Sankaran, and Paris Perdikaris. 2022. Respecting causality is all you need for training physics-informed neural networks. arXiv preprint arXiv:2203.07404 (2022).

[54] Xinyue Wang and Biwei Huang. 2025. Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning. arXiv preprint arXiv:2505.08361 (2025).

[55] Hongxin Zhang, Zeyuan Wang, Qiushi Lyu, Zheyuan Zhang, Sunli Chen, Tianmin Shu, Behzad Dariush, Kwonjoon Lee, Yilun Du, and Chuang Gan. 2024. COMBO: compositional world models for embodied multi-agent cooperation. arXiv preprint arXiv:2404.10775 (2024).

[56] Tan Zhi-Xuan, Micah Carroll, Matija Franklin, and Hal Ashton. 2025. Beyond Preferences in AI Alignment. Philosophical Studies 182, 7 (2025), 1813–1863.

[57] Feng Zhu, Yiping Yao, Jin Li, and Wenjie Tang. 2019. Reusability and composability analysis for an agent-based hierarchical modelling and simulation framework. Simulation Modelling Practice and Theory 90 (2019), 81–97.

A Reference

A.1 Structural Causal Games

A Bayesian network is a graphical model of a joint distribution over random variables. Definition 15...