Choosing with unknown causal information: Action-outcome probabilities for decision making can be grounded in causal models
Pith reviewed 2026-05-24 15:31 UTC · model grok-4.3
The pith
Decision-making probabilities can be grounded in causal models whether the mechanisms are known or unknown.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In decision problems where actions and consequences are causally connected, actions are regarded as interventions over a causal model. The previous result for known causal models is extended to unknown causal mechanisms. This grounds action-outcome probabilities in causal models for both cases. As an application, Nash Equilibrium is extended to strategic games where players consider causal information.
What carries the argument
Treating actions as interventions over a causal model to derive action-outcome probabilities.
If this is right
- When the causal model is known, the action-outcome probabilities used in expected utility calculations are obtained from interventions on that model.
- The derivation extends directly to cases where the causal mechanism is unknown to the decision maker.
- Strategic games have equilibria defined using causal information from player interventions.
- Rational decision making remains possible under uncertainty with causally grounded probabilities in both settings.
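As a rough illustration of the first bullet, the sketch below computes action-outcome probabilities by intervening on a small, known causal model and then maximizes expected utility. The graph (Z -> A, Z -> C, A -> C), all probability tables, the utilities, and the action costs are hypothetical values chosen for this example; none of them come from the paper.

```python
# Hypothetical known causal model: Z -> A, Z -> C, A -> C.
# Z is a background variable that influences both the action and the outcome.
P_Z = {0: 0.7, 1: 0.3}                      # assumed prior P(Z=z)
P_C1_GIVEN_AZ = {                           # assumed P(C=1 | A=a, Z=z)
    (0, 0): 0.2, (0, 1): 0.6,
    (1, 0): 0.5, (1, 1): 0.9,
}
UTILITY = {0: 0.0, 1: 10.0}                 # assumed utility of each outcome c
ACTION_COST = {0: 0.0, 1: 2.0}              # assumed cost of each action a


def p_outcome_do(c, a):
    """P(C=c | do(A=a)): cut the Z -> A edge and average over P(Z)."""
    p1 = sum(P_Z[z] * P_C1_GIVEN_AZ[(a, z)] for z in P_Z)
    return p1 if c == 1 else 1.0 - p1


def expected_utility(a):
    """Expected utility of action a under the interventional distribution."""
    return sum(p_outcome_do(c, a) * UTILITY[c] for c in UTILITY) - ACTION_COST[a]


def best_action():
    """Action with the highest interventional expected utility."""
    return max(ACTION_COST, key=expected_utility)


if __name__ == "__main__":
    for a in ACTION_COST:
        print(a, p_outcome_do(1, a), expected_utility(a))
    print("best action:", best_action())
```

The only causal step is `p_outcome_do`, which averages over the marginal of Z rather than over Z given the observed action; everything after that is ordinary expected-utility maximization.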
Where Pith is reading between the lines
- AI decision systems might benefit from maintaining possible causal models to compute intervention effects rather than relying solely on observed correlations.
- Human experiments could test whether people naturally use causal intervention reasoning in unknown-mechanism choice tasks.
- This approach could generalize to multi-agent settings beyond games, such as cooperative planning with uncertain causal links.
- Further work might derive bounds on decision quality when only approximate causal models are available.
Load-bearing premise
Actions can be treated as interventions on an underlying causal model even when the decision maker does not know the precise causal mechanism that controls the environment.
What would settle it
Finding a rational decision problem with unknown causality where no set of action-outcome probabilities can be consistently derived from any causal intervention model would disprove the extension.
Original abstract
Decision-making under uncertainty and causal thinking are fundamental aspects of intelligent reasoning. Decision-making has been well studied when the available information is considered at the associative (probabilistic) level. The classical Theorems of von Neumann-Morgenstern and Savage provide a formal criterion for rational choice using associative information: maximize expected utility. There is an ongoing debate around the origin of probabilities involved in such calculation. In this work, we will show how the probabilities for decision-making can be grounded in causal models by considering decision problems in which the available actions and consequences are causally connected. In this setting, actions are regarded as an intervention over a causal model. Then, we extend a previous causal decision-making result, which relies on a known causal model, to the case in which the causal mechanism that controls some environment is unknown to a rational decision-maker. In this way, action-outcome probabilities can be grounded in causal models in known and unknown cases. Finally, as an application, we extend the well-known concept of Nash Equilibrium to the case in which the players of a strategic game consider causal information.
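The abstract does not spell out how Nash Equilibrium is extended, so the following is only one possible reading, not the paper's definition: each player's payoff for a joint action is taken to be an expected utility under the outcome distribution produced by intervening with that joint action, and pure-strategy equilibria are found by checking unilateral deviations. The interventional table, outcome values, and action costs are hypothetical.

```python
from itertools import product

ACTIONS = (0, 1)

# Hypothetical P(C=1 | do(A1=a1, A2=a2)) for a shared binary outcome C.
P_C1_DO = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.9}
OUTCOME_VALUE = (10.0, 10.0)   # assumed value of C=1 to player 0 and player 1
ACTION_COST = (3.5, 3.5)       # assumed private cost of choosing action 1


def expected_utility(player, profile):
    """Expected utility of `player` when the joint intervention is `profile`."""
    p1 = P_C1_DO[profile]
    return p1 * OUTCOME_VALUE[player] - ACTION_COST[player] * profile[player]


def is_nash(profile):
    """True if no player gains by unilaterally changing their own action."""
    for player in range(len(profile)):
        current = expected_utility(player, profile)
        for alt in ACTIONS:
            deviated = list(profile)
            deviated[player] = alt
            if expected_utility(player, tuple(deviated)) > current + 1e-12:
                return False
    return True


def pure_nash_equilibria():
    """All pure-strategy profiles that survive the deviation check."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these assumed numbers the game is a coordination game: both players abstaining and both players acting are each stable, because acting alone does not raise the outcome probability enough to cover its cost.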
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that action-outcome probabilities in decision problems can be grounded in causal models (via the do-operator on an underlying graph) both when the causal mechanism is known and when it is unknown to the decision maker; it extends a prior known-model result to the unknown case and applies the framework to extend Nash equilibrium to strategic games that incorporate causal information.
Significance. If the central extension is rigorously derived, the work would strengthen the link between causal semantics and expected-utility calculations in settings where the decision maker lacks an explicit mechanism, with potential implications for causal decision theory and game-theoretic solution concepts.
major comments (2)
- [Section 4] Section 4 (the extension to unknown mechanisms): the manuscript asserts that interventional probabilities remain available without an explicit graph, yet supplies neither a formal statement of the unknown-mechanism setting nor the derivation steps that obtain the same interventional probabilities from a distribution over mechanisms; without these steps it is impossible to verify that the grounding claim survives the extension.
- [Section 4] Section 4, paragraph on the unknown-case probabilities: the argument appears to rely on an implicit mixture over possible causal graphs, but no independent justification is given that this mixture preserves the intervention semantics rather than reducing to observational conditioning; this step is load-bearing for the claim that the probabilities are still 'grounded in causal models' when the mechanism is unknown.
minor comments (1)
- [Abstract] The abstract states the extension is possible but contains no equations or formal definitions; a one-sentence formal statement of the unknown-mechanism case would improve readability.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments on Section 4. We agree that the presentation of the unknown-mechanism case requires additional formalization to make the extension fully rigorous and verifiable.
Point-by-point responses
- Referee: [Section 4] Section 4 (the extension to unknown mechanisms): the manuscript asserts that interventional probabilities remain available without an explicit graph, yet supplies neither a formal statement of the unknown-mechanism setting nor the derivation steps that obtain the same interventional probabilities from a distribution over mechanisms; without these steps it is impossible to verify that the grounding claim survives the extension.
  Authors: We acknowledge that the current text does not contain an explicit formal definition of the unknown-mechanism setting or the full derivation. In the revised manuscript we will add a subsection that (i) defines the decision maker's information as a probability distribution over a class of causal mechanisms consistent with the observed data, and (ii) derives the interventional probability by first applying the do-operator inside each mechanism and then integrating with respect to the distribution over mechanisms. This will make the claim that the same interventional probabilities are recovered fully explicit.
  Revision: yes
- Referee: [Section 4] Section 4, paragraph on the unknown-case probabilities: the argument appears to rely on an implicit mixture over possible causal graphs, but no independent justification is given that this mixture preserves the intervention semantics rather than reducing to observational conditioning; this step is load-bearing for the claim that the probabilities are still 'grounded in causal models' when the mechanism is unknown.
  Authors: We agree that an explicit justification is required. In the revision we will insert a paragraph showing that the mixture is taken after the intervention: the quantity is defined as the expectation, over the posterior on mechanisms, of P(Y | do(X), M), where each term uses the causal semantics of its own mechanism M. Because the do-operator is applied inside the integral rather than outside, the resulting probability does not coincide with ordinary conditioning on the observational distribution; we will include a short proof that the two expressions differ in general.
  Revision: yes
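The distinction in the second response can be checked numerically on a toy case. The sketch below is not the authors' promised proof; it assumes two hypothetical candidate mechanisms over binary variables (a hidden Z, an action A, an outcome C) with made-up posterior weights, and compares the mixture of per-mechanism interventional probabilities with ordinary conditioning on the mixed observational distribution.

```python
# Each hypothetical mechanism specifies P(Z), P(A=1 | Z) and P(C=1 | A, Z).
M1 = {  # A genuinely affects C; Z is irrelevant
    "p_z": {0: 0.5, 1: 0.5},
    "p_a1_given_z": {0: 0.5, 1: 0.5},
    "p_c1_given_az": {(0, 0): 0.2, (0, 1): 0.2, (1, 0): 0.8, (1, 1): 0.8},
}
M2 = {  # Z drives both A and C; A has no effect on C
    "p_z": {0: 0.5, 1: 0.5},
    "p_a1_given_z": {0: 0.1, 1: 0.9},
    "p_c1_given_az": {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.1, (1, 1): 0.9},
}
MECHANISMS = [(0.6, M1), (0.4, M2)]  # assumed posterior weights over mechanisms


def p_a_given_z(m, a, z):
    p1 = m["p_a1_given_z"][z]
    return p1 if a == 1 else 1.0 - p1


def p_a(m, a):
    """Observational P(A=a) inside mechanism m."""
    return sum(m["p_z"][z] * p_a_given_z(m, a, z) for z in m["p_z"])


def p_c1_do(m, a):
    """P(C=1 | do(A=a), m): Z keeps its marginal distribution."""
    return sum(m["p_z"][z] * m["p_c1_given_az"][(a, z)] for z in m["p_z"])


def p_c1_obs(m, a):
    """P(C=1 | A=a, m): Z is reweighted by how likely it makes A=a."""
    joint = sum(m["p_z"][z] * p_a_given_z(m, a, z) * m["p_c1_given_az"][(a, z)]
                for z in m["p_z"])
    return joint / p_a(m, a)


def mixture_do(a):
    """Expectation over mechanisms of the per-mechanism interventional probability."""
    return sum(w * p_c1_do(m, a) for w, m in MECHANISMS)


def mixture_obs(a):
    """P(C=1 | A=a) under the mixed observational distribution."""
    num = sum(w * p_a(m, a) * p_c1_obs(m, a) for w, m in MECHANISMS)
    den = sum(w * p_a(m, a) for w, m in MECHANISMS)
    return num / den


if __name__ == "__main__":
    print("mixture of interventions:", mixture_do(1))
    print("observational conditioning:", mixture_obs(1))
```

With these assumed tables the two quantities for A=1 come out at roughly 0.68 and 0.808. The gap is contributed entirely by the confounded mechanism M2, where observing A=1 is evidence about Z but setting A=1 is not.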
Circularity Check
No significant circularity; derivation self-contained
Full rationale
The provided abstract and context describe an extension of a prior causal decision-making result to unknown mechanisms, grounding action-outcome probabilities in causal models via interventions. No equations, fitted parameters, or self-citations are quoted that reduce any claimed probability or equilibrium to the input by construction. The central claim relies on treating actions as interventions, which is an independent modeling choice rather than a definitional loop or renamed empirical pattern. Without load-bearing self-citation chains or ansatzes smuggled via prior work by the same authors, the derivation does not reduce to its inputs. This is the expected honest non-finding for a high-level conceptual extension paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Actions are regarded as interventions over a causal model
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Theorem 12 (Causal Savage): ... probability distribution P_C over family F of causal graphical models ... P_g(c | do(a)) P_C(g)
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  actions regarded as intervention over causal model ... do-operator
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.