arxiv: 1907.11752 · v6 · submitted 2019-07-26 · 💻 cs.AI · stat.ME

Choosing with unknown causal information: Action-outcome probabilities for decision making can be grounded in causal models

Pith reviewed 2026-05-24 15:31 UTC · model grok-4.3

classification 💻 cs.AI stat.ME
keywords causal decision making · interventions · unknown causal models · Nash equilibrium · expected utility · action-outcome probabilities · rational choice

The pith

Decision-making probabilities can be grounded in causal models whether the mechanisms are known or unknown.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the action-outcome probabilities used in rational decision making can be derived from causal models by treating actions as interventions. This grounding holds both when the causal model is known and when the causal mechanism that controls the environment is unknown to the decision maker. A sympathetic reader would care because it offers a way to use causal information in decisions even with incomplete knowledge, going beyond purely associative (probabilistic) approaches. The paper also applies the framework to game theory by extending Nash equilibrium to players who consider causal information.

Core claim

In decision problems where actions and consequences are causally connected, actions are regarded as interventions over a causal model. The previous result for known causal models is extended to unknown causal mechanisms. This grounds action-outcome probabilities in causal models for both cases. As an application, Nash Equilibrium is extended to strategic games where players consider causal information.
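As a rough illustration of this claim, the sketch below computes expected utilities from interventional probabilities and handles the unknown case by averaging over candidate mechanisms. It is not the paper's construction: the names `Mechanism`, `interventional_mixture`, `expected_utility`, and `best_action`, the belief weights, and all numbers are assumptions made up for this example.

```python
from typing import Callable, Dict, List

# All names here are hypothetical. A "mechanism" is any callable returning
# P(outcome | do(action)) for one candidate causal model.
Mechanism = Callable[[str], Dict[str, float]]


def interventional_mixture(action: str, mechanisms: List[Mechanism],
                           weights: List[float]) -> Dict[str, float]:
    """Belief-weighted average of P(outcome | do(action), M) over candidates M."""
    mixed: Dict[str, float] = {}
    for mechanism, weight in zip(mechanisms, weights):
        for outcome, prob in mechanism(action).items():
            mixed[outcome] = mixed.get(outcome, 0.0) + weight * prob
    return mixed


def expected_utility(action: str, mechanisms: List[Mechanism],
                     weights: List[float], utility: Dict[str, float]) -> float:
    """Expected utility of an action under the mixed interventional distribution."""
    probs = interventional_mixture(action, mechanisms, weights)
    return sum(prob * utility[outcome] for outcome, prob in probs.items())


def best_action(actions: List[str], mechanisms: List[Mechanism],
                weights: List[float], utility: Dict[str, float]) -> str:
    """Action that maximizes expected utility."""
    return max(actions,
               key=lambda a: expected_utility(a, mechanisms, weights, utility))


# Toy data, invented for illustration only.
def strong_effect(action: str) -> Dict[str, float]:
    p = 0.9 if action == "treat" else 0.5
    return {"recover": p, "sick": 1.0 - p}


def weak_effect(action: str) -> Dict[str, float]:
    p = 0.4 if action == "treat" else 0.5
    return {"recover": p, "sick": 1.0 - p}


ACTIONS = ["treat", "wait"]
MECHANISMS = [strong_effect, weak_effect]
WEIGHTS = [0.5, 0.5]  # assumed belief over the two candidate mechanisms
UTILITY = {"recover": 1.0, "sick": 0.0}

if __name__ == "__main__":
    for action in ACTIONS:
        print(action, expected_utility(action, MECHANISMS, WEIGHTS, UTILITY))
    print("choice:", best_action(ACTIONS, MECHANISMS, WEIGHTS, UTILITY))
```

In this sketch the known-model case is the special case of a single mechanism with weight 1.0, for example `best_action(ACTIONS, [weak_effect], [1.0], UTILITY)`.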

What carries the argument

Treating actions as interventions over a causal model to derive action-outcome probabilities.

If this is right

  • When the causal model is known, action-outcome probabilities for expected utility calculations are obtained from interventions on that model.
  • The derivation extends directly to cases where the causal mechanism is unknown to the decision maker.
  • Strategic games have equilibria defined using causal information from player interventions.
  • Rational decision making remains possible under uncertainty with causally grounded probabilities in both settings.
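The equilibrium bullet above can be made concrete with a small sketch. The paper's own definition is not reproduced on this page, so the code only assumes one plausible reading: each player's payoff is an expected utility under P(outcome | do(joint action)), and a pure-strategy profile is an equilibrium when no player gains by deviating alone. The function names, the two-player game, and every number are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Profile = Tuple[str, ...]
OutcomeDist = Callable[[Profile], Dict[str, float]]   # P(outcome | do(profile))
UtilityFn = Callable[[str, Profile], float]            # u_i(outcome, profile)


def expected_payoffs(profile: Profile, outcome_dist: OutcomeDist,
                     utilities: Sequence[UtilityFn]) -> Tuple[float, ...]:
    """Expected utility of every player under the interventional distribution."""
    dist = outcome_dist(profile)
    return tuple(sum(p * u(o, profile) for o, p in dist.items())
                 for u in utilities)


def pure_nash_equilibria(actions: Sequence[Sequence[str]],
                         outcome_dist: OutcomeDist,
                         utilities: Sequence[UtilityFn]) -> List[Profile]:
    """Profiles from which no single player can profitably deviate."""
    equilibria: List[Profile] = []
    for profile in product(*actions):
        payoffs = expected_payoffs(profile, outcome_dist, utilities)
        stable = True
        for i, own_actions in enumerate(actions):
            for alternative in own_actions:
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                gain = expected_payoffs(deviation, outcome_dist, utilities)[i]
                if gain > payoffs[i] + 1e-12:  # small tolerance for float noise
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Toy game, invented for illustration: success is more likely the more
# players invest; investing costs 3 and success is worth 10 to each player.
def toy_outcome_dist(profile: Profile) -> Dict[str, float]:
    p = {0: 0.1, 1: 0.5, 2: 0.9}[profile.count("invest")]
    return {"success": p, "failure": 1.0 - p}


def make_utility(player: int) -> UtilityFn:
    def utility(outcome: str, profile: Profile) -> float:
        value = 10.0 if outcome == "success" else 0.0
        cost = 3.0 if profile[player] == "invest" else 0.0
        return value - cost
    return utility


ACTION_SETS = [["invest", "hold"], ["invest", "hold"]]
UTILITIES = [make_utility(0), make_utility(1)]

if __name__ == "__main__":
    print(pure_nash_equilibria(ACTION_SETS, toy_outcome_dist, UTILITIES))
```

With these made-up numbers the only stable profile is the one where both players invest; a decision maker unsure of the mechanism could swap `toy_outcome_dist` for a belief-weighted average over candidate mechanisms.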

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • AI decision systems might benefit from maintaining possible causal models to compute intervention effects rather than relying solely on observed correlations.
  • Human experiments could test whether people naturally use causal intervention reasoning in unknown-mechanism choice tasks.
  • This approach could generalize to multi-agent settings beyond games, such as cooperative planning with uncertain causal links.
  • Further work might derive bounds on decision quality when only approximate causal models are available.

Load-bearing premise

Actions can be treated as interventions on an underlying causal model even when the decision maker does not know the precise causal mechanism that controls the environment.

What would settle it

Finding a rational decision problem with unknown causality where no set of action-outcome probabilities can be consistently derived from any causal intervention model would disprove the extension.

Original abstract

Decision-making under uncertainty and causal thinking are fundamental aspects of intelligent reasoning. Decision-making has been well studied when the available information is considered at the associative (probabilistic) level. The classical Theorems of von Neumann-Morgenstern and Savage provide a formal criterion for rational choice using associative information: maximize expected utility. There is an ongoing debate around the origin of probabilities involved in such calculation. In this work, we will show how the probabilities for decision-making can be grounded in causal models by considering decision problems in which the available actions and consequences are causally connected. In this setting, actions are regarded as an intervention over a causal model. Then, we extend a previous causal decision-making result, which relies on a known causal model, to the case in which the causal mechanism that controls some environment is unknown to a rational decision-maker. In this way, action-outcome probabilities can be grounded in causal models in known and unknown cases. Finally, as an application, we extend the well-known concept of Nash Equilibrium to the case in which the players of a strategic game consider causal information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that action-outcome probabilities in decision problems can be grounded in causal models (via the do-operator on an underlying graph) both when the causal mechanism is known and when it is unknown to the decision maker; it extends a prior known-model result to the unknown case and applies the framework to extend Nash equilibrium to strategic games that incorporate causal information.

Significance. If the central extension is rigorously derived, the work would strengthen the link between causal semantics and expected-utility calculations in settings where the decision maker lacks an explicit mechanism, with potential implications for causal decision theory and game-theoretic solution concepts.

major comments (2)
  1. [Section 4] Section 4 (the extension to unknown mechanisms): the manuscript asserts that interventional probabilities remain available without an explicit graph, yet supplies neither a formal statement of the unknown-mechanism setting nor the derivation steps that obtain the same interventional probabilities from a distribution over mechanisms; without these steps it is impossible to verify that the grounding claim survives the extension.
  2. [Section 4] Section 4, paragraph on the unknown-case probabilities: the argument appears to rely on an implicit mixture over possible causal graphs, but no independent justification is given that this mixture preserves the intervention semantics rather than reducing to observational conditioning; this step is load-bearing for the claim that the probabilities are still 'grounded in causal models' when the mechanism is unknown.
minor comments (1)
  1. [Abstract] The abstract states the extension is possible but contains no equations or formal definitions; a one-sentence formal statement of the unknown-mechanism case would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on Section 4. We agree that the presentation of the unknown-mechanism case requires additional formalization to make the extension fully rigorous and verifiable.

Point-by-point responses
  1. Referee: [Section 4] Section 4 (the extension to unknown mechanisms): the manuscript asserts that interventional probabilities remain available without an explicit graph, yet supplies neither a formal statement of the unknown-mechanism setting nor the derivation steps that obtain the same interventional probabilities from a distribution over mechanisms; without these steps it is impossible to verify that the grounding claim survives the extension.

    Authors: We acknowledge that the current text does not contain an explicit formal definition of the unknown-mechanism setting or the full derivation. In the revised manuscript we will add a subsection that (i) defines the decision maker's information as a probability distribution over a class of causal mechanisms consistent with the observed data, and (ii) derives the interventional probability by first applying the do-operator inside each mechanism and then integrating with respect to the distribution over mechanisms. This will make the claim that the same interventional probabilities are recovered fully explicit. revision: yes

  2. Referee: [Section 4] Section 4, paragraph on the unknown-case probabilities: the argument appears to rely on an implicit mixture over possible causal graphs, but no independent justification is given that this mixture preserves the intervention semantics rather than reducing to observational conditioning; this step is load-bearing for the claim that the probabilities are still 'grounded in causal models' when the mechanism is unknown.

    Authors: We agree that an explicit justification is required. In the revision we will insert a paragraph showing that the mixture is taken after the intervention: the quantity is defined as the expectation, over the posterior on mechanisms, of P(Y | do(X), M), where each term uses the causal semantics of its own mechanism M. Because the do-operator is applied inside the integral rather than outside, the resulting probability does not coincide with ordinary conditioning on the observational distribution; we will include a short proof that the two expressions differ in general. revision: yes
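The distinction drawn in the second response can be checked numerically on a toy case. The sketch below is not the authors' promised proof. It assumes a hypothetical binary model with a confounder U (U -> X, U -> Y, X -> Y), two candidate mechanisms, and made-up belief weights, and it compares the expectation over mechanisms of P(Y = 1 | do(X = x), M) with ordinary conditioning on the mixed observational distribution.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Mechanism:
    """Toy binary model U -> X, U -> Y, X -> Y (illustrative assumption)."""
    p_u1: float                                   # P(U = 1)
    p_x1_given_u: Tuple[float, float]             # P(X = 1 | U = u), indexed by u
    p_y1_given_xu: Dict[Tuple[int, int], float]   # P(Y = 1 | X = x, U = u)

    def p_u(self, u: int) -> float:
        return self.p_u1 if u == 1 else 1.0 - self.p_u1

    def p_x(self, x: int, u: int) -> float:
        p = self.p_x1_given_u[u]
        return p if x == 1 else 1.0 - p

    def p_y1_do(self, x: int) -> float:
        # Intervention: X is set by fiat, so P(x | u) is dropped from the sum.
        return sum(self.p_u(u) * self.p_y1_given_xu[(x, u)] for u in (0, 1))

    def joint_x_y1(self, x: int) -> float:
        # Observational P(X = x, Y = 1).
        return sum(self.p_u(u) * self.p_x(x, u) * self.p_y1_given_xu[(x, u)]
                   for u in (0, 1))

    def marginal_x(self, x: int) -> float:
        # Observational P(X = x).
        return sum(self.p_u(u) * self.p_x(x, u) for u in (0, 1))


def mixture_do(mechs: List[Mechanism], weights: List[float], x: int) -> float:
    """E_M[ P(Y = 1 | do(X = x), M) ]: do-operator applied inside the average."""
    return sum(w * m.p_y1_do(x) for m, w in zip(mechs, weights))


def mixture_conditional(mechs: List[Mechanism], weights: List[float],
                        x: int) -> float:
    """P(Y = 1 | X = x) computed from the mixed observational distribution."""
    numerator = sum(w * m.joint_x_y1(x) for m, w in zip(mechs, weights))
    denominator = sum(w * m.marginal_x(x) for m, w in zip(mechs, weights))
    return numerator / denominator


# Shared outcome table; the two candidates differ only in how U drives X.
Y_TABLE = {(1, 0): 0.3, (1, 1): 0.9, (0, 0): 0.1, (0, 1): 0.7}
CONFOUNDED = Mechanism(p_u1=0.5, p_x1_given_u=(0.2, 0.8), p_y1_given_xu=Y_TABLE)
UNCONFOUNDED = Mechanism(p_u1=0.5, p_x1_given_u=(0.5, 0.5), p_y1_given_xu=Y_TABLE)
MECHS = [CONFOUNDED, UNCONFOUNDED]
BELIEF = [0.5, 0.5]  # assumed posterior weights over the two mechanisms

if __name__ == "__main__":
    print("mixture of do-probabilities:", mixture_do(MECHS, BELIEF, 1))
    print("observational conditioning: ", mixture_conditional(MECHS, BELIEF, 1))
```

With these numbers the two quantities come out different, because conditioning on X = 1 shifts the weight on U in the confounded candidate while the intervention does not. A single example like this only shows the expressions can differ; it says nothing about when they coincide.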

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

Full rationale

The provided abstract and context describe an extension of a prior causal decision-making result to unknown mechanisms, grounding action-outcome probabilities in causal models via interventions. No equations, fitted parameters, or self-citations are quoted that reduce any claimed probability or equilibrium to the input by construction. The central claim relies on treating actions as interventions, which is an independent modeling choice rather than a definitional loop or renamed empirical pattern. Without load-bearing self-citation chains or ansatzes smuggled via prior work by the same authors, the derivation does not reduce to its inputs. This is the expected honest non-finding for a high-level conceptual extension paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract only; the paper relies on the standard assumption that actions are interventions in a causal model, with no free parameters, new entities, or additional axioms stated.

axioms (1)
  • domain assumption: Actions are regarded as an intervention over a causal model
    Explicitly stated in the abstract as the modeling choice that allows grounding of probabilities.

pith-pipeline@v0.9.0 · 5736 in / 1199 out tokens · 47422 ms · 2026-05-24T15:31:54.761776+00:00 · methodology

