pith. sign in

arxiv: 2605.16532 · v1 · pith:6EZHKHKEnew · submitted 2026-05-15 · 💻 cs.LG · econ.GN· q-fin.EC

Boundedly Rational Meta-Learning in Sequential Consumer Choice

Pith reviewed 2026-05-20 19:48 UTC · model grok-4.3

classification 💻 cs.LG econ.GNq-fin.EC
keywords meta-learningbounded rationalityconsumer choicesequential decisionsknowledge transferBayesian learningdynamic programminglaboratory experiment
0
0 comments X

The pith

Consumers transfer experience across related choices using coarse low-dimensional approximations of uncertainty rather than full Bayesian integration or starting over.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies how people carry prior learning into new but similar decision contexts, such as choosing providers across different situations. In a lab task with repeated airline selections on varying routes and noisy feedback, participants show faster improvement in later routes, indicating cross-context transfer. Model comparisons reveal that a boundedly rational policy using only one approximate draw from past uncertainty matches the sequence of human choices more closely than either a no-transfer model or a model that fully integrates all prior information. This points to consumers maintaining simple representations of brand-level patterns when moving between contexts.

Core claim

Trial-by-trial likelihood comparisons in the hierarchical airline choice task show that low-D boundedly rational meta-learning policies, especially BRMDP(1), fit participant behavior better than both a no-transfer benchmark and a fully integrated Bayesian meta-learning benchmark, indicating that consumers transfer regularities across contexts through coarse representations of prior uncertainty.

What carries the argument

BRMDP(D), a boundedly rational meta dynamic programming policy that approximates full Bayesian integration by drawing a limited number D of samples from the hyper-posterior over context parameters.

If this is right

  • Participants choose better options earlier in later routes and reduce pseudo-regret across contexts.
  • Consumer learning models must incorporate approximate rather than complete cross-context transfer.
  • Managerial counterfactuals that assume either no transfer or full integration will produce misleading predictions.
  • Low-dimensional approximations of prior uncertainty provide a better account of observed choice sequences than the two extreme benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • In markets with many overlapping contexts, firms may benefit from designing recommendations that assume consumers use only a few representative prior samples rather than exhaustive updating.
  • The same coarse-transfer pattern could appear in other sequential domains such as repeated product trials or service selections where underlying regularities exist across categories.
  • Testing whether the fit of low-D policies improves or worsens as context similarity increases would clarify the boundary conditions of the mechanism.

Load-bearing premise

The laboratory task of repeated airline choices across routes with noisy binary outcomes adequately represents real-world cross-context knowledge transfer in sequential consumer decisions.

What would settle it

A new experiment that varies the relatedness between contexts and tests whether the likelihood advantage of BRMDP(1) over full integration disappears when contexts share no underlying structure.

Figures

Figures reproduced from arXiv: 2605.16532 by Hema Yoganarasimhan, Max Kleiman-Weiner, Mehrzad Khosravi.

Figure 1
Figure 1. Figure 1: An example of a consultant in a single-route environment. She has some experience with Ascend and Summit Airways [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: An example of a multi-route environment. The consultant learns about airline performance on the Seattle–Dallas route and [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Graphical representation of the hierarchical Bayesian environment. Each airline [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Airline choice environment used in the experiments. [PITH_FULL_IMAGE:figures/full_fig_p012_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Mean participants’ best-airline selection over flights for different routes. Each line represents a distinct route, and earlier [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Mean participants’ pseudo-regret over flights for different routes. Each line represents a distinct route, and earlier routes are [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Mean best-airline selection over flights for different routes in the Far Means–Low Var. condition. Each line represents a [PITH_FULL_IMAGE:figures/full_fig_p029_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Mean pseudo-regret over flights for different routes in the Far Means–Low Var. condition. Each line represents a distinct [PITH_FULL_IMAGE:figures/full_fig_p029_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Log-likelihood comparison across policies for [PITH_FULL_IMAGE:figures/full_fig_p033_9.png] view at source ↗
read the original abstract

Many consumer decisions are repeated choices under uncertainty. Standard models capture these decisions using Bayesian learning and dynamic programming: consumers update beliefs from feedback and use those beliefs to guide future choices. In many markets, however, learning does not restart when consumers enter a new context: prior experience with a brand, product, or provider can shape beliefs in later, related decisions. We study this cross-context knowledge transfer, or meta-learning, in sequential choice. We design a hierarchical laboratory task in which participants repeatedly choose among airlines across routes and observe noisy binary outcomes. Reduced-form evidence shows that participants improve not only within routes, but also across routes: they choose better airlines earlier in later routes and reduce pseudo-regret. To identify the mechanism behind this transfer, we compare human choices to a no-transfer benchmark and a fully integrated Bayesian meta-learning benchmark. In particular, we introduce a class of boundedly rational meta dynamic programming policies, BRMDP(D), that approximate full integration using a limited number of hyper-posterior draws, denoted by D. Trial-by-trial likelihood comparisons show that low-D boundedly rational meta-learning, especially BRMDP(1), fits participant behavior better than both no transfer and fully integrated Bayesian transfer. Consumers, therefore, transfer brand-level regularities across contexts, but through coarse representations of prior uncertainty. The findings imply that models of consumer learning should allow for approximate cross-context transfer, and that managerial counterfactuals based on either no-transfer or fully integrated learning can be misleading.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript examines cross-context knowledge transfer (meta-learning) in sequential consumer choice under uncertainty. Participants complete a hierarchical laboratory task involving repeated choices among airlines across multiple routes, receiving noisy binary feedback on outcomes. Reduced-form analyses show within-route learning as well as cross-route improvement, with better early-route choices and lower pseudo-regret in later routes. The authors introduce a family of boundedly rational meta-dynamic-programming policies, BRMDP(D), that approximate full Bayesian integration over a hyper-posterior by drawing only D samples. Trial-by-trial likelihood comparisons on human data indicate that low-D policies, particularly BRMDP(1), outperform both a no-transfer benchmark and a fully integrated Bayesian meta-learning model, supporting the claim that consumers transfer brand-level regularities via coarse representations of prior uncertainty.

Significance. If the empirical comparisons hold, the paper offers a useful middle ground between no-transfer and fully rational meta-learning models, with direct implications for consumer-behavior modeling and managerial counterfactuals. The BRMDP(D) construction supplies a computationally tractable approximation whose free parameter D is directly interpretable as the granularity of uncertainty representation. The work also supplies falsifiable predictions via likelihood rankings and reduced-form cross-route metrics, which are strengths for a field that often relies on purely qualitative claims about transfer.

major comments (2)
  1. [§4] §4 (Results), likelihood comparisons: the reported superiority of BRMDP(1) over the fully integrated benchmark and the no-transfer model is central to the main claim, yet the manuscript provides no standard errors, bootstrap intervals, or formal statistical tests on the likelihood differences; without these, it is difficult to judge whether the ranking is robust to sampling variation across participants.
  2. [Experimental design] Experimental design section: the description of the hierarchical task does not state the exact number of routes, trials per route, or total participant count; these quantities are load-bearing for interpreting both the reduced-form cross-route improvement and the power of the model-comparison results.
minor comments (3)
  1. [§3.2] §3.2, definition of BRMDP(D): the precise sampling procedure for the D hyper-posterior draws and how the resulting policy is computed should be written as a short algorithm or pseudocode for reproducibility.
  2. [Figure 2] Figure 2 (or equivalent likelihood plot): adding participant-level variability bands or reporting the number of observations per route would improve interpretability of the visual comparison.
  3. [§3] Notation: the symbol for the hyper-posterior is introduced without an explicit equation reference; adding a numbered display equation would reduce ambiguity when readers compare BRMDP(D) to the full Bayesian benchmark.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which have helped us strengthen the statistical robustness and clarity of the manuscript. We address each major point below and have incorporated revisions accordingly.

read point-by-point responses
  1. Referee: [§4] §4 (Results), likelihood comparisons: the reported superiority of BRMDP(1) over the fully integrated benchmark and the no-transfer model is central to the main claim, yet the manuscript provides no standard errors, bootstrap intervals, or formal statistical tests on the likelihood differences; without these, it is difficult to judge whether the ranking is robust to sampling variation across participants.

    Authors: We agree that measures of uncertainty are necessary to evaluate the robustness of the model ranking. In the revised manuscript we now include bootstrap 95% confidence intervals for the per-participant log-likelihood differences, obtained by resampling participants 10,000 times. These intervals exclude zero for BRMDP(1) versus both the no-transfer and full Bayesian benchmarks. We have also added a paired Wilcoxon signed-rank test on the individual-level likelihoods (p < 0.01 for the primary comparisons) and a new panel in Figure 4 displaying the distribution of differences. These additions directly address the concern while preserving the original likelihood values. revision: yes

  2. Referee: [Experimental design] Experimental design section: the description of the hierarchical task does not state the exact number of routes, trials per route, or total participant count; these quantities are load-bearing for interpreting both the reduced-form cross-route improvement and the power of the model-comparison results.

    Authors: We appreciate the referee noting this omission. The revised Experimental Design section now states that participants completed 4 routes with 20 trials each, for a total of 96 participants (after excluding 4 who failed attention checks). We have inserted a new Table 1 that summarizes all task parameters, including the number of airlines per route (3), the binary feedback noise level, and the route-specific outcome probabilities. These explicit quantities should now allow readers to assess both the reduced-form cross-route effects and the statistical power of the likelihood comparisons. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper defines BRMDP(D) explicitly as an approximation to full Bayesian meta-learning via a finite number of hyper-posterior draws and then performs direct likelihood comparisons of this family, a no-transfer benchmark, and the full-integration model against observed human choices in the airline-route task. These comparisons are statistical fits to external data rather than any quantity being recovered by construction from its own inputs. No self-citations, uniqueness theorems, or ansatzes are invoked to justify the central claim that low-D variants provide a better account of transfer; the derivation from task design through reduced-form cross-route improvement to model ranking therefore remains independent of the reported result.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 1 invented entities

The work rests on standard Bayesian updating assumptions and introduces BRMDP(D) as an approximation mechanism; D controls the coarseness of the representation and is central to the bounded-rationality claim.

free parameters (1)
  • D (number of hyper-posterior draws)
    Controls the degree of approximation in BRMDP(D) policies; low values like D=1 are shown to fit data best.
axioms (2)
  • domain assumption Consumers perform Bayesian belief updating from noisy feedback
    Invoked as the baseline for both full-integration and bounded-rationality models.
  • domain assumption Cross-route knowledge transfer occurs via meta-learning over brand-level regularities
    Core premise of the hierarchical task design.
invented entities (1)
  • BRMDP(D) policies no independent evidence
    purpose: Approximate full Bayesian meta-learning using limited hyper-posterior draws
    Newly introduced class to capture bounded rationality; no independent evidence provided outside the model fits.

pith-pipeline@v0.9.0 · 5810 in / 1385 out tokens · 47890 ms · 2026-05-20T19:48:32.230079+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

192 extracted references · 192 canonical work pages

  1. [1]

    P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47 0 (2-3): 0 235--256, 2002

  2. [2]

    A. V. Banerjee. A simple model of herd behavior. The quarterly journal of economics, 107 0 (3): 0 797--817, 1992

  3. [3]

    S. Basu, B. Kveton, M. Zaheer, and C. Szepesv \'a ri. No regrets for learning the prior in bandits. Advances in neural information processing systems, 34: 0 28029--28041, 2021

  4. [4]

    R. Bellman. A problem in the sequential design of experiments. Sankhy \=a : The Indian Journal of Statistics (1933-1960) , 16 0 (3/4): 0 221--229, 1956

  5. [5]

    R. Bellman. Dynamic programming. science, 153 0 (3731): 0 34--37, 1966

  6. [6]

    R. Bhui, L. Lai, and S. J. Gershman. Resource-rational decision making. Current Opinion in Behavioral Sciences, 41: 0 15--21, 2021

  7. [7]

    M. Binz, I. Dasgupta, A. K. Jagadish, M. Botvinick, J. X. Wang, and E. Schulz. Meta-learned models of cognition. Behavioral and Brain Sciences, 47: 0 e147, 2024

  8. [8]

    J. A. Bohren and D. N. Hauser. Misspecified models in learning and games. Annual Review of Economics, 17, 2025

  9. [9]

    T. Bondi. Alone, together: A model of social (mis) learning from consumer reviews. Marketing Science, 44 0 (6): 0 1258--1277, 2025

  10. [10]

    Bubeck, N

    S. Bubeck, N. Cesa-Bianchi, et al. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning , 5 0 (1): 0 1--122, 2012

  11. [11]

    Callaway, B

    F. Callaway, B. Van Opheusden, S. Gul, P. Das, P. M. Krueger, T. L. Griffiths, and F. Lieder. Rational use of cognitive resources in human planning. Nature human behaviour, 6 0 (8): 0 1112--1125, 2022

  12. [12]

    C. F. Camerer, T.-H. Ho, and J.-K. Chong. A cognitive hierarchy model of games. The quarterly journal of economics, 119 0 (3): 0 861--898, 2004

  13. [13]

    Chater, J.-Q

    N. Chater, J.-Q. Zhu, J. Spicer, J. Sundh, P. Le \'o n-Villagr \'a , and A. Sanborn. Probabilistic biases meet the bayesian brain. Current Directions in Psychological Science, 29 0 (5): 0 506--512, 2020

  14. [14]

    H. Che, T. Erdem, and T. S. \"O nc \"u . Consumer learning and evolution of consumer brand preferences. Quantitative Marketing and Economics, 13: 0 173--202, 2015

  15. [15]

    A. T. Ching, T. Erdem, and M. P. Keane. Learning models: An assessment of progress, challenges, and new developments. Marketing Science, 32 0 (6): 0 913--938, 2013

  16. [16]

    A. T. Ching, T. Erdem, and M. P. Keane. Empirical models of learning dynamics: A survey of recent developments. Handbook of marketing decision models, pages 223--257, 2017

  17. [17]

    Coscelli and M

    A. Coscelli and M. Shum. An empirical model of learning and patient spillovers in new drug entry. Journal of econometrics, 122 0 (2): 0 213--246, 2004

  18. [18]

    G. S. Crawford and M. Shum. Uncertainty and learning in pharmaceutical demand. econometrica, 73 0 (4): 0 1137--1173, 2005

  19. [19]

    N. D. Daw et al. Trial-by-trial data analysis using computational models. Decision making, affect, and learning: Attention and performance XXIII, 23 0 (1): 0 3--38, 2011

  20. [20]

    M. H. DeGroot. Optimal statistical decisions. John Wiley & Sons, 2005

  21. [21]

    Eluchans, G

    M. Eluchans, G. L. Lancia, A. Maselli, M. D’Alessandro, J. R. Gordon, and G. Pezzulo. Adaptive planning depth in human problem-solving. Royal Society Open Science, 12 0 (4), 2025

  22. [22]

    T. Erdem. An empirical analysis of umbrella branding. Journal of Marketing Research, 35 0 (3): 0 339--351, 1998

  23. [23]

    Erdem and M

    T. Erdem and M. P. Keane. Decision-making under uncertainty: Capturing dynamic brand choice processes in turbulent consumer goods markets. Marketing science, 15 0 (1): 0 1--20, 1996

  24. [24]

    Gabaix, D

    X. Gabaix, D. Laibson, G. Moloche, and S. Weinberg. Costly information acquisition: Experimental analysis of a boundedly rational model. American Economic Review, 96 0 (4): 0 1043--1068, 2006

  25. [25]

    Gelman, J

    A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian data analysis. Chapman and Hall/CRC, 1995

  26. [26]

    S. J. Gershman. Deconstructing the human algorithms for exploration. Cognition, 173: 0 34--42, 2018

  27. [27]

    Goldfarb, T.-H

    A. Goldfarb, T.-H. Ho, W. Amaldoss, A. L. Brown, Y. Chen, T. H. Cui, A. Galasso, T. Hossain, M. Hsu, N. Lim, et al. Behavioral models of managerial decision-making. Marketing Letters, 23 0 (2): 0 405--421, 2012

  28. [28]

    T. L. Griffiths, F. Callaway, M. B. Chang, E. Grant, P. M. Krueger, and F. Lieder. Doing more with less: meta-reasoning and meta-learning in humans and machines. Current Opinion in Behavioral Sciences, 29: 0 24--30, 2019

  29. [29]

    T. L. Griffiths, N. Chater, and J. B. Tenenbaum. Bayesian models of cognition: Reverse engineering the mind. MIT Press, 2024

  30. [30]

    Guan and H

    J. Guan and H. Xiong. Improved bayes regret bounds for multi-task hierarchical bayesian bandit algorithms. Advances in Neural Information Processing Systems, 37: 0 72964--72999, 2024

  31. [31]

    M. K. Ho, J. D. Cohen, and T. L. Griffiths. Rational simplification and rigidity in human planning. Psychological Science, 34 0 (11): 0 1281--1292, 2023

  32. [32]

    T. H. Ho, N. Lim, and C. F. Camerer. Modeling the psychology of consumer and firm behavior with behavioral economics. Journal of marketing Research, 43 0 (3): 0 307--331, 2006

  33. [33]

    C. F. Hofacker, H. N. Nguyen, and M. Fina. Bayesian inference and consumer behavioral theory. Psychology & Marketing, 41 0 (12): 0 3144--3156, 2024

  34. [34]

    J. Hong, B. Kveton, M. Zaheer, and M. Ghavamzadeh. Hierarchical bayesian bandits. In International Conference on Artificial Intelligence and Statistics, pages 7724--7741. PMLR, 2022

  35. [35]

    J. W. Hutchinson and E. M. Eisenstein. Consumer learning and expertise. Handbook of consumer psychology, 4: 0 103--132, 2008

  36. [36]

    J. W. Hutchinson and R. J. Meyer. Dynamic decision making: Optimal policies and actual behavior in sequential choice problems. Marketing Letters, 5: 0 369--382, 1994

  37. [37]

    R. E. Kass and A. E. Raftery. Bayes factors. Journal of the american statistical association, 90 0 (430): 0 773--795, 1995

  38. [38]

    C. Kemp, A. Perfors, and J. B. Tenenbaum. Learning overhypotheses with hierarchical bayesian models. Developmental science, 10 0 (3): 0 307--321, 2007

  39. [39]

    Koller and N

    D. Koller and N. Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009

  40. [40]

    Kveton, M

    B. Kveton, M. Konobeev, M. Zaheer, C.-w. Hsu, M. Mladenov, C. Boutilier, and C. Szepesvari. Meta-thompson sampling. In International Conference on Machine Learning, pages 5884--5893. PMLR, 2021

  41. [41]

    Lai and S

    L. Lai and S. J. Gershman. Human decision making balances reward maximization and policy compression. PLOS Computational Biology, 20 0 (4): 0 e1012057, 2024

  42. [42]

    T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in applied mathematics, 6 0 (1): 0 4--22, 1985

  43. [43]

    Lattimore and C

    T. Lattimore and C. Szepesv \'a ri. Bandit algorithms. Cambridge University Press, 2020

  44. [44]

    Lieder and T

    F. Lieder and T. L. Griffiths. Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources. Behavioral and brain sciences, 43: 0 e1, 2020

  45. [45]

    S. Lin, J. Zhang, and J. R. Hauser. Learning from experience, simply. Marketing Science, 34 0 (1): 0 1--19, 2015

  46. [46]

    Liu and A

    J. Liu and A. Ansari. Understanding consumer dynamic decision making under competing loyalty programs. Journal of Marketing Research, 57 0 (3): 0 422--444, 2020

  47. [48]

    McCoy, R

    J. McCoy, R. Ciulli, and E. Bradlow. Two-for-one conjoint: Bayesian cross-category learning for shared-attribute categories. Available at SSRN 4136593, 2022

  48. [49]

    R. J. Meyer and J. W. Hutchinson. (when) are we dynamically optimal? a psychological field guide for marketing modelers. Journal of Marketing, 80 0 (5): 0 20--33, 2016

  49. [50]

    R. J. Meyer and Y. Shi. Sequential choice under ambiguity: Intuitive solutions to the armed-bandit problem. Management science, 41 0 (5): 0 817--834, 1995

  50. [51]

    C. A. Montgomery and B. Wernerfelt. Risk reduction and umbrella branding. Journal of Business, pages 31--50, 1992

  51. [52]

    S. Nabi, H. Nassif, J. Hong, H. Mamani, and G. Imbens. Bayesian meta-prior learning using empirical bayes. Management Science, 68 0 (3): 0 1737--1755, 2022

  52. [53]

    J. Rust. Optimal replacement of gmc bus engines: An empirical model of harold zurcher. Econometrica: Journal of the Econometric Society, pages 999--1033, 1987

  53. [54]

    Salakhutdinov, J

    R. Salakhutdinov, J. B. Tenenbaum, and A. Torralba. Learning with hierarchical-deep models. IEEE transactions on pattern analysis and machine intelligence, 35 0 (8): 0 1958--1971, 2012

  54. [55]

    Schulz and S

    E. Schulz and S. J. Gershman. The algorithmic architecture of exploration in the human brain. Current opinion in neurobiology, 55: 0 7--14, 2019

  55. [56]

    Schulz, N

    E. Schulz, N. T. Franklin, and S. J. Gershman. Finding structure in multi-armed bandits. Cognitive psychology, 119: 0 101261, 2020

  56. [57]

    D. R. Shanks, R. J. Tunney, and J. D. McCarthy. A re-examination of probability matching and rational choice. Journal of Behavioral Decision Making, 15 0 (3): 0 233--250, 2002

  57. [58]

    H. A. Simon. A behavioral model of rational choice. The quarterly journal of economics, pages 99--118, 1955

  58. [59]

    Sridhar, R

    K. Sridhar, R. Bezawada, and M. Trivedi. Investigating the drivers of consumer cross-category learning for new products using multiple data sets. Marketing Science, 31 0 (4): 0 668--688, 2012

  59. [60]

    R. S. Sutton, A. G. Barto, et al. Reinforcement learning: An introduction, volume 1. MIT press Cambridge, 1998

  60. [61]

    S. S. Tehrani and A. T. Ching. A heuristic approach to explore: The value of perfect information. Management Science, 70 0 (5): 0 3200--3224, 2024

  61. [62]

    J. B. Tenenbaum, C. Kemp, T. L. Griffiths, and N. D. Goodman. How to grow a mind: Statistics, structure, and abstraction. science, 331 0 (6022): 0 1279--1285, 2011

  62. [63]

    Vul and H

    E. Vul and H. Pashler. Measuring the crowd within: Probabilistic representations within individuals. Psychological Science, 19 0 (7): 0 645--647, 2008

  63. [64]

    E. Vul, N. Goodman, T. L. Griffiths, and J. B. Tenenbaum. One and done? optimal decisions from very few samples. Cognitive science, 38 0 (4): 0 599--637, 2014

  64. [65]

    Wernerfelt

    B. Wernerfelt. Umbrella branding as a signal of new product quality: An example of signalling by posting a bond. The RAND Journal of Economics, pages 458--466, 1988

  65. [66]

    C. M. Wu, E. Schulz, M. Speekenbrink, J. D. Nelson, and B. Meder. Generalization guides human exploration in vast decision spaces. Nature human behaviour, 2 0 (12): 0 915--924, 2018

  66. [67]

    Xu and J

    F. Xu and J. B. Tenenbaum. Word learning as bayesian inference. Psychological review, 114 0 (2): 0 245, 2007

  67. [68]

    L. Yang, O. Toubia, and M. G. De Jong. A bounded rationality model of information search and choice in preference measurement. Journal of Marketing Research, 52 0 (2): 0 166--183, 2015

  68. [69]

    Nature Human Behaviour, 1, 0017 , author=

    Coherency maximizing exploration in the supermarket. Nature Human Behaviour, 1, 0017 , author=

  69. [70]

    International Conference on Machine Learning , pages=

    Meta-thompson sampling , author=. International Conference on Machine Learning , pages=. 2021 , organization=

  70. [71]

    Journal of marketing Research , volume=

    Modeling the psychology of consumer and firm behavior with behavioral economics , author=. Journal of marketing Research , volume=. 2006 , publisher=

  71. [72]

    Marketing Letters , volume=

    Behavioral models of managerial decision-making , author=. Marketing Letters , volume=. 2012 , publisher=

  72. [73]

    Royal Society Open Science , volume=

    Adaptive planning depth in human problem-solving , author=. Royal Society Open Science , volume=. 2025 , publisher=

  73. [74]

    The quarterly journal of economics , volume=

    A simple model of herd behavior , author=. The quarterly journal of economics , volume=. 1992 , publisher=

  74. [75]

    Psychological Science , volume=

    Rational simplification and rigidity in human planning , author=. Psychological Science , volume=. 2023 , publisher=

  75. [76]

    Nature human behaviour , volume=

    Rational use of cognitive resources in human planning , author=. Nature human behaviour , volume=. 2022 , publisher=

  76. [77]

    PLOS Computational Biology , volume=

    Human decision making balances reward maximization and policy compression , author=. PLOS Computational Biology , volume=. 2024 , publisher=

  77. [78]

    Current Directions in Psychological Science , volume=

    Probabilistic biases meet the Bayesian brain , author=. Current Directions in Psychological Science , volume=. 2020 , publisher=

  78. [79]

    Behavioral and Brain Sciences , volume=

    Meta-learned models of cognition , author=. Behavioral and Brain Sciences , volume=. 2024 , publisher=

  79. [80]

    Annual Review of Economics , volume=

    Misspecified models in learning and games , author=. Annual Review of Economics , volume=. 2025 , publisher=

  80. [81]

    Psychology & Marketing , volume=

    Bayesian inference and consumer behavioral theory , author=. Psychology & Marketing , volume=. 2024 , publisher=

Showing first 80 references.