pith. machine review for the scientific record.

arxiv: 2605.01164 · v1 · submitted 2026-05-01 · 💻 cs.AI

Recognition: unknown

LLMs Should Not Yet Be Credited with Decision Explanation

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 18:38 UTC · model grok-4.3

classification 💻 cs.AI
keywords LLMs · decision explanation · rationale generation · human decision modeling · explanatory standards · credit calibration · position paper

The pith

LLMs should not yet be credited with explaining human decisions, as evidence only supports prediction and rationale generation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper distinguishes three claims with different standards: LLMs can predict what decisions people make, generate plausible rationales for those decisions, and sometimes produce explanatory hypotheses. It claims that typical evidence from LLMs meets the first two but fails to show genuine decision explanation rather than rationalization that fits the predictions after the fact. A reader would care because treating rationales as explanations risks redefining explanatory progress in models of human behavior and could delay better tools. The paper offers a bridge standard requiring explanatory claims to name targets, rule out weaker alternatives, apply process-sensitive tests, and limit scope. This keeps LLMs useful for prediction and hypothesis work without overstating their explanatory reach.

Core claim

The central claim is that LLMs should not yet be credited with decision explanation. Evidence most commonly offered for LLM-based decision accounts directly supports decision prediction and rationale generation, and sometimes explanatory hypothesis generation, but does not distinguish decision explanation from prediction-supportive rationalization. Stronger explanatory credit requires a bridge standard: claims must specify explanatory targets, discriminate against weaker rationalizer alternatives, use target-appropriate process- or intervention-sensitive validation, and bound their scope. Adopting a principle of credit calibration ensures LLMs are credited only for the strongest claim their evidence warrants, and no stronger.
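The credit-calibration principle amounts to a decision rule: check the bridge-standard conditions, and grant only the strongest claim the evidence supports. A minimal sketch of that rule, with illustrative criterion names that are not terminology from the paper:

```python
# Hypothetical sketch of the credit-calibration principle: grant only
# the strongest claim the available evidence warrants, and no stronger.
# All flag names below are illustrative labels, not the paper's terms.

def calibrate_credit(evidence: dict) -> str:
    """Return the strongest claim warranted by boolean evidence flags."""
    meets_bridge_standard = (
        evidence.get("names_explanatory_target", False)      # target specified
        and evidence.get("rules_out_rationalizer", False)    # weaker alternatives excluded
        and evidence.get("process_sensitive_validation", False)  # process/intervention tests
        and evidence.get("scope_bounded", False)             # scope limited
    )
    if meets_bridge_standard:
        return "decision explanation"
    if evidence.get("generates_plausible_rationales", False):
        return "rationale generation"
    if evidence.get("predicts_decisions", False):
        return "decision prediction"
    return "no credit"

# Typical current evidence on the paper's account: accurate prediction
# plus plausible rationales, but no process-sensitive validation.
typical = {"predicts_decisions": True,
           "generates_plausible_rationales": True}
print(calibrate_credit(typical))  # -> rationale generation
```

The ordering encodes the paper's hierarchy of evidential burdens: explanation requires all four bridge-standard conditions, while prediction and rationale generation each need only their own direct evidence.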

What carries the argument

The three-claim distinction (decision prediction, rationale generation, decision explanation) and the bridge standard that sets conditions for granting explanatory credit.

If this is right

  • LLMs can still be credited as predictors of decisions and generators of rationales without claiming explanatory power.
  • Explanatory claims about LLM outputs must name specific targets and apply intervention-sensitive tests to be accepted.
  • Adopting credit calibration preserves LLMs as instruments for hypothesis generation while avoiding premature redefinition of explanation.
  • Related work in human decision modeling can continue using LLMs for prediction tasks while developing separate standards for explanation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Researchers might design new experiments that directly test whether LLMs can meet the bridge standards under controlled process manipulations.
  • The position could extend to other AI applications where prediction accuracy is conflated with explanatory insight, such as medical diagnosis or policy recommendation.
  • If adopted, the calibration principle might encourage hybrid systems that pair LLMs with process-tracing methods from psychology.

Load-bearing premise

The distinction between generating rationales that support predictions and providing genuine explanations based on decision processes is meaningful and currently unbridged by available evidence.

What would settle it

A concrete demonstration in which an LLM specifies a clear explanatory target, rules out rationalization alternatives via process interventions, passes target-appropriate validation, and bounds its scope would show that current evidence suffices for explanatory credit.

read the original abstract

This position paper argues that LLMs should not yet be credited with decision explanation. This matters because recent work increasingly treats accurate behavioral prediction, plausible rationales, and outcome-conditioned reasoning traces as evidence that LLMs explain why people decide as they do, risking a premature redefinition of what counts as explanatory progress in human decision modeling. We first distinguish three claims with different evidential burdens: decision prediction, rationale generation, and decision explanation. We then argue that the evidence most commonly offered for LLM-based decision accounts directly supports the first two claims, and sometimes explanatory hypothesis generation, but does not distinguish decision explanation from prediction-supportive rationalization. Next, we propose a bridge standard for decision-explanation credit: stronger claims should specify explanatory targets, discriminate against weaker rationalizer alternatives, use target-appropriate process- or intervention-sensitive validation, and bound their scope. We then situate this standard against competing views and related literatures, clarifying why it preserves the value of LLMs as predictors, narrators, and hypothesis generators while resisting premature explanatory credit. We conclude with a principle of credit calibration: LLMs should be credited for the strongest claim their evidence warrants, and no stronger; if adopted, this principle can help turn LLMs from persuasive narrators of decisions into more reliable instruments for discovering, testing, and communicating explanations of human behavior.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. This position paper argues that LLMs should not yet be credited with decision explanation. It distinguishes three claims with distinct evidential burdens—decision prediction, rationale generation, and decision explanation—and contends that common evidence (accurate behavioral prediction, plausible rationales, outcome-conditioned traces) supports the first two and sometimes hypothesis generation but does not establish process-level explanation over prediction-supportive rationalization. The authors propose a four-part bridge standard (specify explanatory targets, discriminate against weaker alternatives, apply target-appropriate process- or intervention-sensitive validation, bound scope) drawn from philosophy of science and psychology, situate it against competing views, and conclude with a credit-calibration principle that LLMs should receive credit only for the strongest claim their evidence warrants.

Significance. If the distinctions and standards hold, the paper provides a useful conceptual framework for calibrating claims in LLM-assisted human decision modeling. It explicitly credits LLMs' strengths as predictors, narrators, and hypothesis generators while resisting over-attribution of explanatory power, which could help maintain rigor in cognitive modeling and AI applications. The argument is internally consistent, avoids circularity, and offers falsifiable criteria that future work can meet to earn explanatory credit.

minor comments (2)
  1. [Abstract] The abstract previews the four-part standard clearly but could name the four elements in a single sentence for quicker reader orientation.
  2. [Introduction] Section headings and subsection numbering are consistent, but a short table summarizing the three claims and their evidential burdens would improve scannability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of the manuscript, their assessment of its significance for calibrating claims in LLM-assisted decision modeling, and their recommendation to accept. We are pleased that the distinctions, bridge standard, and credit-calibration principle were found internally consistent and useful.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a conceptual position piece that distinguishes three claims (decision prediction, rationale generation, decision explanation) by their differing evidential requirements, then motivates a four-part bridge standard by reference to external philosophy-of-science and psychology literatures. No equations, fitted parameters, or self-referential definitions appear; the credit-calibration principle follows directly from the distinctions drawn rather than reducing to any input by construction. All load-bearing steps cite independent sources or logical analysis, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on assumed distinctions in what counts as explanation versus rationalization in decision modeling, without derivation from data or formal proof.

axioms (2)
  • domain assumption Decision prediction, rationale generation, and decision explanation are distinct claims requiring different levels of evidence.
    Invoked at the start to separate the three claims and assign evidential burdens.
  • domain assumption Current LLM evidence supports prediction and rationales but fails to rule out rationalization as an alternative to explanation.
    Core premise used to argue against crediting explanation.

pith-pipeline@v0.9.0 · 5526 in / 1359 out tokens · 36416 ms · 2026-05-09T18:38:41.283081+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

39 extracted references · 31 canonical work pages · 4 internal anchors

  1. [1]

Aher, G., Arriaga, R. I., and Kalai, A. T. (2023). Using large language models to simulate multiple humans and replicate human subject studies. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 337–371

  2. [2]

Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., and Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3):337–351. doi:10.1017/pan.2023.2

  3. [3]

Horton, J. J., Filippas, A., and Manning, B. S. (2023). Large language models as simulated economic agents: What can we learn from Homo Silicus? NBER Working Paper 31122. doi:10.3386/w31122

  4. [4]

    Binz, M., Akata, E., Bethge, M., et al. (2025). A foundation model to predict and capture human cognition. Nature, 644(8078):1002–1009. doi:10.1038/s41586-025-09215-4

  5. [5]

Zhu, J.-Q., Xie, H., Arumugam, D., Wilson, R. C., and Griffiths, T. L. (2025). Using reinforcement learning to train large language models to explain human decisions. arXiv preprint arXiv:2505.11614

  6. [6]

Roberts, S. and Pashler, H. (2000). How persuasive is a good fit? A comment on theory testing. Psychological Review, 107(2):358–367. doi:10.1037/0033-295X.107.2.358

  7. [7]

Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3):289–310. doi:10.1214/10-STS330

  8. [8]

Hofman, J. M., Watts, D. J., Athey, S., Garip, F., Griffiths, T. L., Kleinberg, J., Margetts, H., Mullainathan, S., Salganik, M. J., Vazire, S., Vespignani, A., and Yarkoni, T. (2021). Integrating explanation and prediction in computational social science. Nature, 595(7866):181–188. doi:10.1038/s41586-021-03659-0

  9. [9]

Woodward, J. (2003). Making Things Happen: A Theory of Causal Explanation. Oxford University Press

  10. [10]

Pearl, J. (2009). Causality: Models, Reasoning, and Inference. 2nd edition. Cambridge University Press

  11. [11]

Peters, J., Bühlmann, P., and Meinshausen, N. (2016). Causal inference using invariant prediction: Identification and confidence intervals. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(5):947–1012. doi:10.1111/rssb.12167

  12. [12]

Nisbett, R. E. and Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3):231–259. doi:10.1037/0033-295X.84.3.231

  13. [13]

    Johansson, P., Hall, L., Sikström, S., and Olsson, A. (2005). Failure to detect mismatches between intention and outcome in a simple decision task.Science, 310(5745):116–119. doi:10.1126/science.1111709

  14. [14]

Jacovi, A. and Goldberg, Y. (2020). Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4198–4205. doi:10.18653/v1/2020.acl-main.386

  15. [15]

Turpin, M., Michael, J., Perez, E., and Bowman, S. R. (2023). Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting. In Advances in Neural Information Processing Systems, 36:74952–74965

  16. [16]

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., and Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems, 35:24824–24837

  17. [17]

    Lanham, T., Chen, A., Radhakrishnan, A., et al. (2023). Measuring faithfulness in chain-of-thought reasoning. arXiv preprint arXiv:2307.13702

  18. [18]

Paul, D., West, R., Bosselut, A., and Faltings, B. (2024). Making reasoning matter: Measuring and improving faithfulness of chain-of-thought reasoning. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 15012–15032. doi:10.18653/v1/2024.findings-emnlp.882

  19. [19]

Yu, Q., Tartaglini, A., Hase, P., Guestrin, C., and Potts, C. (2026). Outcome rewards do not guarantee verifiable or causally important reasoning. arXiv preprint arXiv:2604.22074

  20. [20]

Peterson, J. C., Bourgin, D. D., Agrawal, M., Reichman, D., and Griffiths, T. L. (2021). Using large-scale experiments and machine learning to discover theories of human decision-making. Science, 372(6547):1209–1214. doi:10.1126/science.abe2629

  22. [22]

Reichman, D., Peterson, J. C., and Griffiths, T. L. (2024). Machine learning for modeling human decisions. Decision, 11(4):619–632. doi:10.1037/dec0000242

  23. [23]

Plonsky, O., Apel, R., Ert, E., Tennenholtz, M., Bourgin, D., Peterson, J. C., Reichman, D., Griffiths, T. L., Russell, S. J., Carter, E. C., Cavanagh, J. F., and Erev, I. (2025). Predicting human decisions with behavioural theories and machine learning. Nature Human Behaviour, 9(11):2271–2284. doi:10.1038/s41562-025-02267-6

  24. [24]

Binz, M. and Schulz, E. (2024). Turning large language models into cognitive models. In The Twelfth International Conference on Learning Representations

  25. [25]

Yarkoni, T. and Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6):1100–1122. doi:10.1177/1745691617693393

  26. [26]

Liu, R., Geng, J., Peterson, J. C., Sucholutsky, I., and Griffiths, T. L. (2025). Large language models assume people are more rational than we really are. In The Thirteenth International Conference on Learning Representations

  27. [27]

Nguyen, T. N., Jamale, K., and Gonzalez, C. (2024). Predicting and understanding human action decisions: Insights from large language models and cognitive instance-based learning. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 12(1):126–136. doi:10.1609/hcomp.v12i1.31607

  28. [28]

Feng, Y., Choudhary, V., and Shrestha, Y. R. (2025). Noise, adaptation, and strategy: Assessing LLM fidelity in decision-making. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 7693–7706. doi:10.18653/v1/2025.emnlp-main.391

  29. [29]

Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38. doi:10.1016/j.artint.2018.07.007

  30. [30]

Doshi-Velez, F. and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608

  31. [31]

Ziems, C., Held, W., Shaikh, O., Chen, J., Zhang, Z., and Yang, D. (2024). Can large language models transform computational social science? Computational Linguistics, 50(1):237–291. doi:10.1162/coli_a_00502

  32. [32]

Ericsson, K. A. and Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87(3):215–251. doi:10.1037/0033-295X.87.3.215

  33. [33]

Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215. doi:10.1038/s42256-019-0048-x

  34. [34]

Lyu, Q., Havaldar, S., Stein, A., Zhang, L., Rao, D., Wong, E., Apidianaki, M., and Callison-Burch, C. (2023). Faithful chain-of-thought reasoning. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Paper...

  35. [35]

Chen, Y., Benton, J., Radhakrishnan, A., Uesato, J., Denison, C., Schulman, J., Somani, A., Hase, P., Wagner, M., Roger, F., Mikulik, V., Bowman, S. R., Leike, J., Kaplan, J., and Perez, E. (2025). Reasoning models don't always say what they think. arXiv preprint arXiv:2505.05410

  36. [36]

    Datta, A., Zhao, Z., Verma, B., Mamidi, R., Marreddy, M., and Mehler, A. (2026). Large language models decide early and explain later. arXiv preprint arXiv:2604.22266

  37. [37]

Vig, J., Gehrmann, S., Belinkov, Y., Qian, S., Nevo, D., Singer, Y., and Shieber, S. M. (2020). Causal mediation analysis for interpreting neural NLP: The case of gender bias. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4512–4528. doi:10.18653/v1/2020.emnlp-main.363

  38. [38]

Geiger, A., Ibeling, D., Zur, A., Chaudhary, M., Chauhan, S., Huang, J., Arora, A., Wu, Z., Goodman, N. D., Potts, C., and Icard, T. (2023). Causal abstraction: A theoretical foundation for mechanistic interpretability. arXiv preprint arXiv:2301.04709

  39. [39]

Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y. (2021). Toward causal representation learning. Proceedings of the IEEE, 109(5):612–634. doi:10.1109/JPROC.2021.3058954