pith. machine review for the scientific record.

arxiv: 2605.09716 · v1 · submitted 2026-05-10 · 💻 cs.AI

Recognition: 2 theorem links


Medical Model Synthesis Architectures: A Case Study

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 03:18 UTC · model grok-4.3

classification 💻 cs.AI
keywords medical AI · differential diagnosis · probabilistic models · language models · uncertainty quantification · transparent AI · clinical decision support · model synthesis

The pith

MedMSA is a framework that retrieves medical knowledge via language models and constructs formal probabilistic models to support transparent and calibrated inferences under uncertainty.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper is trying to establish a framework called MedMSA for building AI systems that make clinical predictions under uncertainty in a way that is both practically useful and formally transparent. It does this by having language models retrieve relevant prior medical knowledge for a clinical situation and then using that knowledge to construct a formal probabilistic model. A sympathetic reader would care because medicine involves many unknowns, and current AI struggles with calibrated reasoning and explainability. If the approach works, it could produce things like lists of possible diagnoses, each with an associated uncertainty level. The authors show an initial version for differential diagnosis and suggest it can apply more widely to safe clinical decision support.

Core claim

Given a clinical situation, the MedMSA framework uses language models to retrieve relevant prior knowledge, but constructs a formal probabilistic model to support calibrated and verifiable inferences under uncertainty. An initial proof-of-concept applies this to differential diagnosis by producing an uncertainty-weighted list of potential diagnoses that could explain a patient's symptoms.

What carries the argument

The MedMSA framework, which retrieves prior knowledge using language models and then constructs a formal probabilistic model from it to enable transparent reasoning under uncertainty.
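The abstract gives no worked example, but the pipeline it describes, a synthesized probabilistic program queried by rejection sampling (as in Figure 2), can be sketched. Everything below is illustrative: the condition names, priors, and symptom likelihoods are invented placeholders, not values from MedMSA.

```python
import random

random.seed(0)  # fixed seed for reproducibility

# Hypothetical priors and symptom likelihoods -- placeholder numbers,
# not anything retrieved or synthesized by the paper's pipeline.
PRIOR = {"heart_attack": 0.02, "costochondritis": 0.10, "anxiety": 0.15, "other": 0.73}
P_CHEST_PAIN = {"heart_attack": 0.9, "costochondritis": 0.8, "anxiety": 0.5, "other": 0.1}

def sample_patient():
    """Forward-sample a condition from the prior, then the symptom it generates."""
    r, acc = random.random(), 0.0
    for cond, p in PRIOR.items():
        acc += p
        if r < acc:
            break
    chest_pain = random.random() < P_CHEST_PAIN[cond]
    return cond, chest_pain

def differential(observed_chest_pain=True, n=100_000):
    """Rejection sampling: keep only samples consistent with the observation,
    then report the posterior over conditions as an uncertainty-weighted list."""
    counts = {c: 0 for c in PRIOR}
    kept = 0
    for _ in range(n):
        cond, chest_pain = sample_patient()
        if chest_pain == observed_chest_pain:
            counts[cond] += 1
            kept += 1
    return {c: k / kept for c, k in counts.items()}

posterior = differential()
for cond, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{cond}: {p:.3f}")
```

The output is exactly the artifact the core claim promises: a ranked list of candidate diagnoses with posterior probabilities, where every number is traceable to an inspectable prior or likelihood rather than an opaque model activation.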

If this is right

  • It can generate uncertainty-weighted lists of potential diagnoses for a patient's symptoms.
  • It supports calibrated and verifiable inferences rather than opaque predictions.
  • The approach can be extended to other clinical tasks such as treatment decisions under uncertainty.
  • It enables safer collaborations between AI systems and clinicians by making reasoning inspectable.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This could be tested by comparing the framework's diagnostic lists to those produced by expert physicians on the same cases.
  • Integration with electronic health records might allow automatic checking of the knowledge retrieved by the language model.
  • The same structure could naturally extend to modeling sequences of clinical actions like ordering tests or choosing treatments.

Load-bearing premise

Language models can reliably retrieve relevant prior medical knowledge and a formal probabilistic model can be constructed from it to support calibrated and verifiable inferences under uncertainty.

What would settle it

A test set of patient cases where the uncertainty-weighted diagnosis lists from the framework fail to include the actual cause or assign probabilities that are poorly calibrated against observed outcomes.
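Such a settling test is, concretely, a calibration measurement. A minimal version computes the expected calibration error of the framework's predicted probabilities against observed outcomes; the predictions and outcomes below are invented toy data, and the binning scheme is one common choice, not anything specified by the paper.

```python
def expected_calibration_error(preds, outcomes, n_bins=5):
    """preds: predicted probabilities for an event; outcomes: 1 if it occurred.
    Groups predictions into probability bins and compares each bin's average
    predicted probability with the empirical frequency of the event."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_p = sum(p for p, _ in b) / len(b)
        freq = sum(y for _, y in b) / len(b)
        ece += (len(b) / len(preds)) * abs(avg_p - freq)
    return ece

# Toy data: five 10%-confidence and five 90%-confidence predictions.
preds = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
outcomes = [0, 0, 1, 1, 1, 1, 0, 0, 0, 1]
print(f"ECE: {expected_calibration_error(preds, outcomes):.3f}")
```

A well-calibrated system drives this number toward zero; diagnosis lists that systematically omit the true cause, or assign it miscalibrated probability, would show up here.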

Figures

Figures reproduced from arXiv: 2605.09716 by Adrian Weller, Ayman Ali, Ilia Sucholutsky, Joshua B. Tenenbaum, Katherine M. Collins, Lionel Wong, Marlene Berke, Timothy J. O'Donnell, Tyler Brooke-Wilson.

Figure 1
Figure 1: MedMSA overview. MedMSA takes as input a patient vignette expressed in natural language (A) and synthesizes a causal diagnosis model (B), over which probabilistic inference can be run (C) to estimate the likelihood of various conditions.
Figure 2
Figure 2: Vignettes and differential inferences. (A) Example vignettes, varying in observations provided about patient Sean. (B-C) Probabilities are computed as the number of samples drawn via rejection sampling, aggregated over all runs that compiled (9, 15, 8, and 10 of 20 sampled models resulted in compilable programs).
Figure 3
Figure 3: Example synthesized program. Excerpts from a probabilistic model synthesized by MedMSA for the fourth vignette, wherein Sean has a clicking/crunching noise coming from his chest. Code and comments are generated by the pipeline.
Figure 4
Figure 4: Model intervention. Example single-point edit that a clinician could make on the output of one of the programs synthesized by MedMSA: changing the condition statement for a model synthesized for the second vignette. Inference is rerun to "imagine" the likelihood that Sean is having a heart attack if it turns out he had exercised.
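The single-point edit in Figure 4 amounts to clamping one variable in the generative program and rerunning inference. That pattern can be mimicked on a toy model; the structure and probabilities below are hypothetical placeholders, not the program MedMSA actually synthesized.

```python
import random

random.seed(0)

# Toy generative model: exercise history lowers heart-attack risk, which in
# turn raises the chance of chest pain. All probabilities are illustrative.
def model(exercised=None):
    # Clinician's edit point: pass exercised=True/False to clamp the variable.
    ex = random.random() < 0.5 if exercised is None else exercised
    p_ha = 0.01 if ex else 0.05
    heart_attack = random.random() < p_ha
    chest_pain = random.random() < (0.9 if heart_attack else 0.2)
    return ex, heart_attack, chest_pain

def p_heart_attack_given_pain(exercised=None, n=200_000):
    """Rejection sampling: P(heart attack | chest pain), optionally under the edit."""
    hits = total = 0
    for _ in range(n):
        _, ha, pain = model(exercised)
        if pain:
            total += 1
            hits += ha
    return hits / total

print("baseline:", round(p_heart_attack_given_pain(), 3))
print("edited (exercised=True):", round(p_heart_attack_given_pain(True), 3))
```

Because the model is an explicit program, the clinician's edit is one changed condition and a rerun, exactly the inspectable workflow the figure describes.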
Original abstract

Medicine is rife with high-stakes uncertainty. Doctors routinely make clinical judgments and decisions that juggle many fundamental unknowns, like predictions about what might be causing a patients' symptoms or decisions about what treatment to try next. Despite increasing interest in developing AI systems that aid or even replace doctors in clinical settings, current systems struggle with calibrated reasoning under uncertainty, and are often deeply opaque about their reasoning. We propose a framework for AI systems that can make practically useful but formally transparent clinical predictions under uncertainty. Given a clinical situation, our framework (MedMSA) uses language models to retrieve relevant prior knowledge, but constructs a formal probabilistic model to support calibrated and verifiable inferences under uncertainty. We show how an initial proof-of-concept of this framework can be used for differential diagnosis, producing an uncertainty-weighted list of potential diagnoses that could explain a patients' symptoms, and discuss future applications and directions for applying this framework more generally for safe clinical collaborations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes MedMSA, a framework for AI-assisted clinical reasoning that uses language models to retrieve relevant prior medical knowledge and then constructs a formal probabilistic model to support calibrated, transparent inferences under uncertainty. It presents an initial proof-of-concept application to differential diagnosis, in which the system generates an uncertainty-weighted list of potential diagnoses consistent with a patient's symptoms, and discusses extensions to other clinical tasks.

Significance. If the framework can be realized with the claimed properties, it would address a key limitation of current medical AI systems by combining the broad knowledge access of language models with the verifiability of probabilistic models, potentially enabling safer human-AI collaboration in high-stakes settings. The approach is conceptually aligned with needs for calibrated uncertainty and transparency in medicine, but the manuscript supplies no concrete implementation, data, metrics, or worked examples, so its practical significance cannot yet be evaluated.

major comments (2)
  1. The abstract and proof-of-concept section claim that MedMSA produces an uncertainty-weighted list of diagnoses, yet no symptom set, retrieval procedure, probabilistic construction steps, or output list is provided. Without these details the central claim that the framework supports calibrated and verifiable inferences cannot be assessed.
  2. No equations or algorithmic description define how LM-retrieved knowledge is turned into a formal probabilistic model (e.g., how priors, likelihoods, or uncertainty weights are instantiated). This step is load-bearing for the paper's promise of formal transparency.
minor comments (2)
  1. Abstract: 'patients'' should be 'patient's'.
  2. The manuscript would benefit from explicit comparison to existing hybrid neuro-symbolic or probabilistic medical reasoning systems.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript describing the MedMSA framework. The points raised correctly identify areas where the proof-of-concept requires greater specificity to allow evaluation of the central claims. We address each comment below and will incorporate revisions to strengthen the paper.

Point-by-point responses
  1. Referee: The abstract and proof-of-concept section claim that MedMSA produces an uncertainty-weighted list of diagnoses, yet no symptom set, retrieval procedure, probabilistic construction steps, or output list is provided. Without these details the central claim that the framework supports calibrated and verifiable inferences cannot be assessed.

    Authors: We agree that the manuscript presents the differential-diagnosis proof-of-concept at a conceptual level without a concrete worked example. In the revised version we will add a dedicated subsection containing a specific symptom set, the exact language-model retrieval steps used to obtain relevant medical knowledge, the sequence of operations that instantiate the probabilistic model from that knowledge, and the resulting uncertainty-weighted diagnosis list with explicit posterior probabilities. This addition will make the claims directly assessable. revision: yes

  2. Referee: No equations or algorithmic description define how LM-retrieved knowledge is turned into a formal probabilistic model (e.g., how priors, likelihoods, or uncertainty weights are instantiated). This step is load-bearing for the paper's promise of formal transparency.

    Authors: We acknowledge that the original text does not supply the requested equations or pseudocode. The revision will include a new formal-description subsection that defines the mapping from retrieved knowledge to the probabilistic model. This will contain (i) the prior distribution over diagnoses derived from the retrieved epidemiological and literature data, (ii) the likelihood function P(symptoms | diagnosis) constructed from symptom-diagnosis associations in the retrieved sources, (iii) the uncertainty-weighting mechanism (via Bayesian updating), and (iv) the algorithmic steps in pseudocode. These additions will substantiate the transparency claim. revision: yes
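The mapping promised in (i)-(iii) is standard Bayesian updating over a discrete diagnosis space. A one-cell worked instance, with every prior and likelihood invented purely for illustration (these are not the authors' values):

```python
# Worked Bayesian update: post(d) ∝ prior(d) * P(symptom | d), normalized.
# All numbers are invented placeholders standing in for (i) retrieved priors
# and (ii) retrieved symptom-diagnosis likelihoods.
prior_d = {"pneumothorax": 0.01, "costochondritis": 0.09, "muscle_strain": 0.90}
lik = {"pneumothorax": 0.70, "costochondritis": 0.40, "muscle_strain": 0.05}  # P(clicking chest noise | d)

unnorm = {d: prior_d[d] * lik[d] for d in prior_d}
z = sum(unnorm.values())          # normalizing constant
post = {d: p / z for d, p in unnorm.items()}

for d, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{d}: {p:.3f}")
```

Note how a rare-but-symptom-matching condition (pneumothorax) gains probability relative to its prior while still trailing the common explanations; this interaction between (i) and (ii) is what the promised formal subsection would have to pin down.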

Circularity Check

0 steps flagged

No significant circularity detected in framework proposal

full rationale

The manuscript describes a high-level conceptual framework (MedMSA) that retrieves prior knowledge via language models and then constructs a separate formal probabilistic model for uncertainty-calibrated clinical inferences such as differential diagnosis. No equations, fitted parameters, or derivation steps are exhibited that reduce any claimed output to the inputs by construction, self-definition, or load-bearing self-citation. The proof-of-concept claim rests on external LM retrieval and standard probabilistic construction rather than internal fitting or renaming of results, rendering the argument self-contained against the provided description.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on the assumption that language models provide accurate retrieval of medical knowledge; no free parameters or invented entities are specified in the abstract.

axioms (1)
  • domain assumption Language models can retrieve relevant prior medical knowledge accurately enough to support construction of a formal probabilistic model.
    Central to the MedMSA framework as described in the abstract.

pith-pipeline@v0.9.0 · 5484 in / 1099 out tokens · 42786 ms · 2026-05-12T03:18:20.410719+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

298 extracted references · 298 canonical work pages · 5 internal anchors

  1. Hospital 'Boarding' of Patients in the Emergency Department Increasingly Common, 2017-24. Health Affairs.
  2. Spontaneous pneumothorax: epidemiology, pathophysiology and cause. European Respiratory Review.
  3. Toward the eradication of medical diagnostic errors. Science, 2024.
  4. Hopkins, A. M. and Cornelisse, E. Science, 2026. https://www.science.org/doi/pdf/10.1126/science.aeg8766
  5. Bayesian teaching enables probabilistic reasoning in large language models. Nature Communications.
  6. Reasoning models don't always say what they think. arXiv:2505.05410.
  7. Brodeur, P. G., Buckley, T. A., Kanjee, Z., Goh, E., Ling, E. B., Jain, P., Cabral, S., Abdulnour, R.-E., Haimovich, A. D., Freed, J. A., Olson, A., Morgan, D. J., Hom, J., Gallo, R., McCoy, L. G., Mombini, H., Lucas, C., Fotoohi, M., et al.
  8. Computer-aided diagnosis of acute abdominal pain. Br Med J, 1972.
  9. Complications: A Surgeon's Notes on an Imperfect Science. 2010.
  10. Does biology constrain culture? American Anthropologist.
  11. Reasoning foundations of medical diagnosis: symbolic logic, probability, and value theory aid our understanding of how physicians reason. Science, 1959.
  12. Using large-scale experiments and machine learning to discover theories of human decision-making. Science, 2021.
  13. The pursuit of happiness: A reinforcement learning perspective on habituation and comparisons. PLoS Computational Biology, 2022.
  14. Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-The-Fly. arXiv:2506.16755.
  15. "Are you really sure?" Understanding the effects of human self-confidence calibration in AI-assisted decision making. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems.
  16. Large language models assume people are more rational than we really are. arXiv:2406.17055.
  17. Concept alignment. arXiv:2401.08672.
  18. Testing theory of mind in large language models and humans. Nature Human Behaviour, 2024.
  19. Rethinking theory of mind benchmarks for LLMs: Towards a user-centered perspective. arXiv:2504.10839.
  20. Mutual theory of mind in human-AI collaboration: An empirical study with LLM-driven AI agents in a real-time shared workspace task. arXiv:2409.08811.
  21. Cultural transmission and the diffusion of innovations: Adoption dynamics indicate that biased cultural transmission is the predominate force in behavioral change. American Anthropologist, 2001.
  22. Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences, 2007.
  23. Understanding trust and reliance development in AI advice: Assessing model accuracy, model explanations, and experiences from previous interactions. ACM Transactions on Interactive Intelligent Systems, 2024.
  24. AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind. ICLR 2025 Workshop on Foundation Models in the Wild.
  25. The Promise of Artificial Intelligence: Reckoning and Judgment. 2019.
  26. From tool to teammate in a randomized controlled trial of clinician-AI collaborative workflows for diagnosis. npj Digital Medicine.
  27. The Laws of Medicine: Field Notes from an Uncertain Science. 2015.
  28. Collins, K. The Study and Design of Human-AI Thought Partnerships. doi:10.17863/CAM.124836.
  29. Large language models encode clinical knowledge. Nature, 2023.
  30. Toward expert-level medical question answering with large language models. Nature Medicine, 2025.
  31. Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians. Nature Medicine, 2023.
  32. Mitigating LLM biases toward spurious social contexts using direct preference optimization. arXiv:2604.02585.
  33. Large language model performance and clinical reasoning tasks. JAMA Network Open, 2026.
  34. Meaningful Long-Term Thought Partnerships of Minds and Machines. Current Directions in Psychological Science.
  35. Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People. The Fourteenth International Conference on Learning Representations.
  36. Uncertainty and the shaping of medical decisions. Hastings Center Report, 1991.
  37. Teaching machines to doubt. Nature Medicine, 2025.
  38. Modeling open-world cognition as on-demand synthesis of probabilistic models. arXiv:2507.12547.
  39. Bounded Rationality as a Strategy for Cognitive Science.
  40. Algorithmic monoculture and social welfare. Proceedings of the National Academy of Sciences.
  41. Kirk, H. R., Whitefield, A., et al. The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
  42. Cognitive culture: theoretical and empirical insights into social learning strategies. Trends in Cognitive Sciences.
  43. Towards a science of human-AI decision making: An overview of design space in empirical human-subject studies. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency.
  44. AI models collapse when trained on recursively generated data. Nature.
  45. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nature Medicine.
  46. Learning personalized decision support policies.
  47. Accuracy-Time Tradeoffs in AI-Assisted Decision Making under Time Pressure. Proceedings of the 29th International Conference on Intelligent User Interfaces.
  48. A Decision Theoretic Framework for Measuring AI Reliance. The 2024 ACM Conference on Fairness, Accountability, and Transparency.
  49. Algorithm appreciation: People prefer algorithmic to human judgment. Organizational Behavior and Human Decision Processes.
  50. Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General.
  51. Biden, J. R., 2023.
  52. The unequal opportunities of large language models: Examining demographic biases in job recommendations by ChatGPT and LLaMA. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization.
  53. Gender bias and stereotypes in large language models. Proceedings of the ACM Collective Intelligence Conference.
  54. Deskilling, upskilling, and reskilling: a case for hybrid intelligence. Morals & Machines.
  55. Bhatt, U. and Sargeant, H., 2024.
  56. Loafing in the era of generative AI. Organizational Dynamics.
  57. The paradox of choice. In Positive Psychology in Practice: Promoting Human Flourishing in Work, Health, Education, and Everyday Life.
  58. A multistage learning model for cultural transmission: Evidence from three indigenous societies. In Social Learning and Innovation in Contemporary Hunter-Gatherers: Evolutionary and Ethnographic Perspectives.
  59. The optimal timing of teaching and learning across the life course. Philosophical Transactions of the Royal Society B.
  60. Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function.
  61. Dynamic social learning in temporally and spatially variable environments. Royal Society Open Science.
  62. When does selection favor learning from the old? Social learning in age-structured populations. PLoS ONE.
  63. What makes inventions become traditions? Annual Review of Anthropology.
  64. The scientist as child. Philosophy of Science.
  65. The child as hacker. Trends in Cognitive Sciences.
  66. Local search and the evolution of world models. Topics in Cognitive Science.
  67. Why copy others? Insights from the social learning strategies tournament. Science.
  68. Adaptive strategies for cumulative cultural learning. Journal of Theoretical Biology.
  69. The elephant in the room: What matters cognitively in cumulative technological culture. Behavioral and Brain Sciences.
  70. Cultural evolutionary perspectives on creativity and human innovation. Trends in Ecology & Evolution.
  71. Flexible learning, rather than inveterate innovation or copying, drives cumulative knowledge gain. Science Advances.
  72. Invention as a combinatorial process: evidence from US patents. Journal of the Royal Society Interface.
  73. Machine culture. Nature Human Behaviour.
  74. AI can help humans find common ground in democratic deliberation. Science.
  75. How human-AI feedback loops alter human perceptual, emotional and social judgements. Nature Human Behaviour.
  76. On the informativeness of supervision signals. Uncertainty in Artificial Intelligence.
  77. Risks and benefits of large language models for the environment. Environmental Science & Technology.
  78. Harms from increasingly agentic algorithmic systems. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency.
  79. Updates in human-AI teams: Understanding and addressing the performance/compatibility tradeoff. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33.
  80. The debate over understanding in AI's large language models. Proceedings of the National Academy of Sciences.

Showing first 80 references.