arxiv: 2604.27894 · v1 · submitted 2026-04-30 · 🧬 q-bio.NC

Recognition: unknown

On Agentic Behavioral Modeling

Belinda Fleischmann, Dirk Ostwald, Franziska Usée, Joram Soch, Rasmus Bruckner, Sean Mulready

Pith reviewed 2026-05-07 05:11 UTC · model grok-4.3

classification 🧬 q-bio.NC

keywords agentic behavioral modeling · Rescorla-Wagner learning · Bayesian inference · two-armed bandit · perceptual discrimination · psychometric function · generative models · cognitive mechanisms


The pith

Agentic behavioral modeling equates Rescorla-Wagner learning with Bayesian inference in symmetric bandits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces agentic behavioral modeling as a framework that treats artificial agents as latent generative hypotheses for cognitive mechanisms and tests them by how well they statistically explain human behavioral data. It formalizes two laboratory tasks as joint probability models of task, agent, and observations, then derives conditional log-likelihoods for inference. Applying the framework yields an agent-centric reading of the psychometric function, explicit optimal policies for each task, and the mathematical equivalence of Rescorla-Wagner learning to Bayesian updating when bandit arms are symmetric. Model and parameter recovery simulations confirm the inference procedures before they are applied to empirical data. A sympathetic reader would see this as a concrete methodological bridge between agent-based AI and the statistical analysis of human cognition.
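
To make the agent-centric reading of the psychometric function concrete, the following minimal Python sketch assumes an agent that observes the contrast difference `c` corrupted by Gaussian noise with standard deviation `sigma`, chooses "right" when the observation exceeds a criterion `eta`, and responds at random with probability `tau`. The names echo the σ, η, τ that appear in the Figure 4 caption, but the roles given to them here are assumptions made for illustration and may differ from the paper's parameterization.

```python
import math
import random

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def psychometric(c, sigma=0.25, eta=0.0, tau=0.0):
    """Marginal probability of choosing 'right' at contrast difference c.

    Assumed (hypothetical) agent: o = c + sigma * noise, choose 'right'
    if o > eta, and with probability tau respond uniformly at random.
    """
    p_policy = normal_cdf((c - eta) / sigma)
    return tau * 0.5 + (1.0 - tau) * p_policy

def simulate_choice(c, sigma=0.25, eta=0.0, tau=0.0, rng=random):
    """Draw one choice (1 = right, 0 = left) from the same assumed agent."""
    if rng.random() < tau:
        return int(rng.random() < 0.5)
    o = c + sigma * rng.gauss(0.0, 1.0)
    return int(o > eta)

if __name__ == "__main__":
    for c in (-0.5, -0.25, 0.0, 0.25, 0.5):
        print(f"c = {c:+.2f}  P(right) = {psychometric(c):.3f}")
```

Under these assumptions the psychometric curve is not fitted as a free-standing function: it is what remains after averaging the agent's choices over its unobserved noisy observations.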

Core claim

The framework of agentic behavioral modeling treats artificial agents as latent generative hypotheses about cognitive mechanisms and evaluates them by their statistical adequacy in explaining human behavior. For a binary perceptual contrast-discrimination task and a symmetric two-armed bandit learning task, the approach constructs joint probability models, derives explicit conditional log-likelihoods, validates variants through recovery simulations, and produces an agent-centric interpretation of the psychometric function, optimal policies for both tasks, and the equivalence between Rescorla-Wagner learning and Bayesian inference under symmetry.
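
As a rough illustration of what a conditional log-likelihood for a bandit agent can look like, the sketch below scores a sequence of observed actions under a delta-rule learner with softmax action selection. The learning rate `lam` and action noise `tau` borrow the symbols λ and τ from the Figure 8 caption; the two-value representation, the softmax form, and the initial value `v0` are assumptions of this sketch, not the paper's specification.

```python
import math

def rw_softmax_loglik(actions, rewards, lam, tau, v0=0.5):
    """Conditional log-likelihood of observed actions given rewards.

    Assumed (hypothetical) agent: one value per arm, delta-rule update with
    learning rate lam, and softmax action selection with temperature tau.
    """
    v = [v0, v0]
    loglik = 0.0
    for a, r in zip(actions, rewards):
        # Log softmax probability of the chosen arm (log-sum-exp form)
        z = [v[0] / tau, v[1] / tau]
        m = max(z)
        log_norm = m + math.log(math.exp(z[0] - m) + math.exp(z[1] - m))
        loglik += z[a] - log_norm
        # Rescorla-Wagner (delta-rule) update of the chosen arm only
        v[a] += lam * (r - v[a])
    return loglik

if __name__ == "__main__":
    actions = [1, 1, 0, 1, 1, 1]   # illustrative data, not from the paper
    rewards = [1, 1, 0, 1, 0, 1]
    for lam in (0.1, 0.5, 0.9):
        print(lam, rw_softmax_loglik(actions, rewards, lam=lam, tau=0.2))
```

Maximizing a function of this kind over `lam` and `tau` for each participant is one common way to obtain the parameter estimates that recovery simulations then try to reproduce.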

What carries the argument

Agentic behavioral modeling (ABM), the construction of joint task-agent-data probability models that enable derivation of conditional log-likelihoods and evaluation of agents as cognitive hypotheses.

If this is right

Optimal policies for the perceptual discrimination and bandit tasks follow directly from the agent models.
The psychometric function arises as the marginal choice probability under the agent's decision policy in the perceptual task.
Model variants for both tasks are distinguishable and recoverable in simulation before data application.
Rescorla-Wagner and Bayesian agents coincide exactly in symmetric stationary bandit environments.
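
One standard way an equivalence of this kind can arise is that the posterior mean of a Beta-Bernoulli model obeys a delta rule whose learning rate shrinks as observations accumulate. The sketch below checks this numerically. It assumes a uniform Beta(1, 1) prior and codes each trial as `x = 1` when the action and reward indices agree, following the event grouping described in the Figure 7 caption; the decaying learning-rate schedule is an assumption of this sketch and not necessarily the form of equivalence the paper derives.

```python
import random

def bayes_posterior_means(xs, a1=1.0, a2=1.0):
    """Posterior mean of s after each trial under a Beta(a1, a2) prior."""
    means = []
    for x in xs:
        a1 += x          # count events that occur with probability s
        a2 += 1 - x      # count events that occur with probability 1 - s
        means.append(a1 / (a1 + a2))
    return means

def delta_rule_means(xs, a1=1.0, a2=1.0):
    """Delta-rule estimate with learning rate 1 / (n + 1), n = a1 + a2."""
    v = a1 / (a1 + a2)   # start at the prior mean
    n = a1 + a2          # pseudo-count of observations so far
    means = []
    for x in xs:
        v += (x - v) / (n + 1.0)   # prediction-error update
        n += 1.0
        means.append(v)
    return means

def simulate_events(s, n_trials, rng):
    """Simulate x_t = 1 if a_t == r_t for random actions in a symmetric bandit."""
    xs = []
    for _ in range(n_trials):
        a = int(rng.random() < 0.5)            # arbitrary action policy
        p_reward = s if a == 1 else 1.0 - s    # symmetric reward probabilities
        r = int(rng.random() < p_reward)
        xs.append(int(a == r))
    return xs

if __name__ == "__main__":
    xs = simulate_events(0.7, 200, random.Random(1))
    gap = max(abs(b - d) for b, d in
              zip(bayes_posterior_means(xs), delta_rule_means(xs)))
    print("largest gap between the two estimates:", gap)
```

The two trajectories agree up to floating-point error because (a1 + x) / (n + 1) can be rewritten as v + (x - v) / (n + 1) with v = a1 / n.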

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same joint-model structure could be used to test whether other learning rules diverge from optimality once symmetry is broken.
Formalizing existing cognitive models inside ABM would allow direct statistical comparison of their generative adequacy on shared datasets.
Synthetic data generated by the optimal agents could serve as a benchmark for detecting when human behavior deviates from the derived policies.
The framework supplies a route to embed reinforcement-learning agents from neuroscience into the same inference pipeline used for behavioral data.

Load-bearing premise

Artificial agents can be treated as valid latent, generative hypotheses about underlying cognitive mechanisms that are statistically adequate for explaining human behavior.

What would settle it

Human choice data from the symmetric two-armed bandit task in which the fitted Rescorla-Wagner and Bayesian agents produce statistically different likelihoods or systematic posterior predictive deviations from observed learning curves.

Figures

Figures reproduced from arXiv: 2604.27894 by Belinda Fleischmann, Dirk Ostwald, Franziska Usée, Joram Soch, Rasmus Bruckner, Sean Mulready.

**Figure 1.** Agentic behavioral modeling (ABM). (A) Basic model partitioning of the ABM scenario. A task model describes the agent choice environment, an agent model describes an algorithmic trial-by-trial solution of the task, commonly aiming to capture human decision processes, and a data model accounts for the latent nature of the agent’s internal dynamics and their indirect observation un…

**Figure 2.** Gabor contrast discrimination task and descriptive statistics. (A) Gabor patch contrast discrimination trial sequence. Two Gabor patches differing in contrast were presented to participants, who were required to indicate whether they perceived the contrast of the left or the right Gabor patch as higher. The correct response yielded a reward of +1, the incorrect response a reward of +0. Rewards were present…

**Figure 3.** Neurocognitive perspective of the Gabor contrast discrimination ABM. Note that the state evolution is independent of the agent’s action, and the agent’s decision is independent of the realized rewards. For implementational details of the task model, please refer to abm_task.py. Agent model: We consider the agent model $A := (T, O, B, V, D, p(o_t \mid c_t), \beta, \phi, \delta)$ (5), where $T$ is the task model and represents t…

**Figure 4.** Model validation for the Gabor contrast discrimination ABM. (A) Model face validity. From left to right, the subpanels visualize the effect of varying one of the parameters of ABM variant A2 on the group psychometric function in isolation. For the effect of σ (left subpanel), the complementary parameter settings are η = 0, τ = 0, for the effect of η (center subpanel), the complem…

**Figure 5.** Model evaluation and post-hoc model validation of the Gabor contrast discrimination case study. (A) Group-cumulative BIC values, corresponding to the sums of the participant-specific BIC values for each ABM variant. Bar colors indicate the results from the experimental data and the simulated post-hoc validation data. Error bars indicate the standard deviation across post-hoc validation simulations. (B) P…

**Figure 6.** Symmetric bandit learning task and descriptive statistics. (A) Symmetric bandit task block mechanics. For each block, a probability state s was realized from a uniform distribution on [0, 1] and the leftwards cursor button endowed with a probability of s to yield a reward of +0 and a probability of 1 − s to yield a reward of +1. Vice versa, the rightwards cursor button was endowed with a probability of 1 − s…

**Figure 7.** Neurocognitive perspective of the symmetric bandit Bayesian ABM. This entails that $\alpha_t^{(1)}$ counts events with probability $s_1$, while $\alpha_t^{(2)}$ counts events with probability $1 - s_1$. The former two events are the joint observations $(a_t = 1, r_t = 1)$ and $(a_t = 0, r_t = 0)$; the latter two events are the joint observations $(a_t = 0, r_t = 1)$ and $(a_t = 1, r_t = 0)$. This counting process is implemented in the update formu…

**Figure 8.** Model validation for the symmetric bandit ABM. (A) Model face validity. The left subpanel visualizes the effect of varying the action noise parameter τ for ABM variant A1. The center and right subpanels visualize the effect of varying τ for ABM variant A2 for a low (λ = 0.1) and a high (λ = 0.9) learning rate parameter. (B) Model recovery. The left subpanel visualizes the PEP fractions attained by the data…

**Figure 9.** Model evaluation and post-hoc model validation of the symmetric bandit learning case study. (A) Group-cumulative BIC values, corresponding to the sums of the participant-specific BIC values for each ABM variant. Bar colors indicate the results from the experimental data and the simulated post-hoc validation data. Error bars indicate the standard deviation across post-hoc validation simulations. (B) PEPs f…

Original abstract

Integrating theoretical neuroscience, decision theory, and probabilistic inference offers a promising route to understanding human cognition, yet concrete methodological bridges between agentic AI models and behavioral data analysis remain formally underdeveloped. We advance this synthesis under the framework of agentic behavioral modeling (ABM), which treats artificial agents as latent, generative hypotheses about cognitive mechanisms and evaluates them by their statistical adequacy in explaining human behavior. After outlining its conceptual foundations, we apply the framework to two minimal laboratory paradigms: a binary perceptual contrast-discrimination task and a symmetric two-armed bandit learning task. We formalize each task-agent-data system as a joint probability model, derive explicit conditional log-likelihoods for behavioral inference, validate different model variants using model and parameter recovery simulations, and evaluate them in light of empirical data. Using these minimal examples, we provide an agent-centric interpretation of the psychometric function, derive optimal policies for both tasks, and show the equivalence between Rescorla-Wagner learning and Bayesian inference in symmetric bandits. More broadly, this work may serve as a conceptual and practical foundation for applying ABM to cognitive behavioral science.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper formalizes agentic behavioral modeling with joint probabilities and derives a clean equivalence between Rescorla-Wagner and Bayesian updates in symmetric bandits.


The paper introduces agentic behavioral modeling, or ABM, to connect artificial agents with behavioral data in cognitive science. The key result is that in a symmetric two-armed bandit, Rescorla-Wagner learning turns out to be equivalent to Bayesian inference once you set up the joint model properly. They formalize two minimal tasks (a perceptual discrimination task and the bandit task) as full joint probability distributions over task, agent parameters, and observed behavior. From there they derive the conditional log-likelihoods for fitting, run simulations to check model and parameter recovery, and look at some empirical data. They also give optimal policies for each task and reinterpret the psychometric function in agent terms. The equivalence follows directly from the symmetry in the bandit setup.

This approach is useful because it makes the modeling steps explicit and checkable. The recovery simulations add some reassurance that models and parameters are identifiable from data of this kind. It's a step toward treating RL-style agents as generative models for cognition rather than just tools.

The main soft spot is the foundational assumption that these artificial agents are good latent hypotheses for human mechanisms. The paper doesn't explore cases where this might break down or compare it head-to-head with more traditional cognitive models on the same data. The tasks are simple by design, which helps the derivations but means the framework hasn't been stress-tested on richer behaviors yet. The empirical evaluation is described, but without specific results or code it's difficult to gauge how well the models actually fit human data.

This work is for researchers in cognitive modeling and computational neuroscience who are interested in bridging agent-based AI with statistical inference on behavior. It has clear enough structure and verifiable steps that it merits peer review. The equivalence claim in particular is worth having referees check in detail. I'd send it out for review.

Referee Report

0 major / 3 minor

Summary. The paper introduces the Agentic Behavioral Modeling (ABM) framework, which treats artificial agents as latent generative hypotheses about cognitive mechanisms and evaluates them via statistical adequacy in explaining human behavior. It applies the framework to a binary perceptual contrast-discrimination task and a symmetric two-armed bandit task by formalizing each task-agent-data system as a joint probability model, deriving explicit conditional log-likelihoods, validating model variants through model and parameter recovery simulations, and evaluating against empirical data. Key results include an agent-centric interpretation of the psychometric function, optimal policies for both tasks, and the equivalence between Rescorla-Wagner learning and Bayesian inference in symmetric bandits.

Significance. If the derivations hold, this provides a concrete methodological bridge between agentic AI models and behavioral data analysis, advancing integration of theoretical neuroscience, decision theory, and probabilistic inference. The explicit joint-model construction and likelihood derivation supply verifiable machinery for the equivalence claim, while the recovery simulations and empirical evaluation add practical credibility. The framework's emphasis on artificial agents as generative hypotheses offers a foundation that could extend to other paradigms, with the symmetry-based equivalence serving as a clear, falsifiable example.

minor comments (3)

[Abstract] The description of ABM as treating agents as 'latent, generative hypotheses' is repeated in the broader contribution statement; a single, concise definition early in the abstract would reduce redundancy.
[Simulations section] The abstract states that different model variants are validated through model and parameter recovery simulations, but does not specify the exact recovery metrics (e.g., bias, coverage, or confusion matrices); adding these quantitative details would strengthen the validation claim.
[Empirical evaluation] The abstract mentions evaluation 'in light of empirical data' without naming the dataset or the specific adequacy measures (e.g., posterior predictive checks or out-of-sample log-likelihood); including these would make the empirical support more transparent.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our manuscript on Agentic Behavioral Modeling (ABM). The report correctly identifies the framework's core elements: treating artificial agents as latent generative hypotheses, formalizing task-agent-data systems as joint probability models, deriving conditional log-likelihoods, validating via recovery simulations, and demonstrating results such as the agent-centric psychometric function, optimal policies, and the Rescorla-Wagner/Bayesian equivalence in symmetric bandits. We appreciate the recommendation for minor revision and the recognition of the work's potential to bridge agentic AI, theoretical neuroscience, and behavioral data analysis.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained


The paper formalizes each task-agent-data system as an explicit joint probability model, derives conditional log-likelihoods for inference, performs model/parameter recovery simulations, and evaluates against data. The central equivalence between Rescorla-Wagner learning and Bayesian inference is obtained directly from the symmetry condition within this joint model construction; it is an internal mathematical property of the two agent specifications once the model is fixed, not a reduction to fitted parameters, self-definitions, or load-bearing self-citations. The broader ABM premise is presented as the methodological framing rather than an unexamined input required for the equivalence result. No quoted step reduces by construction to its own inputs, and the workflow supplies independent verification machinery (likelihoods, simulations, empirical checks). This aligns with a score of 0 as the most common honest finding for self-contained derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on treating artificial agents as latent generative hypotheses for cognition, which is a domain assumption without independent evidence supplied in the abstract; no free parameters or invented entities beyond the framework itself are detailed.

axioms (1)

domain assumption: Artificial agents can serve as latent, generative hypotheses about cognitive mechanisms that are statistically adequate for explaining human behavior.
This is the foundational premise of the ABM framework stated in the abstract.

invented entities (1)

Agentic Behavioral Modeling (ABM) framework (no independent evidence)
purpose: To treat artificial agents as testable hypotheses linking AI models to behavioral data.
New synthesis introduced to bridge theoretical neuroscience, decision theory, and probabilistic inference.


