Evaluating Stochastic Collapse and Implicit Bias in Multimodal Large Language Models

2026 · cs.CL · arXiv 2606.05874

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Current evaluations of Multimodal Large Language Models (MLLMs) overwhelmingly focus on utility-driven objectives, leaving model behavior in logic-neutral scenarios largely unexplored. Stochasticity is essential when multiple actions are equally valid, such as recommending travel itineraries or daily schedules whose options have similar utility. In such settings, deterministic policies can lead to repetitive behavior and reduced coverage of valid alternatives. To bridge this gap, we propose RandomBench, a benchmark designed to evaluate whether MLLMs can maintain distributionally neutral behavior when selecting among equivalent options. We further introduce three metrics, RI, BCI, and BII, to quantify entropy and distributional bias. Experiments reveal a pervasive phenomenon we term Stochastic Collapse, in which MLLMs fail to maintain uniform randomness under explicit instructions to choose randomly: top-1 probabilities reach 97% against the ideal baseline of one quarter (25%), and RI drops to 0.068 for Claude Sonnet 4.6. Extensive ablation studies further demonstrate that these deviations persist across languages and representation formats, highlighting the robustness of distributional collapse in logic-neutral decision settings.
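The abstract does not define RI, BCI, or BII, so the sketch below does not reproduce them. As a rough illustration of the kind of measurement involved, it computes two generic quantities from a list of sampled choices: the top-1 probability and a normalized Shannon entropy (1.0 for a uniform distribution, 0.0 for a fully collapsed one). The function names and the use of normalized entropy are assumptions made for this example, not the paper's method.

```python
import math
from collections import Counter


def choice_distribution(choices, options):
    """Empirical probability of each option in `options` over sampled `choices`.

    Hypothetical helper for illustration; not part of RandomBench.
    """
    if not choices:
        raise ValueError("choices must not be empty")
    unknown = set(choices) - set(options)
    if unknown:
        raise ValueError(f"choices outside the option set: {sorted(unknown)}")
    counts = Counter(choices)
    total = len(choices)
    return [counts[option] / total for option in options]


def normalized_entropy(probs):
    """Shannon entropy divided by log(k), so a uniform distribution scores 1.0.

    This is an assumed stand-in for an entropy-style score, not the paper's RI.
    """
    k = len(probs)
    if k < 2:
        raise ValueError("need at least two options")
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(k)


def top1_probability(probs):
    """Probability mass on the most frequently chosen option."""
    return max(probs)


if __name__ == "__main__":
    options = ["A", "B", "C", "D"]
    # Synthetic example: 97 of 100 samples pick the same option.
    samples = ["A"] * 97 + ["B", "C", "D"]
    probs = choice_distribution(samples, options)
    print("top-1 probability:", top1_probability(probs))
    print("normalized entropy:", round(normalized_entropy(probs), 3))
```

With four equally valid options, an ideal sampler would give a top-1 probability near 0.25 and a normalized entropy near 1.0; the synthetic samples in the usage block are made up to mimic a collapsed distribution and are not data from the paper.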

representative citing papers

Contagion Networks: Evaluator Preference Propagation in Multi-Agent LLM Systems

cs.LG · 2026-06-18 · unverdicted · novelty 6.0

Introduces Contagion Networks framework and measures preference propagation in 3-agent LLM setups, finding architectural priors dominate prompts, topology affects spread, and larger committees reduce contagion by ~69%.

citing papers explorer

Showing 1 of 1 citing paper.

Contagion Networks: Evaluator Preference Propagation in Multi-Agent LLM Systems

cs.LG · 2026-06-18 · unverdicted · none · ref 19 · internal anchor
