The Inattentional Gap: Task-Conditioned Language and Vision Models Omit the Safety-Critical Signals They Can Otherwise Report

Kwan Soo Shin

arxiv: 2606.26529 · v1 · pith:TE22IR6Inew · submitted 2026-06-25 · 💻 cs.CL · cs.AI· cs.CV

The Inattentional Gap: Task-Conditioned Language and Vision Models Omit the Safety-Critical Signals They Can Otherwise Report

Kwan Soo Shin This is my paper

Pith reviewed 2026-06-26 05:22 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.CV

keywords inattentional gaptask conditioningAI safetyinattentional blindnesslanguage modelsvision modelssafety-critical signalsbenchmark decoupling

0 comments

The pith

Conditioning AI models on a narrow task suppresses their reporting of safety-critical signals they report at much higher rates when unconstrained.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that giving language or vision models a specific task causes them to omit co-present safety signals even when those signals are present in the input and the models can report them otherwise. This suppression occurs in radiology text scenarios, driving scenarios, and chest-radiograph vision tasks, appears in every model tested, does not shrink with larger scale, and holds in a reasoning model. The authors name the dissociation the Inattentional Gap and claim it means strong performance on specified hazards in benchmarks does not ensure detection of the unspecified hazards that cause real harm.

Core claim

Task conditioning on language or vision models creates a dissociation in which the models omit safety-critical signals that are co-present in the input yet that the same models report at substantially higher rates when the prompt imposes no narrow task; the authors label this dissociation the Inattentional Gap and argue that it decouples measured benchmark safety from real-world safety because a system can score near-perfectly on the hazards an evaluation specifies while remaining blind to those that cause harm.

What carries the argument

The Inattentional Gap: the observed dissociation in reporting rates between task-conditioned prompts and unconstrained prompts across text and vision inputs.

If this is right

The suppression effect appears in every model tested and does not diminish with scale.
The effect varies more by model family than by model size.
The suppression persists even in a reasoning model.
High scores on specified safety benchmarks do not guarantee awareness of unspecified hazards.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Safety testing may need separate unconstrained reporting checks to catch signals omitted under task focus.
The gap could affect deployment in domains where models receive focused instructions yet must still notice side hazards.
Training methods that preserve reporting of secondary signals under task conditioning could be explored as a direct response.

Load-bearing premise

The unconstrained condition provides a valid baseline for what the model can report without confounds from prompt length, output format, or differing capability across conditions.

What would settle it

An experiment in which models report the safety-critical signals at the same rate under both the narrow task condition and the unconstrained condition, or in which the gap disappears in larger models.

Figures

Figures reproduced from arXiv: 2606.26529 by Kwan Soo Shin.

**Figure 1.** Figure 1: The Inattentional Gap (schematic). Task-conditioning focuses evaluation on the specified target, where report on the specified target appears high, while report of co-present unspecified safety-critical signals is suppressed, decoupling benchmark from real-world safety [PITH_FULL_IMAGE:figures/full_fig_p014_1.png] view at source ↗

**Figure 2.** Figure 2: Study 1 (text): critical-signal report rate by model and condition. [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗

**Figure 4.** Figure 4: Inattentional gap by modality and model. [PITH_FULL_IMAGE:figures/full_fig_p015_4.png] view at source ↗

**Figure 8.** Figure 8: Study 3: output scope, not monotonic load. [PITH_FULL_IMAGE:figures/full_fig_p018_8.png] view at source ↗

read the original abstract

AI safety is evaluated by how reliably a model detects the hazards it is told to find, yet accidents often arise from the hazard no one specified. We show that conditioning a language or vision model on a narrow task suppresses its reporting of co-present, safety-critical signals it can otherwise report, a machine analogue of human inattentional blindness arising from a different mechanism. Across radiology and driving text scenarios and chest-radiograph vision tasks, suppression appeared in every model tested, did not diminish with scale, persisted in a reasoning model, and varied more by model family than by size, while the same models reported these signals at substantially higher rates when unconstrained. We name this dissociation the Inattentional Gap and argue that it decouples measured benchmark safety from real-world safety: a system can score near-perfectly on the hazards an evaluation specifies while remaining blind to those that cause harm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Task conditioning suppresses safety signal reporting in the tested models, but the unconstrained baseline needs checking for prompt confounds before the gap can be treated as a real mechanism.

read the letter

The paper's main finding is that models given a narrow task report fewer co-present safety-critical signals than the same models do when left unconstrained. This shows up in radiology and driving text cases plus chest x-ray vision tasks, holds for every model tested, stays stable with scale, and appears even in a reasoning model. The gap varies more by model family than by size.

What is new is the consistent dissociation across modalities and the claim that it decouples benchmark scores from real-world hazard detection. The authors name it the inattentional gap and note it arises from a different mechanism than the human version.

The work is straightforward in its setup and draws a clear line to evaluation practice in medical imaging and autonomous driving. That part is useful for anyone thinking about how task specification affects what models surface.

The soft spot is the baseline. The abstract compares rates directly, but unconstrained prompts are usually less directive and may differ in length or expected output style. If those factors are not matched, the higher reporting could come from prompt differences rather than removal of task focus. Without methods, data, or controls visible, it is hard to tell how much of the gap is real suppression versus experimental artifact.

This is for readers working on safety benchmarks and multimodal evaluation. It raises a point worth testing, but the current evidence is too thin to rely on. It deserves peer review so the full experimental details can be examined and the controls verified.

Referee Report

1 major / 0 minor

Summary. The paper claims that conditioning language or vision models on a narrow task suppresses reporting of co-present safety-critical signals that the same models can report when unconstrained, naming this dissociation the 'Inattentional Gap' as a machine analogue of inattentional blindness. The effect is reported across radiology and driving text scenarios plus chest-radiograph vision tasks, appearing in every model tested, independent of scale, persistent in a reasoning model, and varying more by model family than size; the authors argue this decouples benchmark safety from real-world safety.

Significance. If the empirical dissociation holds after controls for prompt confounds, the result would be significant for AI safety evaluation: it supplies concrete evidence that near-perfect performance on specified hazards need not imply detection of unspecified ones, with direct implications for deployment in safety-critical domains such as medicine and autonomous driving.

major comments (1)

[Abstract] Abstract: the central claim that models 'can otherwise report' the signals rests on substantially higher reporting rates in the unconstrained condition. The manuscript provides no indication that unconstrained prompts were length-matched, format-matched, or otherwise equated to the task-conditioned prompts; without such controls the elevated rates could arise from differences in prompt directiveness or output expectations rather than removal of task focus, directly undermining the evidence for a task-induced suppression mechanism.

Simulated Author's Rebuttal

1 responses · 0 unresolved

Thank you for the constructive feedback. We address the referee's major comment on prompt matching below and will revise the manuscript to strengthen the evidence.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that models 'can otherwise report' the signals rests on substantially higher reporting rates in the unconstrained condition. The manuscript provides no indication that unconstrained prompts were length-matched, format-matched, or otherwise equated to the task-conditioned prompts; without such controls the elevated rates could arise from differences in prompt directiveness or output expectations rather than removal of task focus, directly undermining the evidence for a task-induced suppression mechanism.

Authors: We agree that the manuscript does not explicitly report length, format, or directiveness matching between conditions, which leaves open the possibility of prompt confounds. In revision we will add a dedicated methods subsection documenting prompt construction, including token-length statistics and structural equivalence, and will include new matched-prompt experiments (or re-analysis of existing data where feasible) confirming that the reporting-rate dissociation persists under these controls. This directly addresses the concern and bolsters the claim that suppression arises from task conditioning rather than prompt differences. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical comparisons stand independently

full rationale

The paper reports experimental measurements of reporting rates for safety-critical signals under task-conditioned vs unconstrained prompts across multiple models and modalities. No equations, parameter fits, or derivations appear; the central claim is a direct comparison of observed frequencies rather than any reduction to self-defined quantities or self-citation chains. The unconstrained baseline is an experimental control whose validity can be debated on methodological grounds but does not constitute circularity by construction. Self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no specific free parameters, axioms, or invented entities can be extracted from methods or derivations.

pith-pipeline@v0.9.1-grok · 5684 in / 1128 out tokens · 20663 ms · 2026-06-26T05:22:40.787012+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

54 extracted references · 46 canonical work pages

[1]

Simons, D. J. & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074. DOI: 10.1068/p281059

work page doi:10.1068/p281059 1999
[2]

Drew, T., Võ, M. L.-H. & Wolfe, J. M. (2013). The invisible gorilla strikes again: Sustained inattentional blindness in expert observers. Psychological Science, 24(9), 1848–1853. DOI: 10.1177/0956797613479386

work page doi:10.1177/0956797613479386 2013
[3]

& Rock, I

Mack, A. & Rock, I. (1998). Inattentional Blindness. MIT Press. DOI: 10.7551/mit- press/3707.001.0001

work page doi:10.7551/mit- 1998
[4]

and Schulz, E

Binz, M. & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. PNAS, 120(6), e2218523120. DOI: 10.1073/pnas.2218523120

work page doi:10.1073/pnas.2218523120 2023
[5]

URL: https://doi.org/10.1038/s43588-023-00527-x

Hagendorff, T., Fabi, S. & Kosinski, M. (2023). Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nature Computational Science, 3(10), 833–838. DOI: 10.1038/s43588-023-00527-x

work page doi:10.1038/s43588-023-00527-x 2023
[6]

IRASim: A fine-grained world model for robot manipulation,

Dahou, Y., Huynh, N. D., Le-Khac, P. H., Para, W. R., Singh, A. & Narayan, S. (2025). Vision-language models can’t see the obvious (SalBench). ICCV 2025. DOI: 10.1109/ICCV51701.2025.02239. arXiv:2507.04741

work page doi:10.1109/iccv51701.2025.02239 2025
[7]

S., Malhotra, G., Dujmović, M., et al

Bowers, J. S., Malhotra, G., Dujmović, M., et al. (2023). Deep problems with neu- ral network models of human vision. Behavioral and Brain Sciences, 46, e385. DOI: 10.1017/S0140525X22002813. arXiv:2312.05355

work page doi:10.1017/s0140525x22002813 2023
[8]

doi: 10.1017/S0140525X22002849

Quilty-Dunn, J., Porot, N. & Mandelbaum, E. (2023). The best game in town: The re- emergence of the language-of-thought hypothesis across the cognitive sciences. Behavioral and Brain Sciences, 46, e261. DOI: 10.1017/S0140525X22002849

work page doi:10.1017/s0140525x22002849 2023
[9]

B., Scholl, B

Most, S. B., Scholl, B. J., Clifford, E. R. & Simons, D. J. (2005). What you see is what you set: Sustained inattentional blindness and the capture of awareness. Psychological Review, 112(1), 217–242. DOI: 10.1037/0033-295X.112.1.217

work page doi:10.1037/0033-295x.112.1.217 2005
[10]

invisible gorilla

Zamir, E. (2014). The bias of the question posed: A diagnostic “invisible gorilla.” Diagnosis, 1(3), 245-248. DOI: 10.1515/dx-2014-0017

work page doi:10.1515/dx-2014-0017 2014
[11]

K., Ouyang, B., Kocak, M., Bhabad, S., Bleck, T

Garg, R. K., Ouyang, B., Kocak, M., Bhabad, S., Bleck, T. P. & Jhaveri, M. (2022). Inat- tentional blindness to DWI lesions in spontaneous intracerebral hemorrhage. Neurological Sciences, 43(7), 4355-4361. DOI: 10.1007/s10072-022-05992-2

work page doi:10.1007/s10072-022-05992-2 2022
[12]

M., Ding, Y., Xie, G

Hults, C. M., Ding, Y., Xie, G. G., Raja, R., Johnson, W., Lee, A. & Simons, D. J. (2024). Inattentional blindness in medicine. Cognitive Research: Principles and Implications, 9(1), 18. DOI: 10.1186/s41235-024-00537-x

work page doi:10.1186/s41235-024-00537-x 2024
[13]

& Becklen, R

Neisser, U. & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7(4), 480–494. DOI: 10.1016/0010-0285(75)90019-5

work page doi:10.1016/0010-0285(75)90019-5 1975
[14]

S., Franken, E

Berbaum, K. S., Franken, E. A., Dorfman, D. D., Rooholamini, S. A., Kathol, M. H., Barloon, T. J., Behlke, F. M., Sato, Y., Lu, C. H., el-Khoury, G. Y., Flickinger, F. W. & Montgomery, W. J. (1990). Satisfaction of search in diagnostic radiology. Invest. Radiol. 25, 133–140. DOI: 10.1097/00004424-199002000-00006

work page doi:10.1097/00004424-199002000-00006 1990
[15]

S., Franken, E

Berbaum, K. S., Franken, E. A., Caldwell, R. T. & Schartz, K. M. (2001). Gaze dwell times on acute trauma injuries missed because of satisfaction of search. Acad. Radiol. 8, 304–314. DOI: 10.1016/S1076-6332(03)80499-3

work page doi:10.1016/s1076-6332(03)80499-3 2001
[16]

satisfaction of search

Adamo, S. H., Gereke, B. J., Shomstein, S. & Schmidt, J. (2021). From “satisfaction of search” to “subsequent search misses”: A review of multiple-target search errors across radiology and cognitive science. Cogn. Res. Princ. Implic. 6, 59. DOI: 10.1186/s41235-021-00318-w

work page doi:10.1186/s41235-021-00318-w 2021
[17]

H., Carrigan, A

Williams, L. H., Carrigan, A. J., Auffermann, W. F., Mills, M. K., Rich, A. N., Elmore, J. G. & Drew, T. (2021). The invisible breast cancer: Experience does not protect against 19 inattentional blindness to clinically relevant findings in radiology. Psychonomic Bulletin & Review, 28(2), 503–511. DOI: 10.3758/s13423-020-01826-4

work page doi:10.3758/s13423-020-01826-4 2021
[18]

Robson, S. G. & Tangen, J. M. (2023). The invisible 800-pound gorilla: Expertise can increase inattentional blindness. Cogn. Res. Princ. Implic. 8, 33. DOI: 10.1186/s41235-023-00486-x

work page doi:10.1186/s41235-023-00486-x 2023
[19]

Lake, Tomer D

Lake, B. M., Ullman, T. D., Tenenbaum, J. B. & Gershman, S. J. (2017). Building ma- chines that learn and think like people. Behavioral and Brain Sciences, 40, e253. DOI: 10.1017/S0140525X16001837

work page doi:10.1017/s0140525x16001837 2017
[20]

Hagendorff, T., Dasgupta, I., Binz, M., Chan, S. C. Y., Lampinen, A., Wang, J. X., Akata, Z. & Schulz, E. (2023). Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods. arXiv:2303.13988

arXiv 2023
[21]

K., Dasgupta, I., Chan, S

Lampinen, A. K., Dasgupta, I., Chan, S. C. Y., Sheahan, H. R., Creswell, A., Kumaran, D., McClelland, J. L. & Hill, F. (2024). Language models, like humans, show content effects on reasoning tasks. PNAS Nexus 3, pgae233. DOI: 10.1093/pnasnexus/pgae233

work page doi:10.1093/pnasnexus/pgae233 2024
[22]

Ivanova, Idan A

Mahowald, K., Ivanova, A. A., Blank, I. A., Kanwisher, N., Tenenbaum, J. B. & Fedorenko, E. (2024). Dissociating language and thought in large language models. Trends in Cognitive Sciences, 28(6), 517–540. DOI: 10.1016/j.tics.2024.01.011

work page doi:10.1016/j.tics.2024.01.011 2024
[23]

doi: 10.1016/j.tics.2024.02.008

Yildirim, I. & Paul, L. A. (2024). From task structures to world models: What do LLMs know? Trends Cogn. Sci. 28, 404–415. DOI: 10.1016/j.tics.2024.02.008

work page doi:10.1016/j.tics.2024.02.008 2024
[24]

Webb, T., Holyoak, K. J. & Lu, H. (2023). Emergent analogical reasoning in large language models. Nat. Hum. Behav. 7, 1526–1541. DOI: 10.1038/s41562-023-01659-w

work page doi:10.1038/s41562-023-01659-w 2023
[25]

Block, N. (2011). Perceptual consciousness overflows cognitive access. Trends in Cognitive Sciences, 15(12), 567–575. DOI: 10.1016/j.tics.2011.11.001

work page doi:10.1016/j.tics.2011.11.001 2011
[26]

A., Dennett, D

Cohen, M. A., Dennett, D. C. & Kanwisher, N. (2016). What is the bandwidth of perceptual experience? Trends Cogn. Sci. 20, 324–335. DOI: 10.1016/j.tics.2016.03.006

work page doi:10.1016/j.tics.2016.03.006 2016
[27]

van Amsterdam, W. A. C., van Geloven, N., Krijthe, J. H., Ranganath, R. & Cinà, G. (2025). When accurate prediction models yield harmful self-fulfilling prophecies. Patterns, 6(4), 101229. DOI: 10.1016/j.patter.2025.101229

work page doi:10.1016/j.patter.2025.101229 2025
[28]

Park, Simon Goldstein, Aidan O'Gara, Michael Chen, and Dan Hendrycks

Park, P. S., Goldstein, S., O’Gara, A., Chen, M. & Hendrycks, D. (2024). AI decep- tion: A survey of examples, risks, and potential solutions. Patterns 5, 100988. DOI: 10.1016/j.patter.2024.100988

work page doi:10.1016/j.patter.2024.100988 2024
[29]

M., Nguyen, C

Sanchez, M., Alford, K., Krishna, V., Huynh, T. M., Nguyen, C. D. T., Lungren, M. P., Truong, S. Q. H. & Rajpurkar, P. (2023). AI-clinician collaboration via disagreement prediction: A decision pipeline and retrospective analysis of real-world radiologist-AI interactions. Cell Reports Medicine, 4(10), 101207. DOI: 10.1016/j.xcrm.2023.101207

work page doi:10.1016/j.xcrm.2023.101207 2023
[30]

Rahmanzadehgervi, P., Bolton, L., Taesiri, M. R. & Nguyen, A. T. (2024). Vision language models are blind. arXiv:2407.06581 (ACCV 2024)

arXiv 2024
[31]

Zhang, Z

Tong, S., Liu, Z., Zhai, Y., Ma, Y., LeCun, Y. & Xie, S. (2024). Eyes wide shut? Explor- ing the visual shortcomings of multimodal LLMs. Proc. IEEE/CVF CVPR 2024. DOI: 10.1109/CVPR52733.2024.00914. arXiv:2401.06209

work page doi:10.1109/cvpr52733.2024.00914 2024
[32]

Li, Y., Du, Y., Zhou, K., Wang, J., Zhao, W. X. & Wen, J.-R. (2023). Evaluating object hallucination in large vision-language models. Proc. EMNLP 2023. arXiv:2305.10355

Pith/arXiv arXiv 2023
[33]

Y., Shrivastava, A., Moore, J., West, P., Tan, C

Fu, H. Y., Shrivastava, A., Moore, J., West, P., Tan, C. & Holtzman, A. (2025). AbsenceBench: Language models can’t tell what’s missing. arXiv:2506.11440

arXiv 2025
[34]

Wichmann

Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M. & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nat. Mach. Intell. 2, 665–673. DOI: 10.1038/s42256-020-00257-z

work page doi:10.1038/s42256-020-00257-z 2020
[35]

& Mané, D

Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J. & Mané, D. (2016). Concrete problems in AI safety. arXiv:1606.06565. 20

Pith/arXiv arXiv 2016
[36]

Proceedings of the AAAI Conference on Artificial Intelligence , author=

Booth, S., Knox, W. B., Shah, J., Niekum, S., Stone, P. & Allievi, A. (2023). The perils of trial-and-error reward design: Misdesign through overfitting and invalid task specifications. Proc. AAAI Conf. Artif. Intell. 37, 5920–5929. DOI: 10.1609/aaai.v37i5.25733

work page doi:10.1609/aaai.v37i5.25733 2023
[37]

Oakden-Rayner, L., Dunnmon, J., Carneiro, G. & Ré, C. (2020). Hidden stratification causes clinically meaningful failures in medical imaging deep learning. Proc. ACM Conf. Health Inference Learn. (CHIL) 151–159. DOI: 10.1145/3368555.3384468

work page doi:10.1145/3368555.3384468 2020
[38]

Wu, E., Wu, K., Daneshjou, R., Ouyang, D., Ho, D. E. & Zou, J. (2021). How medical AI devices are evaluated: Limitations and recommendations from an analysis of FDA approvals. Nat. Med. 27, 582–584. DOI: 10.1038/s41591-021-01312-x

work page doi:10.1038/s41591-021-01312-x 2021
[39]

N., Kaiser, L

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L. & Polosukhin, I. (2017). Attention is all you need. NeurIPS 2017. arXiv:1706.03762

Pith/arXiv arXiv 2017
[40]

Evans, J. St. B. T. & Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspect. Psychol. Sci. 8, 223–241. DOI: 10.1177/1745691612460685

work page doi:10.1177/1745691612460685 2013
[41]

& Koch, C

Itti, L. & Koch, C. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254–1259. DOI: 10.1109/34.730558

work page doi:10.1109/34.730558 1998
[42]

Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. DOI: 10.1016/0005- 1098(83)90046-8

work page doi:10.1016/0005- 1983
[43]

Human Factors 39, 230–253

Parasuraman, R. & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230–253. DOI: 10.1518/001872097778543886

work page doi:10.1518/001872097778543886 1997
[44]

& Manzey, D

Parasuraman, R. & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Hum. Factors 52, 381–410. DOI: 10.1177/0018720810376055

work page doi:10.1177/0018720810376055 2010
[45]

T., Peterson, S

Dzindolet, M. T., Peterson, S. A., Pomranky, R. A., Pierce, L. G. & Beck, H. P. (2003). The role of trust in automation reliance. Int. J. Hum.-Comput. Stud. 58, 697–718. DOI: 10.1016/S1071-5819(03)00038-7

work page doi:10.1016/s1071-5819(03)00038-7 2003
[46]

& Wyatt, J

Goddard, K., Roudsari, A. & Wyatt, J. C. (2012). Automation bias: A systematic review of frequency, effect mediators, and mitigators. J. Am. Med. Inform. Assoc. 19, 121–127. DOI: 10.1136/amiajnl-2011-000089

work page doi:10.1136/amiajnl-2011-000089 2012
[47]

Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux

2011
[48]

& Papakyriakopoulos, O

Yu, C., Engelmann, S., Cao, R., Ali, D. & Papakyriakopoulos, O. (2026). How should AI safety benchmarks benchmark safety? arXiv:2601.23112. DOI: 10.48550/arXiv.2601.23112

work page doi:10.48550/arxiv.2601.23112 2026
[49]

Torres, I., & Evals-Consensus.AI Consortium. (2026). A call to join a collective effort on AI evaluation. Patterns, 7(3), 101512. DOI: 10.1016/j.patter.2026.101512

work page doi:10.1016/j.patter.2026.101512 2026
[50]

L., Nodine, C

Kundel, H. L., Nodine, C. F. & Carmody, D. (1978). Visual scanning, pattern recognition and decision-making in pulmonary nodule detection. Invest. Radiol. 13, 175–181. DOI: 10.1097/00004424-197805000-00001

work page doi:10.1097/00004424-197805000-00001 1978
[51]

Krupinski, E. A. (2010). Current perspectives in medical image perception. Atten. Percept. Psychophys. 72, 1205–1217. DOI: 10.3758/APP.72.5.1205

work page doi:10.3758/app.72.5.1205 2010
[52]

& Navalesi, P

de Cassai, A., Negro, S., Geraldini, F., Boscolo, A., Sella, N., Munari, M. & Navalesi, P. (2021). Inattentional blindness in anesthesiology: A gorilla is worth one thousand words. PLoS ONE 16, e0257508. DOI: 10.1371/journal.pone.0257508

work page doi:10.1371/journal.pone.0257508 2021
[53]

Roadsaw: A large-scale dataset for camera- based road surface and wetness estimation,

Bogdoll, D., Nitsche, M. & Zöllner, J. M. (2022). Anomaly detection in autonomous driving: A survey. Proc. IEEE/CVF CVPR Workshops 2022. DOI: 10.1109/CVPRW56347.2022.00495. arXiv:2204.07974

work page doi:10.1109/cvprw56347.2022.00495 2022
[54]

& Rottmann, M

Chan, R., Lis, K., Uhlemeyer, S., Blum, H., Honari, S., Siegwart, R., Fua, P., Salzmann, M. & Rottmann, M. (2021). SegmentMeIfYouCan: A benchmark for anomaly segmentation. Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track. arXiv:2104.14812. 21 The complete annotated bibliography of 116 works, each carrying a finding...

arXiv 2021

[1] [1]

Simons, D. J. & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074. DOI: 10.1068/p281059

work page doi:10.1068/p281059 1999

[2] [2]

Drew, T., Võ, M. L.-H. & Wolfe, J. M. (2013). The invisible gorilla strikes again: Sustained inattentional blindness in expert observers. Psychological Science, 24(9), 1848–1853. DOI: 10.1177/0956797613479386

work page doi:10.1177/0956797613479386 2013

[3] [3]

& Rock, I

Mack, A. & Rock, I. (1998). Inattentional Blindness. MIT Press. DOI: 10.7551/mit- press/3707.001.0001

work page doi:10.7551/mit- 1998

[4] [4]

and Schulz, E

Binz, M. & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. PNAS, 120(6), e2218523120. DOI: 10.1073/pnas.2218523120

work page doi:10.1073/pnas.2218523120 2023

[5] [5]

URL: https://doi.org/10.1038/s43588-023-00527-x

Hagendorff, T., Fabi, S. & Kosinski, M. (2023). Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nature Computational Science, 3(10), 833–838. DOI: 10.1038/s43588-023-00527-x

work page doi:10.1038/s43588-023-00527-x 2023

[6] [6]

IRASim: A fine-grained world model for robot manipulation,

Dahou, Y., Huynh, N. D., Le-Khac, P. H., Para, W. R., Singh, A. & Narayan, S. (2025). Vision-language models can’t see the obvious (SalBench). ICCV 2025. DOI: 10.1109/ICCV51701.2025.02239. arXiv:2507.04741

work page doi:10.1109/iccv51701.2025.02239 2025

[7] [7]

S., Malhotra, G., Dujmović, M., et al

Bowers, J. S., Malhotra, G., Dujmović, M., et al. (2023). Deep problems with neu- ral network models of human vision. Behavioral and Brain Sciences, 46, e385. DOI: 10.1017/S0140525X22002813. arXiv:2312.05355

work page doi:10.1017/s0140525x22002813 2023

[8] [8]

doi: 10.1017/S0140525X22002849

Quilty-Dunn, J., Porot, N. & Mandelbaum, E. (2023). The best game in town: The re- emergence of the language-of-thought hypothesis across the cognitive sciences. Behavioral and Brain Sciences, 46, e261. DOI: 10.1017/S0140525X22002849

work page doi:10.1017/s0140525x22002849 2023

[9] [9]

B., Scholl, B

Most, S. B., Scholl, B. J., Clifford, E. R. & Simons, D. J. (2005). What you see is what you set: Sustained inattentional blindness and the capture of awareness. Psychological Review, 112(1), 217–242. DOI: 10.1037/0033-295X.112.1.217

work page doi:10.1037/0033-295x.112.1.217 2005

[10] [10]

invisible gorilla

Zamir, E. (2014). The bias of the question posed: A diagnostic “invisible gorilla.” Diagnosis, 1(3), 245-248. DOI: 10.1515/dx-2014-0017

work page doi:10.1515/dx-2014-0017 2014

[11] [11]

K., Ouyang, B., Kocak, M., Bhabad, S., Bleck, T

Garg, R. K., Ouyang, B., Kocak, M., Bhabad, S., Bleck, T. P. & Jhaveri, M. (2022). Inat- tentional blindness to DWI lesions in spontaneous intracerebral hemorrhage. Neurological Sciences, 43(7), 4355-4361. DOI: 10.1007/s10072-022-05992-2

work page doi:10.1007/s10072-022-05992-2 2022

[12] [12]

M., Ding, Y., Xie, G

Hults, C. M., Ding, Y., Xie, G. G., Raja, R., Johnson, W., Lee, A. & Simons, D. J. (2024). Inattentional blindness in medicine. Cognitive Research: Principles and Implications, 9(1), 18. DOI: 10.1186/s41235-024-00537-x

work page doi:10.1186/s41235-024-00537-x 2024

[13] [13]

& Becklen, R

Neisser, U. & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7(4), 480–494. DOI: 10.1016/0010-0285(75)90019-5

work page doi:10.1016/0010-0285(75)90019-5 1975

[14] [14]

S., Franken, E

Berbaum, K. S., Franken, E. A., Dorfman, D. D., Rooholamini, S. A., Kathol, M. H., Barloon, T. J., Behlke, F. M., Sato, Y., Lu, C. H., el-Khoury, G. Y., Flickinger, F. W. & Montgomery, W. J. (1990). Satisfaction of search in diagnostic radiology. Invest. Radiol. 25, 133–140. DOI: 10.1097/00004424-199002000-00006

work page doi:10.1097/00004424-199002000-00006 1990

[15] [15]

S., Franken, E

Berbaum, K. S., Franken, E. A., Caldwell, R. T. & Schartz, K. M. (2001). Gaze dwell times on acute trauma injuries missed because of satisfaction of search. Acad. Radiol. 8, 304–314. DOI: 10.1016/S1076-6332(03)80499-3

work page doi:10.1016/s1076-6332(03)80499-3 2001

[16] [16]

satisfaction of search

Adamo, S. H., Gereke, B. J., Shomstein, S. & Schmidt, J. (2021). From “satisfaction of search” to “subsequent search misses”: A review of multiple-target search errors across radiology and cognitive science. Cogn. Res. Princ. Implic. 6, 59. DOI: 10.1186/s41235-021-00318-w

work page doi:10.1186/s41235-021-00318-w 2021

[17] [17]

H., Carrigan, A

Williams, L. H., Carrigan, A. J., Auffermann, W. F., Mills, M. K., Rich, A. N., Elmore, J. G. & Drew, T. (2021). The invisible breast cancer: Experience does not protect against 19 inattentional blindness to clinically relevant findings in radiology. Psychonomic Bulletin & Review, 28(2), 503–511. DOI: 10.3758/s13423-020-01826-4

work page doi:10.3758/s13423-020-01826-4 2021

[18] [18]

Robson, S. G. & Tangen, J. M. (2023). The invisible 800-pound gorilla: Expertise can increase inattentional blindness. Cogn. Res. Princ. Implic. 8, 33. DOI: 10.1186/s41235-023-00486-x

work page doi:10.1186/s41235-023-00486-x 2023

[19] [19]

Lake, Tomer D

Lake, B. M., Ullman, T. D., Tenenbaum, J. B. & Gershman, S. J. (2017). Building ma- chines that learn and think like people. Behavioral and Brain Sciences, 40, e253. DOI: 10.1017/S0140525X16001837

work page doi:10.1017/s0140525x16001837 2017

[20] [20]

Hagendorff, T., Dasgupta, I., Binz, M., Chan, S. C. Y., Lampinen, A., Wang, J. X., Akata, Z. & Schulz, E. (2023). Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods. arXiv:2303.13988

arXiv 2023

[21] [21]

K., Dasgupta, I., Chan, S

Lampinen, A. K., Dasgupta, I., Chan, S. C. Y., Sheahan, H. R., Creswell, A., Kumaran, D., McClelland, J. L. & Hill, F. (2024). Language models, like humans, show content effects on reasoning tasks. PNAS Nexus 3, pgae233. DOI: 10.1093/pnasnexus/pgae233

work page doi:10.1093/pnasnexus/pgae233 2024

[22] [22]

Ivanova, Idan A

Mahowald, K., Ivanova, A. A., Blank, I. A., Kanwisher, N., Tenenbaum, J. B. & Fedorenko, E. (2024). Dissociating language and thought in large language models. Trends in Cognitive Sciences, 28(6), 517–540. DOI: 10.1016/j.tics.2024.01.011

work page doi:10.1016/j.tics.2024.01.011 2024

[23] [23]

doi: 10.1016/j.tics.2024.02.008

Yildirim, I. & Paul, L. A. (2024). From task structures to world models: What do LLMs know? Trends Cogn. Sci. 28, 404–415. DOI: 10.1016/j.tics.2024.02.008

work page doi:10.1016/j.tics.2024.02.008 2024

[24] [24]

Webb, T., Holyoak, K. J. & Lu, H. (2023). Emergent analogical reasoning in large language models. Nat. Hum. Behav. 7, 1526–1541. DOI: 10.1038/s41562-023-01659-w

work page doi:10.1038/s41562-023-01659-w 2023

[25] [25]

Block, N. (2011). Perceptual consciousness overflows cognitive access. Trends in Cognitive Sciences, 15(12), 567–575. DOI: 10.1016/j.tics.2011.11.001

work page doi:10.1016/j.tics.2011.11.001 2011

[26] [26]

A., Dennett, D

Cohen, M. A., Dennett, D. C. & Kanwisher, N. (2016). What is the bandwidth of perceptual experience? Trends Cogn. Sci. 20, 324–335. DOI: 10.1016/j.tics.2016.03.006

work page doi:10.1016/j.tics.2016.03.006 2016

[27] [27]

van Amsterdam, W. A. C., van Geloven, N., Krijthe, J. H., Ranganath, R. & Cinà, G. (2025). When accurate prediction models yield harmful self-fulfilling prophecies. Patterns, 6(4), 101229. DOI: 10.1016/j.patter.2025.101229

work page doi:10.1016/j.patter.2025.101229 2025

[28] [28]

Park, Simon Goldstein, Aidan O'Gara, Michael Chen, and Dan Hendrycks

Park, P. S., Goldstein, S., O’Gara, A., Chen, M. & Hendrycks, D. (2024). AI decep- tion: A survey of examples, risks, and potential solutions. Patterns 5, 100988. DOI: 10.1016/j.patter.2024.100988

work page doi:10.1016/j.patter.2024.100988 2024

[29] [29]

M., Nguyen, C

Sanchez, M., Alford, K., Krishna, V., Huynh, T. M., Nguyen, C. D. T., Lungren, M. P., Truong, S. Q. H. & Rajpurkar, P. (2023). AI-clinician collaboration via disagreement prediction: A decision pipeline and retrospective analysis of real-world radiologist-AI interactions. Cell Reports Medicine, 4(10), 101207. DOI: 10.1016/j.xcrm.2023.101207

work page doi:10.1016/j.xcrm.2023.101207 2023

[30] [30]

Rahmanzadehgervi, P., Bolton, L., Taesiri, M. R. & Nguyen, A. T. (2024). Vision language models are blind. arXiv:2407.06581 (ACCV 2024)

arXiv 2024

[31] [31]

Zhang, Z

Tong, S., Liu, Z., Zhai, Y., Ma, Y., LeCun, Y. & Xie, S. (2024). Eyes wide shut? Explor- ing the visual shortcomings of multimodal LLMs. Proc. IEEE/CVF CVPR 2024. DOI: 10.1109/CVPR52733.2024.00914. arXiv:2401.06209

work page doi:10.1109/cvpr52733.2024.00914 2024

[32] [32]

Li, Y., Du, Y., Zhou, K., Wang, J., Zhao, W. X. & Wen, J.-R. (2023). Evaluating object hallucination in large vision-language models. Proc. EMNLP 2023. arXiv:2305.10355

Pith/arXiv arXiv 2023

[33] [33]

Y., Shrivastava, A., Moore, J., West, P., Tan, C

Fu, H. Y., Shrivastava, A., Moore, J., West, P., Tan, C. & Holtzman, A. (2025). AbsenceBench: Language models can’t tell what’s missing. arXiv:2506.11440

arXiv 2025

[34] [34]

Wichmann

Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M. & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nat. Mach. Intell. 2, 665–673. DOI: 10.1038/s42256-020-00257-z

work page doi:10.1038/s42256-020-00257-z 2020

[35] [35]

& Mané, D

Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J. & Mané, D. (2016). Concrete problems in AI safety. arXiv:1606.06565. 20

Pith/arXiv arXiv 2016

[36] [36]

Proceedings of the AAAI Conference on Artificial Intelligence , author=

Booth, S., Knox, W. B., Shah, J., Niekum, S., Stone, P. & Allievi, A. (2023). The perils of trial-and-error reward design: Misdesign through overfitting and invalid task specifications. Proc. AAAI Conf. Artif. Intell. 37, 5920–5929. DOI: 10.1609/aaai.v37i5.25733

work page doi:10.1609/aaai.v37i5.25733 2023

[37] [37]

Oakden-Rayner, L., Dunnmon, J., Carneiro, G. & Ré, C. (2020). Hidden stratification causes clinically meaningful failures in medical imaging deep learning. Proc. ACM Conf. Health Inference Learn. (CHIL) 151–159. DOI: 10.1145/3368555.3384468

work page doi:10.1145/3368555.3384468 2020

[38] [38]

Wu, E., Wu, K., Daneshjou, R., Ouyang, D., Ho, D. E. & Zou, J. (2021). How medical AI devices are evaluated: Limitations and recommendations from an analysis of FDA approvals. Nat. Med. 27, 582–584. DOI: 10.1038/s41591-021-01312-x

work page doi:10.1038/s41591-021-01312-x 2021

[39] [39]

N., Kaiser, L

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L. & Polosukhin, I. (2017). Attention is all you need. NeurIPS 2017. arXiv:1706.03762

Pith/arXiv arXiv 2017

[40] [40]

Evans, J. St. B. T. & Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspect. Psychol. Sci. 8, 223–241. DOI: 10.1177/1745691612460685

work page doi:10.1177/1745691612460685 2013

[41] [41]

& Koch, C

Itti, L. & Koch, C. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254–1259. DOI: 10.1109/34.730558

work page doi:10.1109/34.730558 1998

[42] [42]

Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. DOI: 10.1016/0005- 1098(83)90046-8

work page doi:10.1016/0005- 1983

[43] [43]

Human Factors 39, 230–253

Parasuraman, R. & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230–253. DOI: 10.1518/001872097778543886

work page doi:10.1518/001872097778543886 1997

[44] [44]

& Manzey, D

Parasuraman, R. & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Hum. Factors 52, 381–410. DOI: 10.1177/0018720810376055

work page doi:10.1177/0018720810376055 2010

[45] [45]

T., Peterson, S

Dzindolet, M. T., Peterson, S. A., Pomranky, R. A., Pierce, L. G. & Beck, H. P. (2003). The role of trust in automation reliance. Int. J. Hum.-Comput. Stud. 58, 697–718. DOI: 10.1016/S1071-5819(03)00038-7

work page doi:10.1016/s1071-5819(03)00038-7 2003

[46] [46]

& Wyatt, J

Goddard, K., Roudsari, A. & Wyatt, J. C. (2012). Automation bias: A systematic review of frequency, effect mediators, and mitigators. J. Am. Med. Inform. Assoc. 19, 121–127. DOI: 10.1136/amiajnl-2011-000089

work page doi:10.1136/amiajnl-2011-000089 2012

[47] [47]

Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux

2011

[48] [48]

& Papakyriakopoulos, O

Yu, C., Engelmann, S., Cao, R., Ali, D. & Papakyriakopoulos, O. (2026). How should AI safety benchmarks benchmark safety? arXiv:2601.23112. DOI: 10.48550/arXiv.2601.23112

work page doi:10.48550/arxiv.2601.23112 2026

[49] [49]

Torres, I., & Evals-Consensus.AI Consortium. (2026). A call to join a collective effort on AI evaluation. Patterns, 7(3), 101512. DOI: 10.1016/j.patter.2026.101512

work page doi:10.1016/j.patter.2026.101512 2026

[50] [50]

L., Nodine, C

Kundel, H. L., Nodine, C. F. & Carmody, D. (1978). Visual scanning, pattern recognition and decision-making in pulmonary nodule detection. Invest. Radiol. 13, 175–181. DOI: 10.1097/00004424-197805000-00001

work page doi:10.1097/00004424-197805000-00001 1978

[51] [51]

Krupinski, E. A. (2010). Current perspectives in medical image perception. Atten. Percept. Psychophys. 72, 1205–1217. DOI: 10.3758/APP.72.5.1205

work page doi:10.3758/app.72.5.1205 2010

[52] [52]

& Navalesi, P

de Cassai, A., Negro, S., Geraldini, F., Boscolo, A., Sella, N., Munari, M. & Navalesi, P. (2021). Inattentional blindness in anesthesiology: A gorilla is worth one thousand words. PLoS ONE 16, e0257508. DOI: 10.1371/journal.pone.0257508

work page doi:10.1371/journal.pone.0257508 2021

[53] [53]

Roadsaw: A large-scale dataset for camera- based road surface and wetness estimation,

Bogdoll, D., Nitsche, M. & Zöllner, J. M. (2022). Anomaly detection in autonomous driving: A survey. Proc. IEEE/CVF CVPR Workshops 2022. DOI: 10.1109/CVPRW56347.2022.00495. arXiv:2204.07974

work page doi:10.1109/cvprw56347.2022.00495 2022

[54] [54]

& Rottmann, M

Chan, R., Lis, K., Uhlemeyer, S., Blum, H., Honari, S., Siegwart, R., Fua, P., Salzmann, M. & Rottmann, M. (2021). SegmentMeIfYouCan: A benchmark for anomaly segmentation. Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track. arXiv:2104.14812. 21 The complete annotated bibliography of 116 works, each carrying a finding...

arXiv 2021