3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
Humans and LLMs exhibit similar error patterns in common-sense reasoning, consistent with shared pattern-matching mechanisms rather than abstract world models.
LLMs show representation-dependent performance on mazes and fail to build cumulative spatial world models despite detailed reasoning traces.
Citing papers explorer
- The Inattentional Gap: Task-Conditioned Language and Vision Models Omit the Safety-Critical Signals They Can Otherwise Report
  Task conditioning suppresses reporting of safety-critical signals that the same language and vision models report at higher rates when unconstrained, creating an inattentional gap that decouples benchmark safety from real-world safety.
- Reasoning as Pattern Matching: Shared Mechanisms in Human and LLM Everyday Reasoning
  Humans and LLMs exhibit similar error patterns in common-sense reasoning, consistent with shared pattern-matching mechanisms rather than abstract world models.
- Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks
  LLMs show representation-dependent performance on mazes and fail to build cumulative spatial world models despite detailed reasoning traces.