pith. machine review for the scientific record.

See what you are told: Visual attention sink in large multimodal models

6 Pith papers cite this work. Polarity classification is still in progress.

6 Pith papers citing it

years: 2026 (6)

verdicts: unverdicted (6)

representative citing papers

Large Vision-Language Models Get Lost in Attention

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

In LVLMs, attention weights can be replaced by random Gaussian weights with little or no performance loss, indicating that current models get lost in attention rather than making efficient use of visual context.

Counting to Four is still a Chore for VLMs

cs.CV · 2026-04-11 · unverdicted · novelty 6.0

VLMs fail at counting because visual evidence degrades in later language layers; a lightweight Modality Attention Share intervention can encourage better use of image information during answer generation.
