pith. sign in

Evaluating object hallucination in large vision-language models

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.CV 3 cs.MM 1

years

2026 4

verdicts

UNVERDICTED 4

roles

background 1

polarities

background 1

representative citing papers

Unlocking Dense Metric Depth Estimation in VLMs

cs.CV · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

DepthVLM converts a standard VLM into a dense metric depth predictor by attaching a lightweight head and training under unified vision-text supervision, outperforming prior VLMs and some pure vision models on a new indoor-outdoor benchmark.

Anisotropic Modality Align

cs.MM · 2026-05-08 · unverdicted · novelty 6.0

Modality representations share dominant semantic geometry but have an anisotropic residual gap; AnisoAlign corrects source representations boundedly using target geometry for unpaired alignment.

citing papers explorer

Showing 4 of 4 citing papers.

  • UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs cs.CV · 2026-05-12 · unverdicted · none · ref 14

    VLMs show a resolution illusion on UHR Earth observation imagery where higher resolution does not improve micro-target perception; UHR-Micro benchmark and MAP-Agent address this via evidence-centered active inspection.

  • Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation cs.CV · 2026-05-18 · unverdicted · none · ref 18

    Vision-OPD transfers an MLLM's privileged regional perception to its full-image policy through on-policy token-level self-distillation, yielding competitive results on fine-grained visual benchmarks.

  • Unlocking Dense Metric Depth Estimation in VLMs cs.CV · 2026-05-15 · unverdicted · none · ref 30 · 2 links

    DepthVLM converts a standard VLM into a dense metric depth predictor by attaching a lightweight head and training under unified vision-text supervision, outperforming prior VLMs and some pure vision models on a new indoor-outdoor benchmark.

  • Anisotropic Modality Align cs.MM · 2026-05-08 · unverdicted · none · ref 8

    Modality representations share dominant semantic geometry but have an anisotropic residual gap; AnisoAlign corrects source representations boundedly using target geometry for unpaired alignment.