Toward robust hyper-detailed image captioning: A multiagent approach and dual evaluation metrics for factuality and coverage. arXiv preprint arXiv:2412.15484, 2024

Saehyung Lee, Seunghyun Yoon, Trung Bui, Jing Shi, Sungroh Yoon · 2024 · arXiv 2412.15484

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

CapRL++ applies reinforcement learning with verifiable rewards to dense image and video captioning by scoring captions via the accuracy of a vision-free LLM answering MCQs from the caption alone.

Steer Where It Matters: Token-Level Visual-Sensitivity Steering for LVLMs Hallucination Mitigation

cs.CV · 2026-06-02 · unverdicted · novelty 6.0

TLVS mitigates hallucinations in LVLMs via token-level extraction and visual-sensitivity-adaptive steering applied only at critical decoding steps.

citing papers explorer

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning · cs.CV · 2026-06-08 · unverdicted · none · ref 32
CapRL++ applies reinforcement learning with verifiable rewards to dense image and video captioning by scoring captions via the accuracy of a vision-free LLM answering MCQs from the caption alone.
Steer Where It Matters: Token-Level Visual-Sensitivity Steering for LVLMs Hallucination Mitigation · cs.CV · 2026-06-02 · unverdicted · none · ref 8
TLVS mitigates hallucinations in LVLMs via token-level extraction and visual-sensitivity-adaptive steering applied only at critical decoding steps.
