arxiv: 2605.02035 · 2 revisions
VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation