Vision language models are confused tourists, 2025

Patrick Amadeus Irawan, Ikhlasul Akmal Hanif, Muhammad Dehan Al Kautsar, Genta Indra Winata, Fajri Koto, Alham Fikri Aji · 2025 · arXiv 2511.17004

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

cs.CV · 2026-06-06 · unverdicted · novelty 7.0

Sci-Rho is a dynamic multilingual visually-grounded symbolic benchmark for STEM problems that reveals robustness gaps in current VLMs between average and worst-case performance.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems · cs.CV · 2026-06-06 · unverdicted · none · ref 57
