AgroBench: Vision-Language Model Benchmark in Agriculture

Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka, Masaki Onishi, Yoshitaka Ushiku · 2025 · arXiv 2507.20519

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

cs.CV · 2026-06-06 · unverdicted · novelty 7.0

Sci-Rho is a dynamic multilingual visually-grounded symbolic benchmark for STEM problems that reveals robustness gaps in current VLMs between average and worst-case performance.

AgroVG: A Large-Scale Multi-Source Benchmark for Agricultural Visual Grounding

cs.CV · 2026-05-21 · accept · novelty 7.0

AgroVG is a new multi-source benchmark for agricultural visual grounding formulated as generalized set prediction, with protocols for box and mask grounding across single-target, multi-target, and target-absent queries from six object families.

CropVLM: A Domain-Adapted Vision-Language Model for Open-Set Crop Analysis

cs.CV · 2026-05-05 · unverdicted · novelty 5.0

CropVLM is a domain-adapted vision-language model that achieves 72.51% zero-shot crop classification accuracy and superior open-set detection performance on novel species without retraining.

Are vision-language models ready to zero-shot replace supervised classification models in agriculture?

cs.CV · 2025-12-17 · unverdicted · novelty 4.0

Zero-shot VLMs reach at most 62% accuracy on agricultural classification tasks while supervised models like YOLO11 perform markedly higher, indicating they are not ready to replace task-specific systems.

citing papers explorer

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

cs.CV · 2026-06-06 · unverdicted · none · ref 66

Sci-Rho is a dynamic multilingual visually-grounded symbolic benchmark for STEM problems that reveals robustness gaps in current VLMs between average and worst-case performance.

CropVLM: A Domain-Adapted Vision-Language Model for Open-Set Crop Analysis

cs.CV · 2026-05-05 · unverdicted · none · ref 36

CropVLM is a domain-adapted vision-language model that achieves 72.51% zero-shot crop classification accuracy and superior open-set detection performance on novel species without retraining.

Are vision-language models ready to zero-shot replace supervised classification models in agriculture?

cs.CV · 2025-12-17 · unverdicted · none · ref 14

Zero-shot VLMs reach at most 62% accuracy on agricultural classification tasks while supervised models like YOLO11 perform markedly higher, indicating they are not ready to replace task-specific systems.
