Clevr-math: A dataset for compositional language, visual and mathematical reasoning
7 citing papers are indexed on Pith. Polarity classification is still in progress.
citation-role summary: dataset (1)
citation-polarity summary: use dataset (1)
citing papers explorer
- Dynamic Cross-Modal Prompt Generation for Multimodal Continual Instruction Tuning
  DRAPE generates query-image conditioned prompts on the fly for multimodal continual instruction tuning and reports SOTA results on MCIT benchmarks.
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
  InternVL3-78B sets a new open-source SOTA of 72.2 on MMMU via native joint multimodal pre-training, V2PE, MPO, and test-time scaling, while remaining competitive with proprietary models.
- Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
  InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.
- Targeted Exploration via Unified Entropy Control for Reinforcement Learning
  UEC-RL improves RL reasoning performance in LLMs and VLMs by activating exploration on hard prompts and stabilizing entropy, delivering a 37.9% relative gain over GRPO on Geometry3K.
- MAny: Merge Anything for Multimodal Continual Instruction Tuning
  MAny addresses dual-forgetting in multimodal continual instruction tuning via CPM and LPM merging strategies, delivering up to 8.57% accuracy gains on UCIT benchmarks without additional training.
- Quantifying and Understanding Uncertainty in Large Reasoning Models
  Conformal prediction adapted to reasoning-answer pairs in LRMs yields distribution-free uncertainty sets with finite-sample guarantees, paired with a Shapley explanation method that isolates provably sufficient training subsets and steps.
- How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
  InternVL 1.5 narrows the performance gap to proprietary multimodal models via a stronger transferable vision encoder, dynamic high-resolution tiling, and curated English-Chinese training data.
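As background for the conformal-prediction entry above: the general split-conformal recipe builds prediction sets whose coverage is guaranteed to be at least 1 - alpha without distributional assumptions. A minimal generic sketch follows; it is not the cited paper's method, and the function name, score arrays, and alpha value are all illustrative.

```python
import numpy as np

def split_conformal_sets(cal_scores, test_scores, alpha=0.1):
    """Split conformal prediction (generic sketch, not the paper's method).

    cal_scores:  nonconformity score of the true answer for each held-out
                 calibration example, shape (n,).
    test_scores: nonconformity score of every candidate answer for each
                 test example, shape (m, k).
    Returns a boolean mask of shape (m, k); True means the candidate is
    kept in the prediction set. Coverage >= 1 - alpha holds marginally.
    """
    n = len(cal_scores)
    # Finite-sample corrected quantile level: ceil((n+1)(1-alpha)) / n.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    # "higher" interpolation keeps the guarantee conservative.
    qhat = np.quantile(cal_scores, min(q_level, 1.0), method="higher")
    # Keep every candidate whose score does not exceed the threshold.
    return test_scores <= qhat

# Toy usage with synthetic scores (illustrative only).
rng = np.random.default_rng(0)
cal = rng.uniform(size=1000)      # calibration nonconformity scores
test = rng.uniform(size=(5, 4))   # 5 questions, 4 candidate answers each
sets = split_conformal_sets(cal, test, alpha=0.2)
print(sets.shape)
```

The cited work adapts this idea to reasoning-answer pairs in large reasoning models; the sketch only shows the underlying threshold-and-filter mechanism.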