hub Mixed citations

UniVG-R1: Reasoning guided universal visual grounding with reinforcement learning. arXiv preprint arXiv:2505.14231

Mixed citation behavior. Most common role is background (60%).

16 Pith papers citing it
Background 60% of classified citations

citation-role summary

background: 3 · baseline: 2

years

2026: 13 · 2025: 3

representative citing papers

Leveraging Latent Visual Reasoning in Silence

cs.CV · 2026-05-18 · conditional · novelty 6.0

Latent visual reasoning improves multimodal models via training effects even without using latent tokens at inference, enabled by an attention-based RL reward that promotes interaction with text tokens.

AdaTooler-V: Adaptive Tool-Use for Images and Videos

cs.CV · 2025-12-18 · conditional · novelty 6.0

AdaTooler-V trains MLLMs to adaptively use vision tools via AT-GRPO reinforcement learning and new datasets, reaching 89.8% on V* and outperforming GPT-4o.

APRVOS: 1st Place Winner of 5th PVUW MeViS-Audio Track

cs.SD · 2026-04-20 · unverdicted · novelty 3.0

A staged pipeline using ASR transcription, visual existence verification, Sa2VA coarse segmentation, and agent-guided SAM3 refinement won first place in the PVUW MeViS-Audio track by decomposing audio-conditioned Ref-VOS into sequential verification and refinement steps.

citing papers explorer

Showing 16 of 16 citing papers.