pith. machine review for the scientific record.

SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning

3 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CV (3)

years: 2026 (3)

verdicts: UNVERDICTED (3)

representative citing papers

Visually-Guided Policy Optimization for Multimodal Reasoning

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

VGPO introduces visual attention compensation and dual-grained advantage re-weighting to reinforce visual focus in vision-language models (VLMs), yielding stronger visual activation and improved performance on multimodal reasoning tasks.
