GPT-4o System Card
3 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background (1)
Citation-polarity summary: background (1)
Representative citing papers
LLM-based congestion control reduces latency by up to 50% with less than 0.3% throughput sacrifice versus traditional CCAs in static and dynamic network emulations.
VideoLLaMA 3 uses a vision-centric training paradigm and token-reduction design to reach competitive results on image and video benchmarks.
Citing papers explorer
- CoLVR: Enhancing Exploratory Latent Visual Reasoning via Contrastive Optimization
  CoLVR uses latent contrastive objectives with angle-based perturbation and RL trajectory rewards to increase exploratory visual reasoning in MLLMs, delivering 5-8% gains on VSP, Jigsaw, and MMStar benchmarks.
- CCA Reimagined: An Exploratory Study of Large Language Models for Congestion Control
  LLM-based congestion control reduces latency by up to 50% with less than 0.3% throughput sacrifice versus traditional CCAs in static and dynamic network emulations.
- VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
  VideoLLaMA 3 uses a vision-centric training paradigm and token-reduction design to reach competitive results on image and video benchmarks.