pith. machine review for the scientific record.

Baseline reference

Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Baseline reference. 67% of citing Pith papers use this work as a benchmark or comparison.

20 Pith papers citing it
Baseline: 67% of classified citations


citation-role summary: dataset 4 · background 1 · method 1


representative citing papers

ReflectCAP: Detailed Image Captioning with Reflective Memory

cs.AI · 2026-04-14 · unverdicted · novelty 6.0

ReflectCAP distills model-specific hallucination and oversight patterns into Structured Reflection Notes that steer LVLMs toward more factual and complete image captions, reaching the Pareto frontier on factuality-coverage trade-offs.

LLaVA-OneVision: Easy Visual Task Transfer

cs.CV · 2024-08-06 · unverdicted · novelty 5.0

LLaVA-OneVision is the first single open LMM to achieve strong performance across single-image, multi-image, and video scenarios simultaneously, with transfer capabilities across those scenarios.

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

cs.CV · 2024-08-03 · conditional · novelty 5.0

MiniCPM-Llama3-V 2.5 delivers GPT-4V-level multimodal performance on phones through optimizations in architecture, pretraining, and alignment.

citing papers explorer

Showing 1 of 1 citing paper after filters.