Grounded language-image pre-training

Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, Yiwu Zhong, Lijuan Wang, Lu Yuan, Lei Zhang, Jenq-Neng Hwang, et al · 2022

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

cs.CV · 2023-11-21 · conditional · novelty 6.0

A new 1.2M-caption dataset generated via GPT-4V improves LMMs on MME and MMBench by 222.8/22.0/22.3 and 2.7/1.3/1.5 points respectively when used for supervised fine-tuning.

SynSpill: Improved Industrial Spill Detection With Synthetic Data

cs.CV · 2025-08-13 · conditional · novelty 5.0

SynSpill synthetic data enables PEFT of VLMs and boosts YOLO and DETR detectors for industrial spill detection, making their performance comparable after training.

citing papers explorer

Showing 2 of 2 citing papers.

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

cs.CV · 2023-11-21 · conditional · none · ref 28

SynSpill: Improved Industrial Spill Detection With Synthetic Data

cs.CV · 2025-08-13 · conditional · none · ref 25
