Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al · 2021

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

LLaVA-Video: Video Instruction Tuning With Synthetic Data

cs.CV · 2024-10-03 · unverdicted · novelty 6.0

LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

LLaVA-Video: Video Instruction Tuning With Synthetic Data cs.CV · 2024-10-03 · unverdicted · none · ref 42
LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.

Learning transferable visual models from natural language supervision

fields

years

verdicts

representative citing papers

citing papers explorer