Multimodal-CoT achieves state-of-the-art on ScienceQA by using a two-stage process that incorporates vision into chain-of-thought rationale generation for models under 1 billion parameters.
Deep residual learning for image recognition
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2023 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
Multimodal Chain-of-Thought Reasoning in Language Models
Multimodal-CoT achieves state-of-the-art on ScienceQA by using a two-stage process that incorporates vision into chain-of-thought rationale generation for models under 1 billion parameters.