Colorswap: A color and word order dataset for multimodal evaluation. arXiv preprint arXiv:2402.04492.
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2025 (2). Verdicts: unverdicted (2).
citing papers explorer
- Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
Introduces the first active learning framework for unaligned multimodal data that selects alignments using uncertainty and diversity to cut annotation costs by up to 40% on benchmarks while preserving accuracy.
- Test-Time Matching: Unlocking Compositional Reasoning in Multimodal Models
Introduces a group matching score for better evaluation of compositional reasoning and the Test-Time Matching (TTM) algorithm for unsupervised self-improvement in multimodal models, achieving state-of-the-art gains that include surpassing GPT-4.1 and estimated human performance.