Mind the gap: Understanding the modality gap in multi-modal contrastive representation learning

Victor Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Y. Zou · 2022

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

GeoFlowVLM: Geometry-Aware Joint Uncertainty for Frozen Vision-Language Embedding

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

GeoFlowVLM learns joint distributions of l2-normalized VLM embeddings on the product hypersphere via Riemannian flow matching to expose both aleatoric and epistemic uncertainty through derived entropy and typicality scores.

Topology-Aware Representation Alignment for Semi-Supervised Vision-Language Learning

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

ToMA uses persistent homology on H0-death and lightweight H1-birth edges to align multimodal manifolds, delivering stable gains on remote sensing and consistent benefits on fashion retrieval.

citing papers explorer

GeoFlowVLM: Geometry-Aware Joint Uncertainty for Frozen Vision-Language Embedding

cs.LG · 2026-05-13 · unverdicted · none · ref 23

Topology-Aware Representation Alignment for Semi-Supervised Vision-Language Learning

cs.CV · 2026-04-29 · unverdicted · none · ref 16
