MetaEmbed trains fixed learnable Meta Tokens to produce granularity-organized multi-vector embeddings that support test-time scaling in multimodal retrieval.
Modality curation: Building universal embeddings for advanced multimodal information retrieval
4 Pith papers cite this work. Polarity classification is still indexing.
4
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
SSA-ME uses saliency-aware modeling to reduce visual neglect and semantic drift, achieving SOTA results on the MMEB benchmark for multimodal retrieval.
citing papers explorer
-
MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
MetaEmbed trains fixed learnable Meta Tokens to produce granularity-organized multi-vector embeddings that support test-time scaling in multimodal retrieval.
-
Combating Visual Neglect and Semantic Drift in Large Multimodal Models for Enhanced Cross-Modal Retrieval
SSA-ME uses saliency-aware modeling to reduce visual neglect and semantic drift, achieving SOTA results on the MMEB benchmark for multimodal retrieval.
- Beyond Chain-of-Thought: Rewrite as a Universal Interface for Generative Multimodal Embeddings
- FreeRet: MLLMs as Training-Free Retrievers