2 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Multimodal Data Curation Through Ranked Retrieval
  Symmetric Nucleus Subsampling and Expert Embedding Engine reduce modality gaps in multimodal embeddings by over 90% and outperform baselines in data curation for downstream models.
- NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
  NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.