MTEB: Massive Text Embedding Benchmark

Niklas Muennighoff, Nouamane Tazi, Loïc Magne, Nils Reimers · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

LRD framework with Frenet, NRS, and GFMI metrics shows layer-wise structure in 31 models provides usable signal for model selection and pruning on MTEB tasks.

MLAIRE: Multilingual Language-Aware Information Retrieval Evaluation Protocol

cs.IR · 2026-05-08 · unverdicted · novelty 6.0

MLAIRE is a protocol that evaluates multilingual retrievers on both semantic accuracy and query-language preference using parallel passages and new metrics like LPR and Lang-nDCG, showing that standard metrics hide distinct behavioral differences among retrievers.

Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

cs.CL · 2025-07-31 · conditional · novelty 6.0

Causal2Vec prepends a BERT-generated contextual token to decoder-only LLMs and pools its hidden state with the EOS token to reach new SOTA on MTEB among public-data-trained embedding models.

citing papers explorer

Showing 3 of 3 citing papers.

Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs cs.LG · 2026-05-12 · unverdicted · none · ref 48
LRD framework with Frenet, NRS, and GFMI metrics shows layer-wise structure in 31 models provides usable signal for model selection and pruning on MTEB tasks.
MLAIRE: Multilingual Language-Aware Information Retrieval Evaluation Protocol cs.IR · 2026-05-08 · unverdicted · none · ref 13
MLAIRE is a protocol that evaluates multilingual retrievers on both semantic accuracy and query-language preference using parallel passages and new metrics like LPR and Lang-nDCG, showing that standard metrics hide distinct behavioral differences among retrievers.
Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token cs.CL · 2025-07-31 · conditional · none · ref 5
Causal2Vec prepends a BERT-generated contextual token to decoder-only LLMs and pools its hidden state with the EOS token to reach new SOTA on MTEB among public-data-trained embedding models.
