citation dossier

Gemini embedding: Generalizable embeddings from gemini

Jinhyuk Lee, Feiyang Chen, Sahil Dua, Daniel Cer, Madhuri Shanbhogue, Iftekhar Naim, Gustavo Hernández Ábrego, Zhe Li, Kaifeng Chen, Henrique Schechter Vera, Xiaoqi Ren, Shanfeng Zhang, Daniel Salz, Michael Boratko, Jay Han, Blair Chen, Shu · 2025 · arXiv 2503.07891

17Pith papers citing it

17reference links

cs.CLtop field · 11 papers

UNVERDICTEDtop verdict bucket · 17 papers

This arXiv-backed work is queued for full Pith review when it crosses the high-inbound sweep. That review runs reader · skeptic · desk-editor · referee · rebuttal · circularity · lean confirmation · RS check · pith extraction.

read on arXiv PDF

why this work matters in Pith

Pith has found this work in 17 reviewed papers. Its strongest current cluster is cs.CL (11 papers). The largest review-status bucket among citing papers is UNVERDICTED (17 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.

representative citing papers

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.

Embedding-based In-Context Prompt Training for Enhancing LLMs as Text Encoders

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

EPIC trains LLMs to treat continuous embeddings as in-context prompts, yielding state-of-the-art text embedding performance on MTEB with or without prompts at inference and lower compute.

Why Mean Pooling Works: Quantifying Second-Order Collapse in Text Embeddings

cs.CL · 2026-04-30 · unverdicted · novelty 7.0

Modern text encoders resist second-order collapse under mean pooling because token embeddings concentrate tightly within texts, and this resistance correlates with stronger downstream performance.

Semantic Recall for Vector Search

cs.IR · 2026-04-22 · unverdicted · novelty 7.0

Semantic Recall is a new evaluation metric for approximate nearest neighbor search that focuses only on semantically relevant results, with Tolerant Recall as a proxy when relevance labels are unavailable.

Crowded in B-Space: Calibrating Shared Directions for LoRA Merging

cs.CL · 2026-04-18 · unverdicted · novelty 7.0

Pico reduces LoRA merge interference by calibrating over-shared directions in the B matrix before merging, yielding 3.4-8.3 point accuracy gains and sometimes beating joint training.

Task-Adaptive Embedding Refinement via Test-time LLM Guidance

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.

Topic Is Not Agenda: A Citation-Community Audit of Text Embeddings

cs.IR · 2026-05-08 · unverdicted · novelty 6.0

Embeddings retrieve same-subfield papers at 45-52% but same-agenda papers at only 15-21%; citation rerank reaches 57-59% on agenda queries.

A Survey of Reasoning-Intensive Retrieval: Progress and Challenges

cs.IR · 2026-04-30 · unverdicted · novelty 6.0

A survey that categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.

FLARE: Task-agnostic embedding model evaluation through a normalization process

cs.LG · 2026-04-19 · unverdicted · novelty 6.0

FLARE scores embedding models labellessly via normalized log-likelihood, achieving 0.90 Spearman correlation with supervised benchmarks and stable performance in dimensions over 3500 where prior methods collapse.

CLSGen: A Dual-Head Fine-Tuning Framework for Joint Probabilistic Classification and Verbalized Explanation

cs.CL · 2026-04-13 · unverdicted · novelty 6.0

CLSGen is a dual-head LLM fine-tuning framework that enables joint probabilistic classification and verbalized explanation generation without catastrophic forgetting of generative capabilities.

EgoSelf: From Memory to Personalized Egocentric Assistant

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

EgoSelf uses graph-based memory of user interactions to derive personalized profiles and predict future behaviors for egocentric assistants.

FLiP: Towards understanding and interpreting multimodal multilingual sentence embeddings

cs.CL · 2026-04-20 · unverdicted · novelty 5.0

FLiP recovers more than 75% lexical content from pretrained sentence embeddings across languages and modalities, outperforming non-factorized baselines and exposing intrinsic biases.

Understanding Performance Gap Between Parallel and Sequential Sampling in Large Reasoning Models

cs.CL · 2026-04-07 · unverdicted · novelty 5.0

Lack of exploration from conditioning on prior answers is the primary reason parallel sampling outperforms sequential sampling in large reasoning models.

BLUEmed: Retrieval-Augmented Multi-Agent Debate for Clinical Error Detection

cs.CL · 2026-04-12 · unverdicted · novelty 4.0

BLUEmed combines hybrid RAG with structured multi-agent debate and a safety filter to detect terminology substitution errors in clinical notes, reaching 69.13% accuracy under few-shot prompting and outperforming single-agent and debate-only baselines.

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

cs.CL · 2026-01-08 · unverdicted · novelty 4.0

Qwen3-VL-Embedding-8B achieves state-of-the-art performance with a 77.8 overall score on the MMEB-V2 multimodal embedding benchmark.

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

cs.CL · 2025-06-05 · unverdicted · novelty 4.0

Qwen3 Embedding models in 0.6B-8B sizes achieve state-of-the-art results on MTEB and retrieval tasks including code, cross-lingual, and multilingual retrieval through unsupervised pre-training, supervised fine-tuning, and model merging on Qwen3 backbones.

Benchmarking LLMs on the Massive Sound Embedding Benchmark (MSEB)

cs.SD · 2026-05-06 · unverdicted · novelty 3.0

LLMs exhibit a persistent modality gap versus specialized audio encoders on MSEB tasks, with no conclusive evidence favoring audio-native over cascaded architectures.

citing papers explorer

Showing 17 of 17 citing papers.

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding cs.CL · 2026-05-06 · unverdicted · none · ref 17
TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.
Embedding-based In-Context Prompt Training for Enhancing LLMs as Text Encoders cs.CL · 2026-05-02 · unverdicted · none · ref 48
EPIC trains LLMs to treat continuous embeddings as in-context prompts, yielding state-of-the-art text embedding performance on MTEB with or without prompts at inference and lower compute.
Why Mean Pooling Works: Quantifying Second-Order Collapse in Text Embeddings cs.CL · 2026-04-30 · unverdicted · none · ref 7
Modern text encoders resist second-order collapse under mean pooling because token embeddings concentrate tightly within texts, and this resistance correlates with stronger downstream performance.
Semantic Recall for Vector Search cs.IR · 2026-04-22 · unverdicted · none · ref 17
Semantic Recall is a new evaluation metric for approximate nearest neighbor search that focuses only on semantically relevant results, with Tolerant Recall as a proxy when relevance labels are unavailable.
Crowded in B-Space: Calibrating Shared Directions for LoRA Merging cs.CL · 2026-04-18 · unverdicted · none · ref 16
Pico reduces LoRA merge interference by calibrating over-shared directions in the B matrix before merging, yielding 3.4-8.3 point accuracy gains and sometimes beating joint training.
Task-Adaptive Embedding Refinement via Test-time LLM Guidance cs.CL · 2026-05-12 · unverdicted · none · ref 28
Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.
Topic Is Not Agenda: A Citation-Community Audit of Text Embeddings cs.IR · 2026-05-08 · unverdicted · none · ref 24
Embeddings retrieve same-subfield papers at 45-52% but same-agenda papers at only 15-21%; citation rerank reaches 57-59% on agenda queries.
A Survey of Reasoning-Intensive Retrieval: Progress and Challenges cs.IR · 2026-04-30 · unverdicted · none · ref 33
A survey that categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.
FLARE: Task-agnostic embedding model evaluation through a normalization process cs.LG · 2026-04-19 · unverdicted · none · ref 2
FLARE scores embedding models labellessly via normalized log-likelihood, achieving 0.90 Spearman correlation with supervised benchmarks and stable performance in dimensions over 3500 where prior methods collapse.
CLSGen: A Dual-Head Fine-Tuning Framework for Joint Probabilistic Classification and Verbalized Explanation cs.CL · 2026-04-13 · unverdicted · none · ref 7
CLSGen is a dual-head LLM fine-tuning framework that enables joint probabilistic classification and verbalized explanation generation without catastrophic forgetting of generative capabilities.
EgoSelf: From Memory to Personalized Egocentric Assistant cs.CV · 2026-04-21 · unverdicted · none · ref 23
EgoSelf uses graph-based memory of user interactions to derive personalized profiles and predict future behaviors for egocentric assistants.
FLiP: Towards understanding and interpreting multimodal multilingual sentence embeddings cs.CL · 2026-04-20 · unverdicted · none · ref 33
FLiP recovers more than 75% lexical content from pretrained sentence embeddings across languages and modalities, outperforming non-factorized baselines and exposing intrinsic biases.
Understanding Performance Gap Between Parallel and Sequential Sampling in Large Reasoning Models cs.CL · 2026-04-07 · unverdicted · none · ref 14
Lack of exploration from conditioning on prior answers is the primary reason parallel sampling outperforms sequential sampling in large reasoning models.
BLUEmed: Retrieval-Augmented Multi-Agent Debate for Clinical Error Detection cs.CL · 2026-04-12 · unverdicted · none · ref 33
BLUEmed combines hybrid RAG with structured multi-agent debate and a safety filter to detect terminology substitution errors in clinical notes, reaching 69.13% accuracy under few-shot prompting and outperforming single-agent and debate-only baselines.
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking cs.CL · 2026-01-08 · unverdicted · none · ref 14
Qwen3-VL-Embedding-8B achieves state-of-the-art performance with a 77.8 overall score on the MMEB-V2 multimodal embedding benchmark.
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models cs.CL · 2025-06-05 · unverdicted · none · ref 6
Qwen3 Embedding models in 0.6B-8B sizes achieve state-of-the-art results on MTEB and retrieval tasks including code, cross-lingual, and multilingual retrieval through unsupervised pre-training, supervised fine-tuning, and model merging on Qwen3 backbones.
Benchmarking LLMs on the Massive Sound Embedding Benchmark (MSEB) cs.SD · 2026-05-06 · unverdicted · none · ref 23
LLMs exhibit a persistent modality gap versus specialized audio encoders on MSEB tasks, with no conclusive evidence favoring audio-native over cascaded architectures.

Gemini embedding: Generalizable embeddings from gemini

why this work matters in Pith

fields

years

verdicts

representative citing papers

citing papers explorer