Introduces LOES, a constructive spectral method to select task-discriminative subspaces from intermediate layer embeddings, and GeoReg for enforcing simplicial class geometry during fine-tuning, with reported gains increasing with model depth across modalities.
Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.
TaDSE learns dialogue sentence embeddings via template-guided self-supervised contrastive learning plus synthetic slot-filling augmentation and reports gains on five downstream benchmarks.
BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
CUD reshapes the teacher's predictive distribution before distillation so that students receive calibrated uncertainty signals alongside accuracy, yielding more robust and better-calibrated models on high-cardinality and distribution-shift benchmarks.
citing papers explorer
-
Uncovering the Latent Potential of Deep Intermediate Representations
Introduces LOES, a constructive spectral method to select task-discriminative subspaces from intermediate layer embeddings, and GeoReg for enforcing simplicial class geometry during fine-tuning, with reported gains increasing with model depth across modalities.
-
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.
-
Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings
TaDSE learns dialogue sentence embeddings via template-guided self-supervised contrastive learning plus synthetic slot-filling augmentation and reports gains on five downstream benchmarks.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
-
Trust the uncertain teacher: distilling dark knowledge via calibrated uncertainty
CUD reshapes the teacher's predictive distribution before distillation so that students receive calibrated uncertainty signals alongside accuracy, yielding more robust and better-calibrated models on high-cardinality and distribution-shift benchmarks.