pith. machine review for the scientific record.

arxiv: 1803.11175 · v2 · submitted 2018-03-29 · 💻 cs.CL

Recognition: unknown

Universal Sentence Encoder

Brian Strope, Chris Tar, Daniel Cer, Mario Guajardo-Cespedes, Nan Hua, Nicole Limtiaco, Noah Constant, Ray Kurzweil, Rhomni St. John, Sheng-yi Kong, Steve Yuan, Yinfei Yang, Yun-hsuan Sung

Authors on Pith: no claims yet
classification 💻 cs.CL
keywords transfer learning · models · sentence · word embeddings · encoding · performance
read the original abstract

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at detecting model bias. Our pre-trained sentence encoding models are made freely available for download and on TF Hub.
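The abstract notes the pre-trained encoders are published on TF Hub. As a rough sketch of how such embeddings are consumed downstream — where the TF Hub module path in the comments and the toy vectors below are illustrative assumptions, not values from the paper — sentence comparison typically reduces to cosine similarity between embedding vectors:

```python
import numpy as np

# Loading the released encoder would look roughly like this (requires the
# tensorflow and tensorflow_hub packages; the module path is an assumption,
# check TF Hub for the current release):
#   import tensorflow_hub as hub
#   embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
#   vectors = embed(["How old are you?", "What is your age?"]).numpy()

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-in vectors: two hypothetical paraphrase embeddings and one
# unrelated-sentence embedding (real encoder output is much higher-dimensional).
v_q1 = np.array([0.9, 0.1, 0.2])
v_q2 = np.array([0.8, 0.2, 0.3])
v_other = np.array([-0.1, 0.9, -0.5])

print(cosine_similarity(v_q1, v_q2))     # paraphrase pair: high similarity
print(cosine_similarity(v_q1, v_other))  # unrelated pair: low similarity
```

Because the similarity is just a dot product over precomputed vectors, transfer tasks can reuse one encoding pass across many pairwise comparisons.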

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 12 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

    cs.CL 2026-05 unverdicted novelty 8.0

    REALISTA optimizes continuous combinations of valid editing directions in latent space to produce realistic adversarial prompts that elicit hallucinations more effectively than prior methods, including on large reason...

  2. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

    cs.CL 2019-08 unverdicted novelty 8.0

    Sentence-BERT adapts BERT with siamese and triplet networks to produce sentence embeddings for efficient cosine-similarity comparisons, cutting computation time from hours to seconds on similarity search while matchin...

  3. Enhancing Healthcare Search Intent Recognition with Query Representation Learning and Session Context

    cs.IR 2026-05 unverdicted novelty 7.0

    Clustering-based query representations with a novel multi-intent loss and a concordance rate metric improve healthcare search intent classification on two real-world log datasets.

  4. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

    cs.RO 2022-04 accept novelty 7.0

    SayCan combines an LLM's high-level semantic knowledge with robot skill value functions to select only feasible actions, enabling completion of abstract natural-language instructions on a real mobile manipulator.

  5. Real vs. Semi-Simulated: Rethinking Evaluation for Treatment Effect Estimation

    cs.LG 2026-05 unverdicted novelty 6.0

    Counterfactual metrics on semi-simulated benchmarks fail to identify the treatment effect estimators preferred by observable metrics on real datasets, with simple meta-learners outperforming specialized causal models.

  6. Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

    cs.CL 2026-05 unverdicted novelty 6.0

    Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.

  7. Semantics-Aware Hierarchical Token Communication: Clustering, Bit Mapping, and Power Allocation

    eess.SP 2026-04 unverdicted novelty 6.0

    H-TokCom groups tokens by semantic similarity and protects cluster-level bits with higher power, raising semantic similarity from 0.206 to 0.279 at 3 dB SNR on COCO data.

  8. REZE: Representation Regularization for Domain-adaptive Text Embedding Pre-finetuning

    cs.CL 2026-04 unverdicted novelty 6.0

    REZE controls representation shifts in contrastive pre-finetuning of text embeddings via eigenspace decomposition of anchor-positive pairs and adaptive soft-shrinkage on task-variant directions.

  9. TR-EduVSum: A Turkish-Focused Dataset and Consensus Framework for Educational Video Summarization

    cs.CL 2026-04 unverdicted novelty 6.0

    Presents TR-EduVSum dataset and AutoMUP consensus framework for generating gold-standard summaries from multiple human annotations of Turkish educational videos.

  10. Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings

    cs.CL 2026-05 unverdicted novelty 5.0

    Embeddings reliably capture authorial stylistic features in French literary texts, and these signals persist after LLM rewriting while showing model-specific patterns.

  11. LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy

    cs.LG 2026-05 unverdicted novelty 5.0

    ACSE estimates LLM prompt uncertainty via adaptive clustering of semantic entropy across multiple responses and uses conformal prediction to bound error rates on accepted answers with distribution-free guarantees.

  12. Exploring Ethical Concerns of Mobile Applications from App Reviews: A Literature Survey

    cs.SE 2026-04 accept novelty 4.0

    A systematic literature survey of 37 studies synthesizes methods for identifying ethical concerns from app reviews and proposes a research agenda focused on automated detection.