ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
15 Pith papers cite this work.
Citing papers
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
T5 casts all NLP tasks as text-to-text generation, systematically explores pre-training choices, and reaches strong performance on summarization, QA, classification and other tasks via large-scale training on the Colossal Clean Crawled Corpus.
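The text-to-text framing is concrete enough to show in a few lines: every task, from translation to classification, becomes a prefixed input string mapped to an output string, with no task-specific heads. A minimal sketch using the Hugging Face transformers library and the public t5-small checkpoint (the task prefix is one of those used in the paper):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is "prefix + input text" -> "output text".
inputs = tokenizer(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Classification is handled the same way: the model emits the label as literal text (e.g. "cola sentence: ..." maps to "acceptable" or "unacceptable").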
- On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
An attack aligns differently shuffled intermediate activations from secure Transformer inference queries to recover model weights with low error using roughly one dollar of queries.
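As a toy illustration of why shuffling alone is fragile: once shuffled activation rows are re-aligned with the inputs that produced them, a linear layer's weights fall out of ordinary least squares. The alignment below is shortcut with an oracle row match for brevity; recovering that alignment from real secure-inference queries is the substance of the paper's attack.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 32
W = rng.normal(size=(d, d))       # secret layer weights
X = rng.normal(size=(n, d))       # attacker-chosen inputs
Y = X @ W.T                       # intermediate activations

# Defense: activations come back with their rows shuffled.
Y_shuffled = Y[rng.permutation(n)]

# Oracle alignment stand-in: match each shuffled row back to its source row.
match = {tuple(np.round(y, 9)): i for i, y in enumerate(Y)}
order = [match[tuple(np.round(y, 9))] for y in Y_shuffled]
X_aligned = X[order]

# With rows re-aligned, least squares recovers the weights almost exactly.
W_hat, *_ = np.linalg.lstsq(X_aligned, Y_shuffled, rcond=None)
print(np.max(np.abs(W_hat.T - W)))  # ~1e-13
```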
- ADE: Adaptive Dictionary Embeddings -- Scaling Multi-Anchor Representations to Large Language Models
ADE scales multi-anchor word representations to transformers via Vocabulary Projection, Grouped Positional Encoding, and context-aware reweighting, using 98.7% fewer trainable parameters than DeBERTa-v3-base while matching or exceeding it on two text-classification benchmarks.
- Empirical Insights of Test Selection Metrics under Multiple Testing Objectives and Distribution Shifts
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
- Bangla Key2Text: Text Generation from Keywords for a Low Resource Language
Bangla Key2Text releases 2.6M keyword-text pairs and demonstrates that fine-tuned mT5 and BanglaT5 outperform zero-shot LLMs on keyword-conditioned Bangla text generation.
- Entities as Retrieval Signals: A Systematic Study of Coverage, Supervision, and Evaluation in Entity-Oriented Ranking
Entity signals cover only 19.7% of relevant documents on Robust04 and no configuration among 443 systems improves MAP by more than 0.05 in open-world evaluation, despite gains when entities are pre-restricted.
- Compiling Code LLMs into Lightweight Executables
Ditto quantizes Code LLMs with K-Means codebooks and compiles inference via LLVM-BLAS replacement to deliver up to 10.5x faster, 6.4x smaller, and 10.5x lower-energy execution on commodity hardware while losing only 0.27% pass@1 accuracy.
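The quantization half of this is the classic K-Means codebook idea: cluster the weight values, then store one small integer code per weight plus a tiny table of float centroids. A minimal sketch with scikit-learn; Ditto's compilation pipeline and kernels are not represented here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
weights = rng.normal(size=1024).astype(np.float32)  # a flattened weight tensor

k = 16  # 16 centroids -> 4-bit codes per weight
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(weights.reshape(-1, 1))
codebook = km.cluster_centers_.ravel()              # k floats, stored once
codes = km.labels_.astype(np.uint8)                 # one small index per weight

dequantized = codebook[codes]                       # reconstruction at inference
print("mean abs error:", float(np.abs(dequantized - weights).mean()))
```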
- EVA-CLIP: Improved Training Techniques for CLIP at Scale
EVA-CLIP delivers improved CLIP training recipes that yield 82.0% zero-shot ImageNet-1K accuracy for a 5B-parameter model after only 9 billion samples.
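For context, the zero-shot evaluation itself is model-agnostic: CLIP-style classification reduces to cosine similarity between one image embedding and one text embedding per class prompt, so the recipe changes affect training, not this inference step. A sketch with stand-in embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)          # stand-in for the image encoder output
text_embs = rng.normal(size=(1000, 512))  # one "a photo of a {class}" prompt per class

image_emb /= np.linalg.norm(image_emb)
text_embs /= np.linalg.norm(text_embs, axis=1, keepdims=True)

logits = 100.0 * text_embs @ image_emb    # temperature-scaled cosine similarities
print(int(np.argmax(logits)))             # predicted class index
```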
- HuggingFace's Transformers: State-of-the-art Natural Language Processing
Hugging Face releases an open-source Python library that supplies a unified API and pretrained weights for major Transformer architectures used in natural language processing.
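The unified API in practice: a single entry point resolves a default pretrained checkpoint, its tokenizer, and the task's pre- and post-processing.

```python
from transformers import pipeline

# One call covers many tasks ("sentiment-analysis", "ner",
# "question-answering", ...); a default pretrained model is downloaded.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers makes state-of-the-art NLP easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]
```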
- Automatic Reflection Level Classification in Hungarian Student Essays
Classical machine learning models slightly outperform Hungarian transformer models at classifying reflection levels in student essays (71% vs. 68% average score), though the transformers handle rare classes better.
- ESsEN: Training Compact Discriminative Vision-Language Transformers in a Low-Resource Setting
ESsEN is a parameter-efficient two-tower vision-language transformer that matches larger models on discriminative tasks after training end-to-end with limited data and resources.
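The two-tower pattern is compact enough to sketch: separate image and text encoders project into a shared space and train with a symmetric contrastive loss over in-batch pairs. This is a generic sketch of the pattern, not ESsEN's actual architecture; the encoders are assumed to return pooled feature vectors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoTower(nn.Module):
    def __init__(self, image_encoder, text_encoder, dim=256):
        super().__init__()
        self.image_encoder, self.text_encoder = image_encoder, text_encoder
        self.image_proj = nn.LazyLinear(dim)  # project each tower to a shared space
        self.text_proj = nn.LazyLinear(dim)

    def forward(self, images, texts):
        img = F.normalize(self.image_proj(self.image_encoder(images)), dim=-1)
        txt = F.normalize(self.text_proj(self.text_encoder(texts)), dim=-1)
        logits = img @ txt.t() / 0.07           # temperature-scaled similarities
        targets = torch.arange(len(logits))     # matched pairs sit on the diagonal
        return (F.cross_entropy(logits, targets)
                + F.cross_entropy(logits.t(), targets)) / 2
```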
- Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision
A supervision construction procedure generates explicit support and controlled non-support examples (counterfactual and topic-related negatives) without manual annotation, producing verifiers that demonstrate genuine evidence dependence in radiology tasks.
- Detecting Alarming Student Verbal Responses using Text and Audio Classifier
A hybrid text-plus-audio classifier framework is introduced to identify potentially troubling student responses by analyzing both what is said and how it is said.
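A common realization of "what is said plus how it is said" is late fusion: encode the transcript and the audio separately, concatenate the feature vectors, and classify the result. A generic sketch; the fusion style and dimensions here are assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, hidden=256, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + audio_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_feats, audio_feats):
        # text_feats: e.g. a sentence embedding of the transcript ("what is said")
        # audio_feats: e.g. prosodic/spectral features ("how it is said")
        return self.head(torch.cat([text_feats, audio_feats], dim=-1))
```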
- LLMs Struggle with Abstract Meaning Comprehension More Than Expected
On SemEval-2021 Task 4, LLMs struggle with abstract meaning comprehension more than fine-tuned models do, and a new bidirectional attention classifier yields modest accuracy gains of 3-4%.
- Large Language Models: A Survey
The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.