super hub Mixed citations

HuggingFace's Transformers: State-of-the-art Natural Language Processing

Anthony Moi, Clement Delangue, Julien Chaumond, Lysandre Debut, Thomas Wolf, Victor Sanh · 2019 · cs.CL · arXiv 1910.03771

Mixed citation behavior. Most common role is background (54%).

135 Pith papers citing it

Background 54% of classified citations

open full Pith review browse 135 citing papers more from Anthony Moi arXiv PDF

abstract

Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. \textit{Transformers} is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. \textit{Transformers} is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at \url{https://github.com/huggingface/transformers}.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 14 method 8 other 4

citation-polarity summary

background 14 use method 8 unclear 4

claims ledger

abstract Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. \textit{Transformers} is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the art Transformer architectures under a unified API. Backing this library is a curated collection of pretrain

authors

Anthony Moi Clement Delangue Julien Chaumond Lysandre Debut Thomas Wolf Victor Sanh

co-cited works

representative citing papers

EnergyAgentBench: Benchmarking LLM Agents on Live Energy Infrastructure Data

econ.EM · 2026-05-13 · accept · novelty 8.0

EnergyAgentBench is a new benchmark with 70 task variants that evaluates LLM agents on live energy data for datacenter siting, long-horizon optimization, and causal grid diagnosis.

Sieve: Dynamic Expert-Aware PIM Acceleration for Evolving Mixture-of-Experts Models

cs.AR · 2026-05-11 · conditional · novelty 8.0

Sieve dynamically schedules MoE experts across GPU and PIM hardware to handle bimodal token distributions, achieving 1.3x to 1.6x gains in throughput and interactivity over static prior PIM systems on three large models.

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

cs.AI · 2026-05-07 · unverdicted · novelty 8.0

VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

cs.LG · 2024-07-05 · conditional · novelty 8.0

TTT layers treat the hidden state as a trainable model updated at test time, allowing linear-complexity sequence models to scale perplexity reduction with context length unlike Mamba.

RULER: What's the Real Context Size of Your Long-Context Language Models?

cs.CL · 2024-04-09 · accept · novelty 8.0

RULER shows most long-context LMs drop sharply in performance on complex tasks as length and difficulty increase, with only half maintaining results at 32K tokens.

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

cs.CL · 2023-04-03 · accept · novelty 8.0

Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.

Editing Models with Task Arithmetic

cs.LG · 2022-12-08 · accept · novelty 8.0

Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

Discovering Latent Knowledge in Language Models Without Supervision

cs.CL · 2022-12-07 · conditional · novelty 8.0

An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.

Polar: A Benchmark for Evaluating Political Bias in LLMs

cs.CL · 2026-06-11 · unverdicted · novelty 7.0

Polar is a new cross-context benchmark showing LLM political bias measurements are not fixed but vary with country, issue, model, and language.

Small LLMs for Biomedical Claim Verification: Cost-Effective Fine-Tuning, Structural Dataset Shortcuts, and Cross-Domain Generalization

cs.CL · 2026-06-11 · unverdicted · novelty 7.0

Fine-tuned Mistral-7B via QLoRA achieves up to 12% higher F1 than GPT-4o on biomedical claim verification with 1008 examples, identifies a structural shortcut in SciFact, and shows robust cross-domain transfer from sound data.

M*: A Modular, Extensible, Serving System for Multimodal Models

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

M* introduces the Walk Graph abstraction to serve arbitrary compositions of multimodal model components and reports latency and throughput gains over vLLM-Omni and other baselines on text-to-image, text-to-speech, and robotic planning workloads.

A Unifying Framework for Concept-Based Representational Similarity

cs.LG · 2026-06-08 · unverdicted · novelty 7.0

A unifying framework decomposes concept alignment into instance-wise and distributional translation and concept consistency, introduces the InterVenchA benchmark, and shows that joint optimization via CoSAE recovers strong alignment even with 0.1% paired data.

Closed-Form Spectral Regularization for Multi-Task Model Merging

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

Iterative solvers in layer-wise model merging act as spectral regularizers on an ill-posed interference operator; closed-form SWUDI and adaptive SWUDI-A match or exceed SOTA merging accuracy with 28-72x wall-clock speedup.

Not All Errors Are Equal: A Systematic Study of Error Propagation in Large Language Model Inference

cs.DC · 2026-06-01 · unverdicted · novelty 7.0

A new fault-injection framework enables a systematic empirical study that produces 17 takeaways on error propagation in LLM inference and four software-only mitigation directions.

Test-Time Training Undermines Safety Guardrails

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

Test-time training enables three new threat models that raise jailbreak attack success rates on language models to averages of 95% and 93% ASR@10 under LoRA for few-shot and generation-phase attacks across model families.

Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs

cs.CV · 2026-05-21 · conditional · novelty 7.0

Video-LLMs exhibit directional motion blindness from a direction binding gap; DeltaDirect projector objective lifts synthetic accuracy to 85.4% and real accuracy by 21.9 points while preserving other video capabilities.

Interference-Aware Multi-Task Unlearning

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

Introduces interference-aware multi-task unlearning with task-aware gradient projection and instance-level gradient orthogonalization, reducing interference scores by 30.3% and 52.9% on vision benchmarks.

TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

TokAlign++ learns token alignments between LLM vocabularies from monolingual representations to enable faster adaptation, better text compression, and effective token-level distillation across 15 languages with minimal steps.

EdgeFlowerTune: Evaluating Federated LLM Fine-Tuning Under Realistic Edge System Constraints

cs.CL · 2026-05-09 · unverdicted · novelty 7.0

EdgeFlowerTune is a real-device benchmark that jointly assesses model quality and system costs for federated LLM fine-tuning on edge hardware using three protocols: Quality-under-Budget, Cost-to-Target, and Robustness.

How Many Iterations to Jailbreak? Dynamic Budget Allocation for Multi-Turn LLM Evaluation

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

DAPRO provides the first dynamic, theoretically guaranteed way to allocate interaction budgets across test cases for bounding time-to-event in multi-turn LLM evaluations, achieving tighter coverage than static conformal survival methods.

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.

Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression

cs.LG · 2026-04-30 · unverdicted · novelty 7.0

Auto-FlexSwitch achieves efficient dynamic model merging by decomposing task vectors into sparse masks, signs, and scalars, then making the compression learnable via gating and adaptive bit selection with KNN-based retrieval.

SecureRouter: Encrypted Routing for Efficient Secure Inference

cs.CR · 2026-04-16 · unverdicted · novelty 7.0

SecureRouter accelerates secure transformer inference by 1.95x via an encrypted router that selects input-adaptive models from an MPC-optimized pool with negligible accuracy loss.

Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models

cs.CL · 2026-04-12 · unverdicted · novelty 7.0

Agreeableness in AI personas reliably predicts sycophantic behavior in 9 of 13 tested language models.

citing papers explorer

Showing 35 of 135 citing papers.

EinSort: Sorting is All We Need for Tensorizing LLM cs.LG · 2026-06-07 · unverdicted · none · ref 86 · internal anchor
Sorting tensor indices enables an adaptive tensorization method that discovers low-rank structure in LLM weights and KV caches, yielding better reconstruction quality than baselines.
Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models cs.AI · 2026-06-03 · unverdicted · none · ref 42 · internal anchor
SAGE-PTQ is a graph-guided ultra-low-bit PTQ framework that achieves 1.03 average weight bits and 0.004 scaling bits per matrix on LLMs while reporting lower perplexity and memory use than BiLLM and PB-LLM.
Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt cs.CL · 2026-06-01 · unverdicted · none · ref 50 · internal anchor
Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.
Mapping Whisper Representations to Human ECoG Responses with Interpretable Time-Resolved Neural Encoding q-bio.NC · 2026-06-01 · unverdicted · none · ref 131 · internal anchor
The paper introduces a time-resolved neural encoder combining Whisper embeddings with recurrent temporal modeling and soft attention to predict ECoG responses, finding strongest alignment in intermediate layers and anatomically coherent phoneme organization in electrodes.
DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts cs.AI · 2026-05-31 · unverdicted · none · ref 17 · internal anchor
DAG-MoE uses a lightweight module to learn DAG-based structural aggregation of selected experts, expanding combination space and enabling intra-layer multi-step reasoning compared to standard weighted-sum MoE.
QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer cs.CV · 2026-05-29 · unverdicted · none · ref 43 · internal anchor
QVGGT uses per-block mixed-precision analysis, outlier token filtering with PCA compensation, and task-aware scale search to achieve near-lossless W4A16 quantization of VGGT with 3-4.9x memory savings and 2.8x speedup.
SemStruct: Contextualizing Semantic Embeddings with Structural Information for Schema Matching cs.LG · 2026-05-29 · unverdicted · none · ref 38 · internal anchor
SemStruct models tables as heterogeneous graphs with GNNs on frozen PLM embeddings to incorporate row co-occurrences for schema matching and reports SOTA results on Valentine and SOTAB-SM benchmarks.
Personalized Generative Models for Contextual Debiasing cs.CV · 2026-05-25 · unverdicted · none · ref 59 · internal anchor
DecoupleGen personalizes diffusion models to create images with uncommon contexts for debiasing object recognition, yielding consistent gains on scene classification tasks.
Latency Analysis and Optimization of Alpamayo 1 via Efficient Trajectory Generation cs.AI · 2026-05-09 · unverdicted · none · ref 30 · internal anchor
Redesigning Alpamayo 1 to single-reasoning and optimizing diffusion action generation cuts inference latency by 69.23% while preserving trajectory diversity and prediction quality.
Reasoning Compression with Mixed-Policy Distillation cs.AI · 2026-05-09 · unverdicted · none · ref 36 · internal anchor
Mixed-Policy Distillation transfers concise reasoning behavior from larger to smaller LLMs by having the teacher compress student-generated trajectories, cutting token usage up to 27% while raising benchmark scores.
EGAD: Entropy-Guided Adaptive Distillation for Token-Level Knowledge Transfer cs.CL · 2026-05-03 · unverdicted · none · ref 37 · internal anchor
EGAD adaptively distills LLM knowledge at the token level by using entropy to create a curriculum from low- to high-entropy tokens, adjust temperature, and switch between logits-only and feature-based branches.
GiVA: Gradient-Informed Bases for Vector-Based Adaptation cs.CL · 2026-04-23 · unverdicted · none · ref 62 · internal anchor
GiVA uses gradients to initialize vector adapters so they match LoRA performance at eight times lower rank while keeping extreme parameter efficiency.
Towards Better Static Code Analysis Reports: Sentence Transformer-based Filtering of Non-Actionable Alerts cs.SE · 2026-04-20 · conditional · none · ref 45 · internal anchor
STAF applies sentence embeddings from transformers to classify SCA findings, reaching 89% F1 and beating prior filters by 11% within projects and 6% across projects.
Reconstruction of a 3D wireframe from a single line drawing via generative depth estimation cs.CV · 2026-04-15 · unverdicted · none · ref 30 · internal anchor
A latent diffusion model conditioned on line drawings estimates dense depth to reconstruct 3D wireframes, reporting 5.3% average depth error after training on over one million pairs.
FedSpy-LLM: Towards Scalable and Generalizable Data Reconstruction Attacks from Gradients on LLMs cs.CR · 2026-04-07 · unverdicted · none · ref 48 · internal anchor
FedSpy-LLM uses gradient decomposition and iterative alignment to reconstruct larger batches and longer sequences of training data from LLM gradients in federated settings, including with PEFT methods.
Ranking Reasoning LLMs under Test-Time Scaling cs.LG · 2026-03-11 · accept · none · ref 9 · internal anchor
Many established statistical ranking techniques produce orderings of reasoning LLMs under test-time scaling that closely match a Bayesian gold standard, with mean Kendall tau_b of 0.93-0.95 at full trials and best methods reaching 0.86 at single trials.
A Simple Method to Enhance Pre-trained Language Models with Speech Tokens for Classification cs.CL · 2025-12-08 · unverdicted · none · ref 34 · internal anchor
Lasso-selected speech tokens enhance text LLMs for multimodal classification by reducing long audio sequences to task-relevant features via self-supervised adaptation.
Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection cs.CV · 2025-12-05 · conditional · none · ref 33 · internal anchor
An automated annotation pipeline combining Grounded DINO and SAM produces usable bounding boxes and masks for weakly supervised defect detection in shearography.
Different types of syntactic agreement recruit the same units within large language models cs.CL · 2025-12-03 · unverdicted · none · ref 65 · internal anchor
Different types of syntactic agreement recruit overlapping units within LLMs, indicating that agreement forms a meaningful functional category across English, Russian, Chinese, and structurally similar languages.
Genome-Factory: A Library for Tuning, Deploying, and Interpreting Genomic Foundation Models q-bio.GN · 2025-09-13 · conditional · none · ref 29 · internal anchor
Genome-Factory is an open-source Python library that integrates data pipelines, model tuning, inference, benchmarks, and biological interpretation for genomic foundation models.
Shared representations in brains and models reveal a two-route cortical organization during scene perception q-bio.NC · 2025-07-18 · unverdicted · none · ref 83 · internal anchor
RSA on 7T fMRI during natural scene viewing identifies ventromedial and lateral occipitotemporal representational routes for scene context versus animate content, with differential alignment to vision and language models.
RAP: Runtime Adaptive Pruning for LLM Inference cs.LG · 2025-05-22 · unverdicted · none · ref 32 · internal anchor
RAP is a reinforcement learning framework for runtime-adaptive pruning of LLMs that jointly optimizes model weights and KV-cache usage under varying memory budgets.
Diffusion Models are Secretly Zero-Shot 3DGS Harmonizers cs.CV · 2025-03-09 · unverdicted · none · ref 34 · internal anchor
D3DR optimizes inserted 3DGS objects with a DDS-inspired diffusion objective plus a new personalization step to match scene lighting, reporting 2 dB PSNR gain over prior methods.
The Platonic Representation Hypothesis cs.LG · 2024-05-13 · unverdicted · none · ref 118 · internal anchor
Representations learned by large AI models are converging toward a shared statistical model of reality.
Accelerating Reproducible Research in Synthetic EHR Generation cs.LG · 2026-06-05 · unverdicted · none · ref 17 · internal anchor
A new end-to-end benchmarking framework unifies synthetic EHR generators for longitudinal ICD codes with standardized training and architecture-agnostic evaluation including bootstrapped confidence intervals.
KliniskVestBERT: BERT Model Specialised to Norwegian Clinical Texts cs.CL · 2026-06-01 · unverdicted · none · ref 23 · internal anchor
Three BERT models are further pre-trained on Norwegian clinical notes and discharge summaries, then shown to outperform their base models on synthetic clinical benchmarks and real-world tasks.
OpenSOC-AI: Democratizing Security Operations with Parameter Efficient LLM Log Analysis cs.CR · 2026-04-29 · unverdicted · none · ref 10 · internal anchor
LoRA fine-tuning of TinyLlama-1.1B on 450 SOC examples produces 68% threat classification accuracy and 58% severity accuracy on 50 held-out logs, with full code, weights, and data released.
Are vision-language models ready to zero-shot replace supervised classification models in agriculture? cs.CV · 2025-12-17 · unverdicted · none · ref 1 · internal anchor
Zero-shot VLMs reach at most 62% accuracy on agricultural classification tasks while supervised models like YOLO11 perform markedly higher, indicating they are not ready to replace task-specific systems.
LayerScope: Predictive Cross-Layer Scheduling for Efficient Multi-Batch MoE Inference on Legacy Servers cs.LG · 2025-09-28 · unverdicted · none · ref 50 · internal anchor
PreScope combines a layer-aware activation predictor, cross-layer prefetch scheduling, and asynchronous I/O to deliver 141% higher throughput and 74.6% lower latency for MoE inference on legacy hardware.
Adapting Vision-Language Models for Neutrino Event Classification in High-Energy Physics cs.LG · 2025-09-10 · unverdicted · none · ref 28 · internal anchor
Fine-tuned LLaMA 3.2 VLM outperforms CNN baselines on neutrino event classification while adding interpretability via language reasoning.
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities cs.LG · 2024-08-14 · accept · none · ref 251 · internal anchor
The paper introduces a new taxonomy for model merging methods and reviews their applications in LLMs, MLLMs, continual learning, multi-task learning, and other subfields while outlining open challenges.
Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation q-fin.CP · 2025-02-24 · unverdicted · none · ref 34 · internal anchor
CausalGAN + SAC RL pipeline generates synthetic bond yield data; fine-tuned Qwen2.5-7B LLM produces trading signals, with reported MAE 0.103, 60% profit rate, and LLM score 3.37/5.
Search for Truth from Reasoning: A Dynamic Representation Editing Framework for Steering LLM Trajectories cs.AI · 2026-06-26 · unreviewed · ref 28 · internal anchor
Attention at Rest Stays at Rest: Breaking Visual Inertia for Cognitive Hallucination Mitigation cs.CV · 2026-04-02 · unreviewed · ref 48 · internal anchor
Page image classification for content-specific data processing cs.IR · 2025-07-11 · unreviewed · ref 19 · internal anchor

HuggingFace's Transformers: State-of-the-art Natural Language Processing

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer