super hub Mixed citations

LoRA: Low-Rank Adaptation of Large Language Models

Edward J. Hu, Phillip Wallis, Shean Wang, Yelong Shen, Yuanzhi Li, Zeyuan Allen-Zhu · 2021 · cs.CL · arXiv 2106.09685

Mixed citation behavior. Most common role is background (57%).

563 Pith papers citing it

Background 57% of classified citations

open full Pith review browse 563 citing papers more from Edward J. Hu arXiv PDF

abstract

An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at https://github.com/microsoft/LoRA.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 59 method 38 baseline 2 dataset 2

citation-polarity summary

background 58 use method 36 unclear 3 baseline 2 use dataset 2

claims ledger

abstract An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, grea

authors

Edward J. Hu Phillip Wallis Shean Wang Yelong Shen Yuanzhi Li Zeyuan Allen-Zhu

co-cited works

representative citing papers

Decodable Is Not Grounded: A Vision-Ablation Arbiter for VLM Spatial Reasoning

cs.CV · 2026-06-30 · unverdicted · novelty 8.0

A blank-image ablation test reveals that high probe accuracy on VLM spatial reasoning frequently reflects priors or inverted signs rather than image grounding, with horizontal grounded, vertical prior, and depth inverted.

GPTQ-intrinsic LoRA: A Near-optimal Algorithm for Low-precision Quantization with Low-rank Adaptation

cs.LG · 2026-05-31 · unverdicted · novelty 8.0

GPTQ-intrinsic LoRA augments GPTQ with intrinsic low-rank compensation via Hessian modification to achieve layer-wise reconstruction bounds that match information-theoretic lower bounds under structural assumptions.

On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models

cs.CR · 2026-05-10 · conditional · novelty 8.0

Image-to-3D models successfully generate harmful geometries in most cases with under 0.3% caught by commercial filters; existing safeguards are weak but a stacked defense cuts harmful outputs to under 1% at 11% false-positive cost.

PhysInOne: Visual Physics Learning and Reasoning in One Suite

cs.CV · 2026-04-10 · unverdicted · novelty 8.0

PhysInOne is a new dataset of 2 million videos across 153,810 dynamic 3D scenes covering 71 physical phenomena, shown to improve AI performance on physics-aware video generation, prediction, property estimation, and motion transfer.

FETAL-GAUGE: A Benchmark for Assessing Vision-Language Models in Fetal Ultrasound

cs.CV · 2025-12-25 · unverdicted · novelty 8.0

Fetal-Gauge benchmark shows state-of-the-art vision-language models reach only 55% accuracy on fetal ultrasound tasks, well below clinical needs and highlighting the requirement for domain-adapted models.

ZO-Act: Efficient Zeroth-Order Fine-Tuning via One-Shot Activation-Informed Low-Rank Subspaces

cs.LG · 2026-07-01 · unverdicted · novelty 7.0

ZO-Act performs zeroth-order fine-tuning of LLMs by optimizing lightweight coefficient matrices inside one-shot activation-derived low-rank subspaces.

Supersede: Diagnosing and Training the Memory-Update Gap in LLM Agents

cs.CL · 2026-06-25 · unverdicted · novelty 7.0

The supersession gap in LLM agents—failing to use current facts and discard superseded ones—is a distinct failure not fixed by scale or memory size, but improvable via RL training on a new environment.

Contextualizing Biological Language Models across Modalities via Logit-Space Contrastive Alignment

cs.LG · 2026-06-17 · unverdicted · novelty 7.0

LOGICA adds context to pretrained biological LMs via logit-space contrastive alignment with gated adapters, improving AUC on held-out drug-resistance mutation ranking from ~0.55 to ~0.65 while preserving token likelihoods.

When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis

cs.CL · 2026-06-09 · unverdicted · novelty 7.0

Fragility, the activation noise level causing probe accuracy collapse, reveals evolving lexical-to-compositional moral encoding, layer robustness gradients, and fine-tuning differences invisible to saturated probing accuracy.

GNSS-FM: A Self-Supervised Foundation Model for Daily GNSS Displacement Time Series

physics.geo-ph · 2026-06-05 · unverdicted · novelty 7.0

GNSS-FM is a self-supervised foundation model for GNSS displacement time series that outperforms task-specific baselines on 90-day forecasting and seismic step localization after pretraining on global station data.

Robotic Policy Adaptation via Weight-Space Meta-Learning

cs.RO · 2026-06-05 · unverdicted · novelty 7.0

WIZARD meta-learns to map task evidence directly to LoRA updates for VLA policies, reporting up to 14x gains on unseen tasks in simulation and real-robot experiments without test-time optimization or action labels.

Entropy as a Structural Prior: How a Log-Barrier on DiT Belief Space Drives Musical Diversity and Development

cs.SD · 2026-06-05 · unverdicted · novelty 7.0

An entropy-based log-barrier on DiT outputs acts as an online curriculum in supervised diffusion fine-tuning, producing higher thematic development and textural diversity than standard training on MusicCaps.

WHU-Infra3D: A Full-stack Multi-modal Dataset and Benchmark for 3D Roadside Infrastructure Inventory

cs.CV · 2026-06-03 · unverdicted · novelty 7.0

WHU-Infra3D is a new large-scale multi-modal dataset and benchmark for 3D roadside infrastructure inventory, providing over 175k 2D boxes, thousands of 3D instances, and 181k annotations across five core tasks while exposing cross-city gaps and long-tailed defect vulnerabilities.

Toward Calibrated, Fair, and accurate Deepfake Detection

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

Face-Feature Tuning is a label-free logit remapping method that reduces FPR/TPR gaps across groups in deepfake detection while preserving overall accuracy.

Low-Rank Decay for Grokking in Scale-Invariant Transformers: A Spectral-Geometric View

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

Low-Rank Decay induces effective-rank collapse in Query/Key matrices and widens the grokking regime on modular arithmetic tasks in scale-invariant Transformers.

The Regularizing Power of Language-Training Deepfake Detectors

cs.CV · 2026-05-29 · unverdicted · novelty 7.0

A dual-encoder deepfake detector pairs a frozen specialist with a LoRA-tuned MLLM, trained first via binary alignment then via RL to reward explain-then-classify behavior, yielding improved cross-dataset performance and interpretability.

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

cs.CV · 2026-05-29 · unverdicted · novelty 7.0

GGT-100K is a 103k-pair LQ-HQ dataset generated via MFMs to enhance real-world generalization of image restoration models.

Shortcuts in the Tail: Debiasing via Post-Hoc Spectral Compression of Fine-Tuning Updates

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

Post-hoc truncation of the tail of the SVD of ΔW reduces spurious-group gaps by up to 5× with <2 pp accuracy loss across 0.5B–7B models and four benchmarks.

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

PInVerify is a new offline embodied benchmark for active instance verification that supplies multi-view captures and 6-sector navigation topology, with MLLM baselines reaching 85.6% after fine-tuning but showing no reliable benefit from tested next-best-view strategies.

Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)?

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

Frontier VLMs overconfidently answer spatial questions under occlusion (~30% accuracy) and perspective ambiguity (<10% accuracy) instead of abstaining, and often fail to select helpful additional views.

Bounded Behavioral Indistinguishability for Black-Box LLM Distillation

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

Introduces (ε,q,t,A)-behavioral indistinguishability and shows via Qwen/Llama experiments that LoRA distillation boosts semantic similarity but leaves detectable behavioral differences under adversarial evaluation.

Measuring Alignment-Induced Activation Shifts Correctly: A Template-Controlled Difference-in-Differences Protocol

cs.LG · 2026-05-23 · unverdicted · novelty 7.0

Introduces a template-controlled difference-in-differences protocol that corrects chat-template confounding when measuring alignment-induced activation shifts in LLMs and recovers the refusal direction with higher fidelity.

GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

GenRecon lifts object-level generative priors to scene-scale reconstruction by chunking scenes and using projection-based conditioning on multi-view features, claiming 16% better results than prior methods.

EgoInteract: Synthetic Egocentric Videos Generation for Interaction Understanding and Anticipation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0 · 2 refs

EgoInteract is a new simulator for generating synthetic egocentric videos with precise control over camera, body, hand, and object motions, producing a dataset that improves model performance on real-world benchmarks for temporal action segmentation, next-active object detection, interaction Anticip

citing papers explorer

Showing 38 of 38 citing papers after filters.

PhysInOne: Visual Physics Learning and Reasoning in One Suite cs.CV · 2026-04-10 · unverdicted · none · ref 42 · internal anchor
PhysInOne is a new dataset of 2 million videos across 153,810 dynamic 3D scenes covering 71 physical phenomena, shown to improve AI performance on physics-aware video generation, prediction, property estimation, and motion transfer.
Likelihood scoring for continuations of mathematical text: a self-supervised benchmark with tests for shortcut vulnerabilities cs.LG · 2026-05-11 · unverdicted · none · ref 12 · 2 links · internal anchor
Presents a likelihood-based benchmark for equation-suffix prediction in technical papers with controls to detect shortcut vulnerabilities in model forecasts.
Omni-Persona: Systematic Benchmarking and Improving Omnimodal Personalization cs.CV · 2026-05-11 · unverdicted · none · ref 24 · internal anchor
Omni-Persona benchmark with 18 tasks shows open-source models have audio-visual grounding gaps, RLVR narrows them but leads to conservative outputs, and scale or recall alone fail as diagnostics.
Reddit2Deezer: A Scalable Dataset for Real-World Grounded Conversational Music Recommendation cs.IR · 2026-05-09 · unverdicted · none · ref 18 · internal anchor
Reddit2Deezer supplies 190k authentic Reddit dialogues grounded in Deezer music entities for scalable conversational music recommendation research.
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning cs.CL · 2026-05-08 · unverdicted · none · ref 10 · internal anchor
MatryoshkaLoRA inserts a crafted diagonal matrix P into LoRA to learn accurate nested low-rank adapters that support dynamic rank selection with minimal performance drop.
RAG-Reflect: Agentic Retrieval-Augmented Generation with Reflections for Comment-Driven Code Maintenance on Stack Overflow cs.SE · 2026-04-24 · unverdicted · none · ref 14 · internal anchor
RAG-Reflect achieves F1=0.78 on valid comment-edit prediction using retrieval-augmented reasoning and self-reflection, outperforming baselines and approaching fine-tuned models without retraining.
Fine-Tuning Small Reasoning Models for Quantum Field Theory cs.LG · 2026-04-21 · unverdicted · none · ref 216 · internal anchor
Small 7B reasoning models were fine-tuned on synthetic and curated QFT problems using RL and SFT, yielding performance gains, error analysis, and public release of data and traces.
UIPress: Bringing Optical Token Compression to UI-to-Code Generation cs.CL · 2026-04-10 · unverdicted · none · ref 14 · internal anchor
UIPress is the first encoder-side learned optical compression method for UI-to-Code that compresses visual tokens to 256, outperforming the uncompressed baseline by 7.5% CLIP score and the best inference-time baseline by 4.6% while delivering 9.1x TTFT speedup.
QLoRA: Efficient Finetuning of Quantized LLMs cs.LG · 2023-05-23 · conditional · none · ref 28 · internal anchor
QLoRA finetunes 4-bit quantized LLMs via LoRA adapters to match full-precision performance while using far less memory, enabling 65B-scale training on single GPUs and producing Guanaco models near ChatGPT level.
EmbGen: Teaching with Reassembled Corpora cs.CL · 2026-05-19 · unverdicted · none · ref 6 · internal anchor
EmbGen creates synthetic QA data by entity decomposition, embedding-based reassembly into clusters, and multi-level sampling with cluster-specific prompts, yielding up to 88.9% higher Binary Accuracy than baselines on heterogeneous datasets under fixed token budgets.
GRAFT: Graph-Tokenized LLMs for Tool Planning cs.LG · 2026-05-12 · unverdicted · none · ref 37 · internal anchor
GRAFT internalizes tool dependency graphs via dedicated special tokens in LLMs and applies on-policy context distillation to achieve higher exact sequence matching and dependency legality than prior external-graph methods.
Filtering Memorization from Parameter-Space in Diffusion Models cs.CV · 2026-05-11 · unverdicted · none · ref 3 · internal anchor
BAF reduces memorization in diffusion LoRAs by filtering spectral channels of the adaptation weights that show weak alignment with the base model's principal subspace.
FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning cs.CL · 2026-05-11 · unverdicted · none · ref 17 · internal anchor
FocuSFT uses an inner optimization loop to adapt fast-weight parameters into a parametric memory that sharpens attention on relevant content, then conditions outer-loop supervised fine-tuning on this representation, yielding gains on long-context benchmarks.
SpaceMind++: Toward Allocentric Cognitive Maps for Spatially Grounded Video MLLMs cs.CV · 2026-05-10 · unverdicted · none · ref 24 · internal anchor
SpaceMind++ adds an explicit voxelized allocentric cognitive map and coordinate-guided fusion to video MLLMs, claiming SOTA on VSI-Bench and improved out-of-distribution generalization on three other 3D benchmarks.
Improving Parameter-Efficient Federated Learning with Differentially Private Refactorization cs.CR · 2026-05-08 · unverdicted · none · ref 8 · internal anchor
FedPower improves the accuracy-privacy tradeoff in differentially private LoRA-based federated learning by reconstructing and clipping full-rank updates then using PowerDP to inject noise before orthonormalization in low-rank factorization.
Beyond the False Trade-off: Adaptive EWC for Stealthy and Generalizable T2I Backdoors cs.LG · 2026-05-08 · unverdicted · none · ref 8 · internal anchor
Cosine-Aware Adaptive EWC dynamically adjusts parameter regularization using cosine semantic utility and adaptive scheduling to achieve higher ASR-fidelity balance and OOD robustness in T2I backdoor attacks.
The Position Curse: LLMs Struggle to Locate the Last Few Items in a List cs.LG · 2026-05-08 · unverdicted · none · ref 10 · internal anchor
LLMs exhibit the Position Curse, with backward position retrieval in lists lagging far behind forward retrieval, showing only partial gains from PosBench fine-tuning.
Can LLMs Predict Polymer Physics Just by Reading Synthesis and Processing Prose? cs.LG · 2026-05-07 · unverdicted · none · ref 9 · internal anchor
PolyLM fine-tunes a 9B-parameter LLM on 185k papers to predict polymer properties from text alone, achieving median R² of 0.74 on 68k held-out samples.
Gradient Extrapolation-Based Policy Optimization cs.LG · 2026-05-07 · unverdicted · none · ref 13 · internal anchor
GXPO approximates longer local lookahead in GRPO training via gradient extrapolation from two optimizer steps using three backward passes total, improving pass@1 accuracy by 1.65-5.00 points over GRPO and delivering up to 4x step speedup.
You Snooze, You Lose: Automatic Safety Alignment Restoration through Neural Weight Translation cs.CR · 2026-05-06 · unverdicted · none · ref 55 · internal anchor
NeWTral is a non-linear weight translation framework using MoE routing that reduces average attack success rate from 70% to 13% on unsafe domain adapters across Llama, Mistral, Qwen, and Gemma models up to 72B while retaining 90% knowledge fidelity.
SIEVES: Selective Prediction Generalizes through Visual Evidence Scoring cs.CV · 2026-04-28 · conditional · none · ref 23 · 2 links · internal anchor
SIEVES improves selective prediction coverage by up to 3x on OOD VQA benchmarks by training a selector to score the quality of visual evidence produced by reasoner models, generalizing across benchmarks and proprietary models without internal access or per-task retraining.
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success cs.RO · 2025-02-27 · accept · none · ref 14 · internal anchor
OpenVLA-OFT fine-tuning boosts LIBERO success rate from 76.5% to 97.1%, speeds action generation 26x, and outperforms baselines on real bimanual dexterous tasks.
Improving Video Generation with Human Feedback cs.CV · 2025-01-23 · unverdicted · none · ref 24 · internal anchor
A human preference dataset and VideoReward model enable Flow-DPO and Flow-NRG to produce smoother, better-aligned videos from text prompts in flow-based generators.
OpenVLA: An Open-Source Vision-Language-Action Model cs.RO · 2024-06-13 · unverdicted · none · ref 27 · internal anchor
OpenVLA achieves 16.5% higher task success than the 55B RT-2-X model across 29 tasks with 7x fewer parameters while enabling effective fine-tuning and quantization without performance loss.
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets cs.CV · 2023-11-25 · conditional · none · ref 45 · internal anchor
Stable Video Diffusion scales latent video diffusion models via text-to-image pretraining, video pretraining on curated data, and high-quality finetuning to produce competitive text-to-video and image-to-video results while enabling motion LoRA and multi-view 3D applications.
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning cs.CV · 2023-06-26 · accept · none · ref 9 · internal anchor
A new dataset of 400k visual instructions including negative examples at three semantic levels reduces hallucinations in models like MiniGPT-4 when used for fine-tuning while improving benchmark performance.
AudioFace: Language-Assisted Speech-Driven Facial Animation with Multimodal Language Models cs.CV · 2026-05-08 · unverdicted · none · ref 20 · internal anchor
AudioFace improves speech-driven facial animation by guiding blendshape prediction with linguistic and articulatory information extracted via multimodal language models.
Bridging Linguistic Gaps: Cross-Lingual Mapping in Pre-Training and Dataset for Enhanced Multilingual LLM Performance cs.CL · 2026-04-12 · unverdicted · none · ref 18 · internal anchor
A new pre-training task that maps languages bidirectionally in embedding space improves machine translation by up to 11.9 BLEU, cross-lingual QA by 6.72 BERTScore points, and understanding accuracy by over 5% over strong baselines.
Measurement of Generative AI Workload Power Profiles for Whole-Facility Data Center Infrastructure Planning eess.SY · 2026-04-08 · unverdicted · none · ref 37 · internal anchor
High-resolution power profiles for AI workloads on H100 GPUs are measured and scaled to whole-facility energy demand using a bottom-up model, with the dataset made public.
Hallucination of Multimodal Large Language Models: A Survey cs.CV · 2024-04-29 · accept · none · ref 66 · internal anchor
The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.
PriorNet: Prior-Guided Engagement Estimation from Face Video cs.CV · 2026-05-05 · unverdicted · none · ref 8 · internal anchor
PriorNet improves engagement estimation from face videos by injecting priors into preprocessing, adaptation, and objective design, showing improvements on multiple benchmarks.
Toward Zero-Egress Psychiatric AI: On-Device LLM Deployment for Privacy-Preserving Mental Health Decision Support cs.AI · 2026-04-20 · unverdicted · none · ref 54 · internal anchor
A cross-platform mobile application deploys an ensemble of quantized open-source LLMs for fully local, DSM-5-aligned psychiatric decision support with claimed accuracy comparable to prior cloud versions.
Mira-Embeddings-V1: Domain-Adapted Semantic Reranking for Recruitment via LLM-Synthesized Data cs.CL · 2026-04-20 · conditional · none · ref 9 · internal anchor
Mira-Embeddings-V1 adapts embeddings for recruitment reranking by synthesizing positive and hard-negative samples with LLMs, then applies JD-JD contrastive and JD-CV triplet training plus a BoundaryHead MLP, lifting Recall@50 from 68.89% to 77.55% and Recall@200 from 0.5969 to 0.7047.
Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content? eess.AS · 2026-03-30 · unverdicted · none · ref 23 · internal anchor
A hierarchical cross-modal fusion architecture with LoRA adapters predicts human perception of AI-dubbed clips at PCC > 0.75 after training on 12k Hindi-English examples and human MOS fine-tuning.
Large Language Models for Multi-Robot Systems: A Survey cs.RO · 2025-02-06 · unverdicted · none · ref 39 · internal anchor
A survey that categorizes LLM uses in multi-robot systems across task allocation, motion planning, action generation, and human interaction, while noting challenges and future research opportunities.
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey cs.LG · 2024-03-21 · accept · none · ref 76 · internal anchor
A comprehensive survey of PEFT algorithms for large models, covering their performance, overhead, applications, and real-world system implementations.
A Survey on the Memory Mechanism of Large Language Model based Agents cs.AI · 2024-04-21 · accept · none · ref 131 · internal anchor
A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.
GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models cs.LG · 2026-04-15 · unreviewed · ref 11 · internal anchor

LoRA: Low-Rank Adaptation of Large Language Models

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer