Scaling down to scale up: A guide to parameter-efficient fine-tuning. arXiv preprint arXiv:2303.15647
15 Pith papers cite this work. Polarity classification is still indexing.
roles
background 2
polarities
background 2
citing papers explorer
- MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks
  MASCing uses an LSTM surrogate and optimized steering masks to enable flexible, inference-time control over MoE expert routing for safety objectives, improving jailbreak defense and content generation success rates substantially across multiple models.
- CLIPoint3D: Language-Grounded Few-Shot Unsupervised 3D Point Cloud Domain Adaptation
  CLIPoint3D is the first CLIP-based framework for few-shot unsupervised 3D point cloud domain adaptation that reports 3-16% accuracy gains on PointDA-10 and GraspNetPC-10.
- Combining pre-trained models via localized model averaging
  Localized model averaging with covariate-dependent weights achieves asymptotic optimality and weight consistency for combining pre-trained models under a general loss framework.
- Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks
  Diagonal plus Low-Rank (DLoR) neural networks achieve universal approximation for general activations by additive or multiplicative decompositions of full-rank transformations.
- Are Large Language Models Economically Viable for Industry Deployment?
  Small LLMs under 2B parameters achieve better economic break-even, energy efficiency, and hardware density than larger models on legacy GPUs for industrial tasks.
- One Prompt, Many Sounds: Modeling Listener Variability in LLM-Based Equalization
  LLMs using in-context learning and fine-tuning on listener experiment data generate equalization settings that align better with population preferences than random sampling or static presets.
- PEFT-Bench: A Parameter-Efficient Fine-Tuning Methods Benchmark
  PEFT-Bench is a standardized end-to-end benchmark for 7 PEFT methods across 27 NLP datasets on autoregressive LLMs, accompanied by the PSCP metric that penalizes based on trainable parameters, inference speed, and training memory.
- Adapting Automotive Aerodynamics Surrogates to New Vehicle Families via Transfer Learning
  LoRA adapters enable a 61.47M-parameter aerodynamics Transformer pretrained on four vehicle families to adapt to a held-out fifth family with 20 samples, reaching R²=0.85 and outperforming full fine-tuning and from-scratch training with 3x more data.
- PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models
  PEFT-Factory supplies a ready-to-use, extensible codebase that unifies 19 PEFT methods and evaluation pipelines for fine-tuning large autoregressive language models.
- On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization
  MeZO enables larger models for on-device fine-tuning by estimating gradients via forward passes only, with theoretical size estimates and numerical results showing accuracy benefits when wall-clock time is sufficient.
- CLIP-SVD: Efficient and Interpretable Vision-Language Adaptation via Singular Values
  CLIP-SVD performs parameter-efficient adaptation of CLIP by fine-tuning singular values from SVD of weight matrices, reporting SOTA few-shot accuracy on 21 datasets plus a language-based interpretability analysis.
- A Survey on Large Language Models for Code Generation
  A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark comparisons.
- Entry-level guide to the use of large language models for medical research
  A tutorial guide outlining phases for integrating LLMs into medical research, including task formulation, model choice, prompt engineering, fine-tuning, and deployment with ethical considerations.
- Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution
- Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches