arxiv: 2602.09492 · v2 · pith:XRYAC2FZnew · submitted 2026-02-10 · 💻 cs.LG · cs.AI

Beware of the Batch Size: Hyperparameter Bias in Evaluating LoRA

classification 💻 cs.LG cs.AI
keywords size batch lora variants often adaptation approach arise

Low-rank adaptation (LoRA) is a standard approach for fine-tuning large language models, yet its many variants report conflicting empirical gains, often on the same benchmarks. We show that these contradictions arise from a single overlooked factor: the batch size. When properly tuned, vanilla LoRA often matches the performance of more complex variants. We further propose a proxy-based, cost-efficient strategy for batch size tuning, revealing the impact of rank, dataset size, and model capacity on the optimal batch size. Our findings elevate batch size from a minor implementation detail to a first-order design parameter, reconciling prior inconsistencies and enabling more reliable evaluations of LoRA variants.
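
The abstract does not spell out the proxy-based tuning procedure, so the following is only a generic illustration of why batch size deserves its own sweep, not the authors' method. It trains a toy rank-1 adapter (frozen base weights plus a trainable product `b * a`) on synthetic regression data with a fixed learning rate and a fixed number of epochs, then compares final error across batch sizes. The task, the helper names (`make_data`, `train_rank1`, `sweep`), and all hyperparameter values are assumptions made for the sketch.

```python
import random

def make_data(n=256, d=8, seed=0):
    # Synthetic linear regression task (an assumption, not from the paper).
    rng = random.Random(seed)
    w_true = [rng.gauss(0, 1) for _ in range(d)]
    xs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    ys = [sum(w * x for w, x in zip(w_true, row)) for row in xs]
    return xs, ys

def mse(xs, ys, w):
    # Mean squared error of the linear predictor w on the data.
    total = 0.0
    for row, y in zip(xs, ys):
        pred = sum(wj * xj for wj, xj in zip(w, row))
        total += (pred - y) ** 2
    return total / len(ys)

def train_rank1(xs, ys, batch_size, lr=0.01, epochs=5, seed=0):
    # Effective weights are w0 + b * a: a rank-1 update on a frozen base.
    rng = random.Random(seed)
    d = len(xs[0])
    w0 = [0.0] * d                              # frozen base weights
    a = [rng.gauss(0, 0.1) for _ in range(d)]   # trainable, random init
    b = 0.0                                     # trainable, zero init
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            grad_b, grad_a = 0.0, [0.0] * d
            for i in batch:
                x = xs[i]
                pred = sum((w0[j] + b * a[j]) * x[j] for j in range(d))
                err = pred - ys[i]
                grad_b += 2 * err * sum(a[j] * x[j] for j in range(d))
                for j in range(d):
                    grad_a[j] += 2 * err * b * x[j]
            m = len(batch)
            # Minibatch SGD step on the adapter parameters only.
            b -= lr * grad_b / m
            a = [a[j] - lr * grad_a[j] / m for j in range(d)]
    return mse(xs, ys, [w0[j] + b * a[j] for j in range(d)])

def sweep(batch_sizes, **train_kwargs):
    # Same data, seed, learning rate, and epoch budget for every batch size.
    xs, ys = make_data()
    results = {bs: train_rank1(xs, ys, bs, **train_kwargs) for bs in batch_sizes}
    best = min(results, key=results.get)
    return best, results

if __name__ == "__main__":
    best, results = sweep([4, 16, 64])
    for bs, err in results.items():
        print(f"batch_size={bs:3d}  final_mse={err:.4f}")
    print("best batch size:", best)
```

Because the learning rate and epoch count are held fixed here, a smaller batch also means more parameter updates; a fairer comparison would retune the learning rate for each batch size, which is the kind of interaction the abstract warns about.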

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads

    cs.LG 2026-04 unverdicted novelty 7.0

    ALTO accelerates LoRA tuning up to 13.8x by monitoring loss trajectories for early stopping, using fused grouped GEMM with rank-local adapter parallelism, and combining intra- and inter-task scheduling for heterogeneo...

  2. Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning

    cs.LG 2026-03 unverdicted novelty 7.0

    Annotation entropy from contested labels predicts increasing loss during LoRA fine-tuning on NLI tasks, unlike full fine-tuning.