arxiv: 2509.02555 · v2 · pith:XG557LSSnew · submitted 2025-09-02 · 💻 cs.LG · cs.AI · cs.NE

Surrogate Benchmarks for Model Merging Optimization

classification 💻 cs.LG cs.AI cs.NE
keywords merging model optimization performance benchmarks hyperparameters merged models

Model merging techniques aim to integrate the abilities of multiple models into a single model. Most model merging techniques have hyperparameters, and their settings affect the performance of the merged model. Because several existing works show that tuning hyperparameters in model merging can enhance the merging outcome, developing hyperparameter optimization algorithms for model merging is a promising direction. However, this optimization process is computationally expensive, particularly when merging LLMs. In this work, we develop surrogate benchmarks for optimization of the merging hyperparameters to enable algorithm development and performance comparison at low cost. We define two search spaces and collect data samples to construct surrogate models that predict the performance of a merged model from a hyperparameter configuration. We demonstrate that our benchmarks can predict the performance of merged models well and simulate optimization algorithm behaviors.
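The workflow the abstract describes (collect samples, fit a surrogate, then run an optimizer against the surrogate instead of the real merge) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-dimensional search space, the synthetic scoring function `toy_merged_model_score`, the k-nearest-neighbor surrogate `KNNSurrogate`, and the random-search optimizer are all hypothetical stand-ins.

```python
import math
import random


def toy_merged_model_score(h):
    """Hypothetical stand-in for 'merge the models, then evaluate'.

    Purely synthetic: the score peaks at 1.0 for h = (0.3, 0.7).
    """
    return 1.0 - sum((w - t) ** 2 for w, t in zip(h, (0.3, 0.7)))


def collect_samples(n, rng):
    """Evaluate the expensive objective once at n random hyperparameter points."""
    samples = []
    for _ in range(n):
        h = (rng.random(), rng.random())
        samples.append((h, toy_merged_model_score(h)))
    return samples


class KNNSurrogate:
    """Inverse-distance-weighted k-NN regressor (an assumed surrogate choice)."""

    def __init__(self, samples, k=5):
        self.samples = samples
        self.k = k

    def predict(self, h):
        # Take the k stored samples closest to the query point.
        nearest = sorted(self.samples, key=lambda s: math.dist(s[0], h))[: self.k]
        # Closer samples get larger weights; the epsilon avoids division by zero.
        weights = [1.0 / (math.dist(x, h) + 1e-9) for x, _ in nearest]
        return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def random_search(objective, budget, rng):
    """Baseline optimizer: keep the best of `budget` random candidates."""
    best_h, best_score = None, -math.inf
    for _ in range(budget):
        h = (rng.random(), rng.random())
        score = objective(h)
        if score > best_score:
            best_h, best_score = h, score
    return best_h, best_score


rng = random.Random(0)
surrogate = KNNSurrogate(collect_samples(500, rng))
# The optimizer only queries the cheap surrogate, never the real objective.
best_h, best_pred = random_search(surrogate.predict, budget=200, rng=rng)
```

In this sketch the expensive objective is called only while collecting samples; any optimizer can then be benchmarked against `surrogate.predict` at negligible cost, which is the property the abstract attributes to surrogate benchmarks.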

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LLM Evolution as an Industry-Scale Ecosystem: A Lifecycle Perspective on Continual Learning

    cs.LG · 2026-06 · unverdicted · novelty 5.0

    The paper reformulates industrial continual learning for LLMs as a closed-loop ecosystem problem, identifies three core challenges, and organizes solutions around five lifecycle design principles.