Title resolution pending
6 Pith papers cite this work.
years: 2026 (6)
representative citing papers
citing papers explorer
- MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining
  MixAtlas uses CLIP-based decomposition and Gaussian process optimization on small proxies to discover data mixtures that improve multimodal benchmark performance by up to 17.6% and transfer to larger models with faster convergence.
- Categorical Optimization with Bayesian Anchored Latent Trust Regions for Structural Design under High-Dimensional Uncertainty
  COBALT embeds catalogs into anchored discrete latent graphs, applies random tree decomposition and additive SAAS-GP surrogates to heteroscedastic MC-FEA data, and performs discrete trust-region acquisition to optimize high-dimensional categorical structural designs under aleatoric uncertainty while
- Multi-Label Phase Diagram Prediction in Complex Alloys via Physics-Informed Graph Attention Networks
  Physics-informed graph attention networks predict multi-phase equilibria in Ag-Bi-Cu-Sn alloys with 96% exact-set accuracy on in-domain data and strong generalization to unseen sections.
- Optimal Experimental Design for Reliable Learning of History-Dependent Constitutive Laws
  A Bayesian optimal experimental design framework with Gaussian approximation of expected information gain and surrogate Fisher information enables optimized uniaxial tests that significantly improve identifiability of history-dependent constitutive parameters over random designs.
- Benchmarking on Tasks That Matter: Dataset Selection for Preserving Model Rankings
  Framework for dataset subset selection via clustering, A/D-optimality, and FAFI with bootstrap intervals to preserve model rankings, showing high Spearman correlation (0.95 with 5 datasets) in TSC but limited gains in recommender systems.
- Using Machine Learning to Enhance Hyperparameter Optimization in Pandemic Modeling: Case study of COVID-19 Dynamics in Ghana
  Uses MPRK solvers and WENO post-processing to optimize time-varying hyperparameters in existing COVID-19 models and reports 5-day forecasts within 10% error for a Ghana case study.