Transfer learning enables predictions in network biology. Nature, 618(7965):616–624
3 Pith papers cite this work.
Citation summary: years: 2026 (3); roles: background (2); polarities: background (2).
Citing papers
- BioXArena: Benchmarking LLM Agents on Multi-Modal Biomedical Machine Learning Tasks
  BioXArena benchmarks LLM agents on generating end-to-end ML pipelines for 76 multi-modal biomedical tasks, with MLEvolve plus Gemini-3.1-Pro scoring highest at 0.666.
- Prototype Guided Post-pretraining for Single-Cell Representation Learning
  CellRefine adds a marker-gene-guided post-pretraining stage to single-cell models that refines the cell embedding manifold and improves downstream task performance by up to 15%.
- Benchmarking virtual cell models for in-the-wild perturbation response
  A new benchmarking framework shows virtual cell models overestimate performance on standard tests, drop sharply on unseen contexts and perturbations, and produce inconsistent rankings across metrics.