scBench-Long is a benchmark with 21 evaluations where the strongest AI model-harness pair succeeds on 25.4% of long-horizon single-cell biology tasks.
M., Mathur, M., Soderberg, C
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Introduces SpatialBench-Long benchmark with 24 evaluations on spatial biology datasets from PDAC, glioblastoma, lung adenocarcinoma and optic nerve systems, reporting top model performance at 8/72 runs (11.1%).
Modifies Gibbs sampler for GP state-space models, introduces CFA measurement structure, and validates software via simulation-based calibration to enable reliable learning of nonlinear latent dynamics.
citing papers explorer
-
Verifiable Benchmarking of Long-Horizon Spatial Biology
Introduces SpatialBench-Long benchmark with 24 evaluations on spatial biology datasets from PDAC, glioblastoma, lung adenocarcinoma and optic nerve systems, reporting top model performance at 8/72 runs (11.1%).