arXiv preprint arXiv:2106.01969
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- AstroAlertBench: Evaluating the Accuracy, Reasoning, and Honesty of Multimodal LLMs in Astronomical Classification
AstroAlertBench evaluates multimodal LLMs on astronomical classification accuracy, reasoning, and honesty using real ZTF alerts, revealing that high accuracy often diverges from self-assessed reasoning quality.
- Fast Rates in $\alpha$-Potential Games via Regularized Mirror Descent
OPMD achieves the first fast Õ(1/n) rate for offline Nash equilibrium learning in α-potential games via a new reference-anchored coverage framework.