AstroVisBench: A code benchmark for scientific computing and visualization in astronomy. arXiv preprint arXiv:2505.20538, 2025
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- Raiven: LLM-Based Visualization Authoring via Domain-Specific Language Mediation
  Raiven mediates LLM visualization authoring via a formally defined DSL that unifies scientific and information visualization, producing deterministic, verifiable code from metadata-only inputs.
- Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches
  A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.
- VESTA: Visual Exploration with Statistical Tool Agents
  VESTA introduces dynamic tool creation for VLMs that outperforms static-tool and no-tool baselines on distribution fitting, time series, and astronomy tasks in the new DAWN benchmark.
- QASM-Eval: A Dataset to Train and Evaluate LLMs on OpenQASM-3 Beyond Quantum Circuits
  Introduces QASM-Eval, the first dataset targeting OpenQASM-3 hardware-facing features for LLM training and evaluation, with an extended verifier for syntax, states, and timelines.