AstroVisBench: A code benchmark for scientific computing and visualization in astronomy. arXiv preprint arXiv:2505.20538, 2025

Sebastian Antony Joseph, Syed Murtaza Husain, Stella S · 2025 · arXiv 2505.20538

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Raiven: LLM-Based Visualization Authoring via Domain-Specific Language Mediation

cs.HC · 2026-04-11 · unverdicted · novelty 7.0

Raiven mediates LLM visualization authoring via a formally defined DSL that unifies scientific and information visualization, producing deterministic, verifiable code from metadata-only inputs.

Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches

cs.AI · 2026-05-31 · unverdicted · novelty 6.0 · 2 refs

A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.

VESTA: Visual Exploration with Statistical Tool Agents

cs.AI · 2026-05-29 · unverdicted · novelty 6.0

VESTA introduces dynamic tool creation for VLMs that outperforms static-tool and no-tool baselines on distribution fitting, time series, and astronomy tasks in the new DAWN benchmark.

QASM-Eval: A Dataset to Train and Evaluate LLMs on OpenQASM-3 Beyond Quantum Circuits

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

Introduces QASM-Eval, the first dataset targeting OpenQASM-3 hardware-facing features for LLM training and evaluation, with an extended verifier for syntax, states, and timelines.

citing papers explorer

Showing 4 of 4 citing papers.

Raiven: LLM-Based Visualization Authoring via Domain-Specific Language Mediation

cs.HC · 2026-04-11 · unverdicted · none · ref 18

Raiven mediates LLM visualization authoring via a formally defined DSL that unifies scientific and information visualization, producing deterministic, verifiable code from metadata-only inputs.

Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches

cs.AI · 2026-05-31 · unverdicted · none · ref 130 · 2 links

A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.

VESTA: Visual Exploration with Statistical Tool Agents

cs.AI · 2026-05-29 · unverdicted · none · ref 23

VESTA introduces dynamic tool creation for VLMs that outperforms static-tool and no-tool baselines on distribution fitting, time series, and astronomy tasks in the new DAWN benchmark.

QASM-Eval: A Dataset to Train and Evaluate LLMs on OpenQASM-3 Beyond Quantum Circuits

cs.LG · 2026-04-28 · unverdicted · none · ref 41

Introduces QASM-Eval, the first dataset targeting OpenQASM-3 hardware-facing features for LLM training and evaluation, with an extended verifier for syntax, states, and timelines.
