PDEAgent-Bench is the first multi-metric, multi-library benchmark for AI-generated PDE solvers, evaluating executability, numerical accuracy, and efficiency across DOLFINx, Firedrake, and deal.II.
arXiv preprint arXiv:2505.08783 , year=
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 5representative citing papers
A Creator-Inspector multi-agent LLM pipeline for constitutive artificial neural networks increases the rate of models satisfying all nine physical constraints to 100% or 56% depending on the LLM backbone.
LLMs prompted with domain knowledge can generate runnable, numerically valid code for stiff and non-stiff ODEs on new diagnostic and 1000-task benchmarks.
AutoPDE maintains an explicit solver strategy through PDE analysis, numerical method selection, and adaptive tuning, achieving 54.5% pass rate on PDE Agent Bench, 14.2 points above the strongest baseline.
A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.
citing papers explorer
-
Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches
A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.