INCARBench evaluates 19 LLMs on VASP INCAR configuration generation and repair, showing high semantic accuracy but lower scientific correctness, especially for DFT+U, magnetism, and correlated materials.
Rand, and Adji Bousso Dieng
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cond-mat.mtrl-sci (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
INCARBench: A Benchmark for Scientific Configuration in VASP INCAR by Large Language Models