MatSci-NLP: Evaluating scientific language models on materials science language tasks using text-to-schema modeling
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials
CrystalXRD-Bench is a new 250-sample benchmark for VLMs on XRD peak indexing, where the best model (GPT-5.4) reaches Jaccard 0.5888 and 37.6% exact match while most stay below 0.50, showing the task remains unsolved.