Random compressed coding with neurons
1 Pith paper cites this work. Polarity classification is still indexing.
fields: physics.comp-ph (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark
PhySciBench benchmark shows current AI models achieve at most 33.5% accuracy on physical science tasks; DelveAgent framework improves accuracy by up to 7.5 points and cuts costs to one-third.