Reasoning or knowledge: Stratified evaluation of biomedical LLMs
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
IsoSci: A Benchmark of Isomorphic Cross-Domain Science Problems for Evaluating Reasoning versus Knowledge Retrieval in LLMs
ISOSCI benchmark finds 91.3% of reasoning-mode accuracy gains in LLMs on science problems depend on domain knowledge rather than invariant logical structure.