HypoBench: Towards systematic and principled benchmarking for hypothesis generation

Haokun Liu, Sicong Huang, Jingyu Hu, Yangqiaoyu Zhou, Chenhao Tan · 2025 · arXiv 2504.11524

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

CREATE: Testing LLMs for Associative Creativity

cs.CL · 2026-03-10 · unverdicted · novelty 7.0

CREATE is a benchmark that scores LLMs on their ability to produce many specific and diverse associative paths between concepts drawn from parametric knowledge.

Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator

cs.DL · 2025-07-16 · unverdicted · novelty 4.0

The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.

citing papers explorer

Showing 2 of 2 citing papers.

CREATE: Testing LLMs for Associative Creativity · cs.CL · 2026-03-10 · unverdicted · none · ref 2
Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator · cs.DL · 2025-07-16 · unverdicted · none · ref 100
