The NazoNazo Benchmark: A Cost-Effective and Extensible Test of Insight-Based Reasoning in LLMs

Masaharu Mizumoto, Dat Nguyen, Zhiheng Han, Jiyuan Fang, Heyuan Guan, Xingfu Li · 2025 · arXiv 2509.14704

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

YOMI-Bench: A Benchmark for Evaluating Kanji Reading and Phonological Understanding of LLMs for Japanese

cs.CL · 2026-07-01 · unverdicted · novelty 6.0

YOMI-Bench is a new benchmark of four tasks for kanji reading and phonological understanding in LLMs, showing low performance even for Japanese-specific and commercial models.

citing papers explorer

YOMI-Bench: A Benchmark for Evaluating Kanji Reading and Phonological Understanding of LLMs for Japanese

cs.CL · 2026-07-01 · unverdicted · none · ref 7
