arXiv preprint arXiv:1707.07328

Adversarial Examples for Evaluating Reading Comprehension Systems · 2017 · arXiv 1707.07328

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Online Learning-to-Defer with Varying Experts

stat.ML · 2026-05-12 · unverdicted · novelty 8.0

Presents the first online learning-to-defer algorithm with regret bounds O((n + n_e) T^{2/3}) generally and O((n + n_e) sqrt(T)) under low noise for multiclass classification with varying experts.

Pushing the Boundaries of Multiple Choice Evaluation to One Hundred Options

cs.CL · 2026-04-16 · unverdicted · novelty 7.0

Scaling multiple-choice questions to 100 options on a Korean error detection task shows that LLM performance on conventional benchmarks overstates true competence due to shortcut strategies.

citing papers explorer

Showing 2 of 2 citing papers.

Online Learning-to-Defer with Varying Experts · stat.ML · 2026-05-12 · unverdicted · none · ref 80
Pushing the Boundaries of Multiple Choice Evaluation to One Hundred Options · cs.CL · 2026-04-16 · unverdicted · none · ref 2
