arXiv:2004.03607

Rowan Zellers, Ari Holtzman, Elizabeth Clark, Lianhui Qin, Ali Farhadi, Yejin Choi · 2020 · arXiv 2004.03607

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

Measuring Massive Multitask Language Understanding

cs.CY · 2020-09-07 · accept · novelty 8.0

Introduces the MMLU benchmark of 57 tasks and shows that current models, including GPT-3, achieve low accuracy far below expert level across academic and professional domains.

Help! Need Advice on Identifying Advice

cs.CL · 2020-10-06 · unverdicted · novelty 6.0

Introduces a new English dataset drawn from r/AskParents and r/needadvice and annotated for advice sentences, along with preliminary models showing that pre-trained LMs outperform rule-based systems, though the task remains challenging.

citing papers explorer

Measuring Massive Multitask Language Understanding cs.CY · 2020-09-07 · accept · none · ref 290
Help! Need Advice on Identifying Advice cs.CL · 2020-10-06 · unverdicted · none · ref 31
