arXiv: 1806.00920 · v3 · pith:AUB53BM7 · submitted 2018-06-04 · 💻 cs.CL

DRCD: a Chinese Machine Reading Comprehension Dataset

Chih Chieh Shao, Trois Liu, Yuting Lai, Yiying Tseng, Sam Tsai

classification 💻 cs.CL

keywords dataset · comprehension · reading · chinese · machine · drcd · score · achieves




In this paper, we introduce DRCD (Delta Reading Comprehension Dataset), an open-domain Traditional Chinese machine reading comprehension (MRC) dataset. The dataset is intended to serve as a standard Chinese MRC dataset that can be used as a source dataset in transfer learning. It contains 10,014 paragraphs from 2,108 Wikipedia articles and more than 30,000 questions generated by annotators. We build a baseline model that achieves an F1 score of 89.59%; human performance reaches an F1 score of 93.30%.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering
cs.SD · 2025-11 · unverdicted · novelty 5.0

CLSR is an end-to-end contrastive language-speech retriever using an intermediate text-like conversion step to improve retrieval of relevant segments from long audio for spoken question answering.