arXiv: 1506.03340 · v3 · submitted 2015-06-10 · cs.CL · cs.AI · cs.NE

Teaching Machines to Read and Comprehend

classification: cs.CL, cs.AI, cs.NE
keywords: documents, read, answer, language, large, machines, questions, reading

Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large scale supervised reading comprehension data. This allows us to develop a class of attention based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.

This paper has not been read by Pith yet.


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

    cs.CL 2017-05 accept novelty 8.0

    TriviaQA is a new large-scale dataset for reading comprehension that features complex compositional questions, high lexical variability, and cross-sentence reasoning requirements, where current baselines reach only 40...

  2. The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

    cs.LG 2026-05 unverdicted novelty 7.0

    An identification theorem shows that a randomized experiment and simulator together recover causal model values from confounded logs, with logs used only afterward to reduce estimation error.

  3. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

    cs.CL 2016-11 accept novelty 7.0

    MS MARCO is a new large-scale machine reading comprehension dataset built from real Bing search queries, human-generated answers, and web passages, supporting three tasks including answer synthesis and passage ranking.