pith. machine review for the scientific record.

arxiv: 1905.10847 · v1 · submitted 2019-05-26 · 💻 cs.CL · cs.AI · cs.IR

Recognition: unknown

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

Authors on Pith: no claims yet
classification 💻 cs.CL · cs.AI · cs.IR
keywords: reading comprehension · pointer-generator · training · curriculum · diverse · documents · enabling
0 comments
read the original abstract

This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. This can be interpreted as a form of domain randomization and/or generative pretraining during training. To this end, the usage of the Pointer-Generator softens the requirement of having the answer within the context, enabling us to construct diverse training samples for learning. Additionally, we propose a new Introspective Alignment Layer (IAL), which reasons over decomposed alignments using block-based self-attention. We evaluate our proposed method on the NarrativeQA reading comprehension benchmark, achieving state-of-the-art performance, improving existing baselines by $51\%$ relative improvement on BLEU-4 and $17\%$ relative improvement on Rouge-L. Extensive ablations confirm the effectiveness of our proposed IAL and CL components.
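The abstract does not spell out how the Introspective Alignment Layer's block-based self-attention works; the general idea behind block-based self-attention is to restrict attention to fixed-size local blocks so that cost grows linearly in sequence length rather than quadratically, which is what makes thousand-token narratives tractable. A minimal sketch of that generic idea (not the paper's exact layer), with `block_size` as an assumed hyperparameter and single-head, unprojected queries/keys/values for brevity:

```python
import numpy as np

def block_self_attention(x, block_size):
    """Self-attention restricted to non-overlapping blocks.

    x: (seq_len, d) array; seq_len is assumed divisible by block_size.
    Cost is O(seq_len * block_size * d) rather than O(seq_len^2 * d).
    """
    seq_len, d = x.shape
    out = np.empty_like(x)
    for start in range(0, seq_len, block_size):
        blk = x[start:start + block_size]              # (B, d) local block
        scores = blk @ blk.T / np.sqrt(d)              # (B, B) within-block scores
        scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the block
        out[start:start + block_size] = weights @ blk  # weighted sum of values
    return out

x = np.random.default_rng(0).standard_normal((8, 4))
y = block_self_attention(x, block_size=4)  # same shape as x, attention local to blocks of 4
```

Tokens in different blocks never attend to each other here; the paper's actual layer additionally reasons over decomposed alignments, which this sketch omits.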

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation

    cs.LG 2026-05 unverdicted novelty 7.0

    A new first-order algorithm for multi-task learning with shared linear representation achieves near-optimal error rates in constant iterations, improving existing methods by a factor of k.