pith. machine review for the scientific record. sign in

arxiv: 1704.04368 · v2 · submitted 2017-04-14 · 💻 cs.CL

Recognition: unknown

Get To The Point: Summarization with Pointer-Generator Networks

Abigail See, Christopher D. Manning, Peter J. Liu

Authors on Pith no claims yet
classification 💻 cs.CL
keywords summarizationtexttheyabstractivemodelmodelsnovelpointer-generator
0
0 comments X
read the original abstract

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition. We apply our model to the CNN / Daily Mail summarization task, outperforming the current abstractive state-of-the-art by at least 2 ROUGE points.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

    cs.LG 2026-05 unverdicted novelty 7.0

    An identification theorem shows that a randomized experiment and simulator together recover causal model values from confounded logs, with logs used only afterward to reduce estimation error.

  2. Multitask Prompted Training Enables Zero-Shot Task Generalization

    cs.LG 2021-10 conditional novelty 7.0

    Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.

  3. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

    cs.CL 2019-10 accept novelty 7.0

    BART introduces a denoising pretraining method for seq2seq models that matches RoBERTa on GLUE and SQuAD while setting new state-of-the-art results on abstractive summarization, dialogue, and QA with up to 6 ROUGE gains.

  4. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

    cs.LG 2019-10 unverdicted novelty 7.0

    T5 casts all NLP tasks as text-to-text generation, systematically explores pre-training choices, and reaches strong performance on summarization, QA, classification and other tasks via large-scale training on the Colo...

  5. Fine-Tuning Language Models from Human Preferences

    cs.CL 2019-09 unverdicted novelty 7.0

    Language models fine-tuned via RL on 5k-60k human preference comparisons produce stylistically better text continuations and human-preferred summaries that sometimes copy input sentences.