pith. sign in

arxiv: 1704.04368 · v2 · pith:OYNSHDWSnew · submitted 2017-04-14 · 💻 cs.CL

Get To The Point: Summarization with Pointer-Generator Networks

classification 💻 cs.CL
keywords summarizationtexttheyabstractivemodelmodelsnovelpointer-generator
0
0 comments X
read the original abstract

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition. We apply our model to the CNN / Daily Mail summarization task, outperforming the current abstractive state-of-the-art by at least 2 ROUGE points.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

    cs.LG 2026-05 unverdicted novelty 7.0

    An identification theorem shows that a randomized experiment and simulator together recover causal model values from confounded logs, with logs used only afterward to reduce estimation error.

  2. Multitask Prompted Training Enables Zero-Shot Task Generalization

    cs.LG 2021-10 conditional novelty 7.0

    Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.

  3. Learning to summarize from human feedback

    cs.CL 2020-09 conditional novelty 7.0

    Reinforcement learning on a reward model trained from human summary comparisons produces summaries humans prefer over supervised fine-tuning or human references on TL;DR and transfers to CNN/DM.

  4. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

    cs.CL 2019-10 accept novelty 7.0

    BART introduces a denoising pretraining method for seq2seq models that matches RoBERTa on GLUE and SQuAD while setting new state-of-the-art results on abstractive summarization, dialogue, and QA with up to 6 ROUGE gains.

  5. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

    cs.LG 2019-10 unverdicted novelty 7.0

    T5 casts all NLP tasks as text-to-text generation, systematically explores pre-training choices, and reaches strong performance on summarization, QA, classification and other tasks via large-scale training on the Colo...

  6. Fine-Tuning Language Models from Human Preferences

    cs.CL 2019-09 unverdicted novelty 7.0

    Language models fine-tuned via RL on 5k-60k human preference comparisons produce stylistically better text continuations and human-preferred summaries that sometimes copy input sentences.

  7. Enriching and Controlling Global Semantics for Text Summarization

    cs.CL 2021-09 unverdicted novelty 5.0

    A normalizing-flow neural topic model plus control mechanism are added to Transformer summarizers to supply and regulate global semantics, with reported gains over prior models on five benchmarks.

  8. Do Transformer Attention Heads Provide Transparency in Abstractive Summarization?

    cs.CL 2019-07 unverdicted novelty 5.0

    Analysis of transformer attention heads in abstractive summarization shows specialization in some heads and proposes a method to measure model reliance on learned attention distributions.

  9. Ranking sentences from product description & bullets for better search

    cs.IR 2019-07 unverdicted novelty 4.0

    Two RL-based extractive summarization models rank sentences from product fields by leveraging titles and click-through logs to improve search relevance.

  10. Saliency Maps Generation for Automatic Text Summarization

    cs.LG 2019-07 unverdicted novelty 4.0

    LRP saliency maps on a seq2seq summarization model sometimes reflect actual input feature usage and sometimes do not, requiring quantitative counterfactual validation.