
arxiv: 1808.10006 · v2 · pith:5GXO22KVnew · submitted 2018-08-29 · 💻 cs.CL

Correcting Length Bias in Neural Machine Translation

classification 💻 cs.CL
keywords beam, translation, bias, correcting, machine, neural, problem, problems

We study two problems in neural machine translation (NMT). First, in beam search, whereas a wider beam should in principle help translation, it often hurts NMT. Second, NMT has a tendency to produce translations that are too short. Here, we argue that these problems are closely related and both rooted in label bias. We show that correcting the brevity problem almost eliminates the beam problem; we compare some commonly-used methods for doing this, finding that a simple per-word reward works well; and we introduce a simple and quick way to tune this reward using the perceptron algorithm.
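As a rough illustration of the per-word reward and perceptron-style tuning mentioned in the abstract, the sketch below rescores a fixed candidate list with `log-probability + gamma * length` and nudges `gamma` toward the reference length. The names `rescore` and `tune_gamma`, the learning rate, and the exact update rule are assumptions made for this example, not details taken from the paper.

```python
# Minimal sketch (assumed form): each candidate is (tokens, log_prob).
# A per-word reward gamma is added for every output token, which
# counteracts the model's preference for short translations.

def rescore(candidates, gamma):
    """Return the candidate maximizing log_prob + gamma * number of tokens."""
    return max(candidates, key=lambda c: c[1] + gamma * len(c[0]))


def tune_gamma(data, gamma=0.0, lr=0.1, epochs=10):
    """Perceptron-style tuning of gamma (hypothetical update rule).

    data is a list of (candidates, reference_length) pairs. If the chosen
    output is shorter than the reference, gamma grows; if it is longer,
    gamma shrinks; if the lengths match, gamma is left unchanged.
    """
    for _ in range(epochs):
        for candidates, ref_len in data:
            best_tokens, _ = rescore(candidates, gamma)
            gamma += lr * (ref_len - len(best_tokens))
    return gamma


if __name__ == "__main__":
    # Toy example: the short candidate has the higher raw log-probability.
    cands = [(["a"], -1.0), (["a", "b", "c"], -2.0)]
    gamma = tune_gamma([(cands, 3)])
    print(gamma, rescore(cands, gamma)[0])
```

In this toy setup the reward only has to grow until the longer candidate outscores the shorter one, after which the update is zero and `gamma` stops changing. In a real system the candidates would come from beam search rather than a fixed list.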



Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning Variable-Length Tokenization for Generative Recommendation

    cs.LG 2026-05 unverdicted novelty 7.0

    VarLenRec learns variable-length semantic IDs for generative recommendation by allocating longer codes to tail items via popularity-weighted information budget allocation, hyperbolic residual quantization, and a diffe...

  2. PARM: Pipeline-Adapted Reward Model

    cs.AI 2026-04 unverdicted novelty 6.0

    PARM adapts reward models to multi-stage LLM pipelines via pipeline data and direct preference optimization, improving execution rate and solving accuracy on optimization benchmarks and showing transfer to GSM8K.

  3. Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation

    cs.CL 2023-02 unverdicted novelty 6.0

    Semantic entropy improves uncertainty estimation in natural language generation by incorporating semantic equivalences, outperforming standard entropy baselines on predicting model accuracy for question answering.

  4. TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning

    cs.LG 2025-05 unverdicted novelty 5.0

    TokUR estimates token-level uncertainty via low-rank weight perturbations in LLMs, aggregates signals to correlate with correctness, and uses them to improve reasoning performance on math tasks.