pith. machine review for the scientific record.

arxiv: 1409.1259 · v2 · submitted 2014-09-03 · 💻 cs.CL · stat.ML

Recognition: unknown

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

Authors on Pith: no claims yet
classification: 💻 cs.CL · stat.ML
keywords: neural · translation · machine · sentence · convolutional · decoder · encoder · gated
0 comments
read the original abstract

Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. Neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.
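
A minimal sketch of the two architectures the abstract compares, assuming PyTorch; the class names, hidden size, and single-layer GRUs are illustrative assumptions, not the authors' implementation:

import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Hypothetical RNN encoder-decoder: one fixed-length context vector."""
    def __init__(self, src_vocab, tgt_vocab, hidden=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encoder: compress the variable-length source sentence into a
        # fixed-length vector (the final hidden state).
        _, context = self.encoder(self.src_emb(src))
        # Decoder: generate the target conditioned only on that single
        # fixed-length vector -- the bottleneck the paper probes.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), context)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

The fixed-length context is the natural suspect for the degradation on long sentences that the abstract reports. The gated recursive convolutional network builds the same kind of fixed-length representation differently, by repeatedly merging adjacent nodes with a learned gate; one level of that merging might look like the following (again a hedged illustration under the same assumptions, not the paper's code):

import torch.nn.functional as F

class GrConvLevel(nn.Module):
    """Hypothetical gated recursive convolution over adjacent node pairs."""
    def __init__(self, hidden=256):
        super().__init__()
        self.combine = nn.Linear(2 * hidden, hidden)
        self.gate = nn.Linear(2 * hidden, 3)  # weights: (merge, left, right)

    def forward(self, h):                       # h: (batch, length, hidden)
        left, right = h[:, :-1], h[:, 1:]       # all adjacent pairs
        pair = torch.cat([left, right], dim=-1)
        merged = torch.tanh(self.combine(pair))
        w = F.softmax(self.gate(pair), dim=-1)  # convex gating weights
        # Each parent picks a mixture of "merge children", "copy left",
        # "copy right"; applying the level (length - 1) times collapses
        # the sentence to one vector and induces a soft binary parse,
        # which is how such a model can pick up grammatical structure.
        return w[..., 0:1] * merged + w[..., 1:2] * left + w[..., 2:3] * right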

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Session-based Recommendations with Recurrent Neural Networks

    cs.LG · 2015-11 · conditional · novelty 8.0

    RNNs with ranking loss outperform item-to-item baselines for session-based recommendations on two datasets.

  2. TCD-Arena: Assessing Robustness of Time Series Causal Discovery Methods Against Assumption Violations

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods, and shows that ensembles can boost overall robustness.

  3. Neural architectures for resolving references in program code

    cs.LG · 2026-04 · unverdicted · novelty 6.0

    New seq2seq architectures for permutation indexing outperform baselines on synthetic reference-resolution tasks and reduce real decompilation error rates by 42%.

  4. The illusory simplicity of the feedforward pass: evidence for the dynamical nature of stimulus encoding along the primate ventral stream

    q-bio.NC · 2026-04 · unverdicted · novelty 6.0

    The primate ventral stream encodes visual stimuli through evolving neural dynamics that carry category information beyond any fixed spatial pattern during the initial feedforward pass.

  5. Leveraging Artist Catalogs for Cold-Start Music Recommendation

    cs.IR · 2026-04 · unverdicted · novelty 6.0

    ACARec attends over artist catalogs to generate CF embeddings for new tracks, more than doubling recall and NDCG versus content-only baselines in music recommendation.

  6. EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

    cs.IR · 2026-05 · unverdicted · novelty 5.0

    EHR-RAGp is a retrieval-augmented EHR foundation model that employs prototype-guided retrieval to dynamically integrate relevant historical patient context, outperforming prior models on clinical prediction tasks.

  7. Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance

    cs.LG · 2026-05 · unverdicted · novelty 5.0

    Stable-GFlowNet improves training stability and attack diversity in LLM red-teaming by eliminating Z estimation via contrastive trajectory balance while preserving GFN optimality.

  8. Delta6: A Low-Cost, 6-DOF Force-Sensing Flexible End-Effector

    cs.RO · 2026-04 · unverdicted · novelty 5.0

    Delta6 delivers a low-cost 6-DOF force-sensing end-effector with 3.8% FS accuracy using sequence models, validated on robot-arm tasks like buffing and tight assembly.

  9. Large Language Models: A Survey

    cs.CL · 2024-02 · accept · novelty 3.0

    The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.