On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, Yoshua Bengio · 2014 · cs.CL · arXiv 1409.1259

40 Pith papers cite this work. Polarity classification is still indexing.

abstract

Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models: RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Session-based Recommendations with Recurrent Neural Networks

cs.LG · 2015-11-21 · conditional · novelty 8.0

RNNs with ranking loss outperform item-to-item baselines for session-based recommendations on two datasets.

Streaming Reinforcement Learning under Partial Observability with Real-Time Recurrent Learning

cs.LG · 2026-05-23 · unverdicted · novelty 7.0

Recurrent trace units enable exact RTRL with linear time/memory for streaming RL under partial observability, sustaining performance on long-chain memory tasks where TBPTT baselines collapse.

Weakly Supervised Cross-Modal Learning for 4D Radar Scene Flow Estimation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0 · 2 refs

A task-specific iterative framework for weakly supervised 4D radar scene flow estimation uses instance-aware self-supervised losses from 2D tracking/segmentation and a rigid static loss from odometry to outperform LiDAR-dependent cross-modal and fully supervised methods on the VoD dataset.

PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.

Reduced-Order Surrogates for Forced Flexible Mesh Coastal-Ocean Models

cs.CE · 2026-02-05 · unverdicted · novelty 7.0

Koopman autoencoders with forcings and temporal unrolling deliver accurate year-long predictions for coastal-ocean models at 300-1400x speedup, outperforming POD in two of three cases.

Recurrent Video Masked Autoencoders

cs.CV · 2025-12-15 · unverdicted · novelty 7.0

RVM uses recurrent computation inside a masked autoencoder to learn video representations that match or exceed prior video and image models on classification, tracking, and dense spatial tasks with up to 30x better parameter efficiency.

TCD-Arena: Assessing Robustness of Time Series Causal Discovery Methods Against Assumption Violations

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods and shows ensembles can boost overall robustness.

Estimation--Prediction Tradeoff in Causal Probabilistic Temporal Graphs

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

Characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery via Cramér-Rao bounds.

Conserved Kinematic Representations enable Zero-Shot Decoding in Handwriting BCIs

q-bio.NC · 2026-05-18 · unverdicted · novelty 6.0

A zero-shot machine learning decoder for handwriting BCIs achieves 64% hits@3 retrieval on unseen letters by exploiting conserved kinematic neural representations.

Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance

cs.LG · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

Stable-GFlowNet stabilizes GFN training for LLM red-teaming by eliminating Z estimation via pairwise comparisons and robust masking against noisy rewards while adding a fluency stabilizer.

M²RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling

cs.LG · 2026-03-15 · unverdicted · novelty 6.0

M²RNN achieves perfect state tracking at unseen lengths and outperforms Gated DeltaNet hybrids by 0.4-0.5 perplexity on 7B models with 3x smaller recurrent states.

SpectraLLM: Uncovering the Ability of LLMs for Molecular Structure Elucidation from Multi-Spectral Data

q-bio.QM · 2025-08-04 · unverdicted · novelty 6.0

SpectraLLM is an LLM fine-tuned to predict small-molecule structures from single or multiple spectra, reporting state-of-the-art results on four public benchmarks with gains from multi-modal input.

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory

cs.CV · 2023-08-16 · unverdicted · novelty 6.0

DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.

Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions

cs.CV · 2019-07-08 · unverdicted · novelty 6.0

A variational Bayesian framework exploits reciprocity between referents and context plus semantic reproduction to improve referring expression grounding over pairwise methods in supervised and unsupervised settings.

Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess

cs.LG · 2019-06-26 · unverdicted · novelty 6.0

Bidirectional RNN with attention models real-time user knowledge from question-response sequences to predict correctness, outperforming baselines especially for new users on a large TOEIC mobile app dataset.

Recurrent Adversarial Service Times

stat.ML · 2019-06-24 · unverdicted · novelty 6.0

RNN for arrivals paired with recurrent GAN for service times to model queuing dynamics without assuming specific inter-event distributions.

Learning Belief Representations for Imitation Learning in POMDPs

cs.LG · 2019-06-22 · unverdicted · novelty 6.0

BMIL learns belief modules jointly with policies for GAIL-style imitation learning in POMDPs, outperforming separate training and standard GAIL on continuous control tasks.

Neural architectures for resolving references in program code

cs.LG · 2026-04-15 · unverdicted · novelty 6.0

New seq2seq architectures for permutation indexing outperform baselines on synthetic reference-resolution tasks and reduce real decompilation error rates by 42%.

Leveraging Artist Catalogs for Cold-Start Music Recommendation

cs.IR · 2026-04-08 · unverdicted · novelty 6.0

ACARec attends over artist catalogs to generate CF embeddings for new tracks, more than doubling recall and NDCG versus content-only baselines in music recommendation.

Leveraging Multimodality for Real-Time Classification of Transients and Variables found by the Zwicky Transient Facility

astro-ph.IM · 2026-06-30 · unverdicted · novelty 5.0

ORACLE-2 multimodal classifiers raise macro F1 from 0.52-0.66 (light-curve only) to 0.73 on ZTF Bright Transient Survey data and reach 0.88 on simulated ELAsTiCC data.

Direct Advantage Estimation for Scalable and Sample-efficient Deep Reinforcement Learning

cs.LG · 2026-06-18 · unverdicted · novelty 5.0

Extends DAE theory to POMDPs with minimal changes and introduces discrete latent dynamics to cut computational cost, with ALE experiments showing scalability and retained sample efficiency.

Linear Recurrent Unit with Semantic Modulation for Image Super-Resolution

cs.CV · 2026-06-18 · unverdicted · novelty 5.0

Introduces an LRU-based network with semantic modulation that claims to outperform prior super-resolution methods at similar computational cost.

Probabilistic Verification of Recurrent Neural Networks for Single and Multi-Agent Reinforcement Learning

cs.AI · 2026-05-14 · unverdicted · novelty 5.0

RNN-ProVe uses policy-driven sampling and statistical error bounds to produce high-confidence probabilistic estimates of behavioral violations in RNN policies for single- and multi-agent POMDPs.

On Safer Reinforcement Learning for Sedation and Analgesia in Intensive Care

cs.LG · 2026-01-30 · unverdicted · novelty 5.0

Offline RL for ICU sedation shows that adding 30-day mortality to the objective yields policies whose clinician agreement correlates negatively with mortality, unlike pain-only versions.
