Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input

Di He; Junliang Guo; Linli Xu; Tao Qin; Tie-Yan Liu; Xu Tan

arxiv: 1812.09664 · v1 · pith:FYWGW25Knew · submitted 2018-12-23 · 💻 cs.CL · cs.LG


Junliang Guo, Xu Tan, Di He, Tao Qin, Linli Xu, Tie-Yan Liu

classification: cs.CL, cs.LG

keywords: decoder, inputs, embeddings, models, tokens, translation, word, accuracy


Non-autoregressive translation (NAT) models, which remove the dependence on previous target tokens from the inputs of the decoder, achieve significant inference speedup but at the cost of inferior accuracy compared to autoregressive translation (AT) models. Previous work shows that the quality of the decoder inputs is important and strongly affects model accuracy. In this paper, we propose two methods to enhance the decoder inputs so as to improve NAT models. The first directly leverages a phrase table generated by conventional SMT approaches to translate source tokens into target tokens, which are then fed into the decoder as inputs. The second transforms source-side word embeddings into target-side word embeddings through sentence-level alignment and word-level adversarial learning, and then feeds the transformed word embeddings into the decoder as inputs. Experimental results show that our method substantially outperforms the NAT baseline (Gu et al., 2017) by 5.11 BLEU points on the WMT14 English-German task and 4.72 BLEU points on the WMT16 English-Romanian task.
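
As a rough illustration of the first method, the sketch below turns source tokens into target-side tokens with a phrase-table lookup, producing a sequence that could serve as decoder input. The abstract does not specify the lookup strategy, so the greedy longest-match scan, the toy `PHRASE_TABLE`, the `<unk>` fallback, and the function name `phrase_table_decoder_input` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy phrase table: source phrase -> target tokens.
# A real table would come from an SMT toolkit; these entries are made up.
PHRASE_TABLE: Dict[Tuple[str, ...], List[str]] = {
    ("thank", "you"): ["danke"],
    ("very", "much"): ["sehr"],
    ("good",): ["gut"],
}


def phrase_table_decoder_input(
    source: List[str],
    table: Dict[Tuple[str, ...], List[str]],
    max_phrase_len: int = 3,
    unk: str = "<unk>",
) -> List[str]:
    """Greedy longest-match lookup from left to right (an assumed strategy)."""
    out: List[str] = []
    i = 0
    while i < len(source):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_phrase_len, len(source) - i), 0, -1):
            key = tuple(source[i:i + n])
            if key in table:
                out.extend(table[key])
                i += n
                break
        else:
            # No phrase starting at position i is in the table.
            out.append(unk)
            i += 1
    return out


decoder_input = phrase_table_decoder_input(
    ["thank", "you", "very", "much"], PHRASE_TABLE
)
```

In an NAT model these tokens would then be embedded and fed to the decoder in place of a plain copy of the source embeddings; that step is omitted here.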


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
cs.CL 2019-06 unverdicted novelty 6.0

Reinforce-NAT and FS-decoder retrieve target sequential information for non-autoregressive translation, yielding higher BLEU than baseline NAT while preserving fast decoding and approaching autoregressive quality.