References

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, & Jeffrey Dean (2016)

arXiv:1609.08144.

URL: https://arxiv.org/abs/1609.08144

Abstract. The GNMT system paper. Describes Google Translate's first production neural machine translation system: a deep LSTM encoder-decoder with attention, wordpiece tokenisation, model parallelism across multiple GPUs, and a battery of inference-time tricks, including length normalisation and a coverage penalty. The 2016 deployment cut translation errors by 60% relative to the previous phrase-based system and is one of the canonical industrial deep-learning success stories. The length-normalisation and coverage-penalty formulae from this paper are widely cited as the standard recipes for beam-search decoding in sequence models.
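The beam-search rescoring recipe mentioned above can be sketched as follows. This is a minimal illustration of the paper's scoring function s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y), where lp(Y) = (5 + |Y|)^α / (5 + 1)^α and cp is β times the summed log of the (capped) per-source-position attention mass; the variable names and default values of α and β here are illustrative (the paper treats both as tuning parameters in [0, 1]).

```python
import math

def length_penalty(length, alpha=0.6):
    # lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha  (alpha=0.6 is an illustrative default)
    return ((5.0 + length) ** alpha) / ((5.0 + 1.0) ** alpha)

def coverage_penalty(attn_sums, beta=0.2):
    # attn_sums[i] = total attention weight received by source position i
    # cp(X; Y) = beta * sum_i log(min(attn_sums[i], 1.0))
    # Penalises hypotheses that leave source positions under-attended.
    return beta * sum(math.log(min(a, 1.0)) for a in attn_sums)

def gnmt_score(log_prob, length, attn_sums, alpha=0.6, beta=0.2):
    # s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y)
    return log_prob / length_penalty(length, alpha) + coverage_penalty(attn_sums, beta)
```

Dividing by lp(Y) rather than raw length keeps short and long hypotheses comparable without over-rewarding brevity, and the coverage term discourages translations that ignore parts of the source sentence.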

Tags: sequence-models machine-translation history