Samy Bengio, Oriol Vinyals, Navdeep Jaitly, & Noam Shazeer (2015)
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks.
Advances in Neural Information Processing Systems 28.
URL: https://arxiv.org/abs/1506.03099
Abstract. Identifies the exposure-bias problem in teacher-forced sequence models: at training time the model conditions on ground-truth tokens, but at inference time it must condition on its own previous predictions, a train/test mismatch whose errors compound across long sequences. Proposes scheduled sampling: at each decoding step during training, feed the model's own previous prediction instead of the ground-truth token with a probability that is gradually increased over training (equivalently, the teacher-forcing probability is decayed, e.g. linearly, exponentially, or with an inverse-sigmoid schedule). Shows improved BLEU on image captioning and improved F1 on constituency parsing.
Tags: sequence-models rnn training
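The sampling mechanism can be sketched in a few lines. This is a minimal illustration, not the paper's code: `inverse_sigmoid_schedule` follows the inverse-sigmoid decay form described in the paper, while `scheduled_inputs` and its arguments (`gold_tokens`, `predicted_tokens`) are hypothetical placeholders standing in for a real decoder loop, where the substituted token would come from the model's previous-step prediction.

```python
import math
import random

def inverse_sigmoid_schedule(step, k=100.0):
    """Probability of feeding the ground-truth token at a training step.
    Inverse-sigmoid decay: starts near 1 (pure teacher forcing) and
    decays toward 0 (model conditions on its own predictions)."""
    return k / (k + math.exp(step / k))

def scheduled_inputs(gold_tokens, predicted_tokens, step, k=100.0, rng=random):
    """For each position, keep the gold token with probability eps(step);
    otherwise substitute the model's own previous prediction.
    `predicted_tokens` stands in for the decoder's step-(t-1) outputs."""
    eps = inverse_sigmoid_schedule(step, k)
    return [gold if rng.random() < eps else pred
            for gold, pred in zip(gold_tokens, predicted_tokens)]
```

Early in training `eps` is close to 1, so the decoder sees mostly gold tokens; late in training it is close to 0, so the decoder is trained under roughly the same conditions it faces at inference time.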