Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, & Yusuke Iwasawa (2022). Large Language Models are Zero-Shot Reasoners.
Advances in Neural Information Processing Systems 35.
URL: https://arxiv.org/abs/2205.11916
Abstract. Demonstrates that the chain-of-thought capability documented by Wei et al. can be elicited zero-shot, with no exemplars, simply by appending the phrase "Let's think step by step." to the prompt. On GSM8K and other reasoning benchmarks this single trigger lifts models from near-chance to near-CoT-prompted accuracy. The paper is the empirical foundation of zero-shot CoT and shaped the prompt-engineering literature that followed; it is also a clear demonstration that latent reasoning capability is already present in pre-trained models and merely needs to be unlocked.
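The paper's method is a two-stage prompt: the trigger phrase elicits a free-form reasoning trace, which is then re-fed with an answer-extraction cue so the final answer can be parsed. A minimal sketch of the prompt construction (function and parameter names are illustrative, not from the paper):

```python
def zero_shot_cot_prompts(question: str,
                          trigger: str = "Let's think step by step.") -> tuple[str, str]:
    """Build the two prompts of zero-shot CoT.

    Stage 1 asks the model to reason aloud; stage 2 appends the model's
    stage-1 completion plus an extraction cue so a short, parseable
    answer can be read off.
    """
    stage1 = f"Q: {question}\nA: {trigger}"
    # "{reasoning}" is a placeholder to be filled with the model's
    # stage-1 completion before the second call.
    stage2 = stage1 + " {reasoning}\nTherefore, the answer is"
    return stage1, stage2

# Example: a GSM8K-style arithmetic word problem.
s1, s2 = zero_shot_cot_prompts(
    "If I have 3 apples and eat 1, how many remain?")
```

The exact extraction cue varies by task in the paper (e.g. restricting the answer format); the sketch above fixes it to a generic phrase for brevity.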
Tags: language-models reasoning chain-of-thought