People

Geoffrey Irving

1980s–, Computer scientist; AI safety researcher

Geoffrey Irving is an American computer scientist and one of the leading AI alignment researchers of the modern era. He worked at OpenAI on debate-based alignment proposals (AI Safety via Debate, 2018, with Paul Christiano and Dario Amodei), then at DeepMind on alignment for advanced systems, and since 2024 has been Chief Scientist of the UK AI Security Institute (founded as the AI Safety Institute in 2023, renamed in February 2025) at the Department for Science, Innovation and Technology.

Irving's research has covered scalable oversight (how to provide a useful training signal for AI tasks where humans cannot easily evaluate behaviour), debate-based alignment, mathematical software (he was a major contributor to Lean's mathematical library), and theorem proving. He has been a prominent voice in UK AI policy.

Related people: Paul Christiano

Works cited in this book:

  • AI safety via debate (2018) (with Paul Christiano, Dario Amodei)
  • Taxonomy of Risks posed by Language Models (2022) (with Laura Weidinger, Jonathan Uesato, Maribeth Rauh, Conor Griffin, Po-Sen Huang, John Mellor, Amelia Glaese, Myra Cheng, Borja Balle, Atoosa Kasirzadeh, Courtney Biles, Sasha Brown, Zac Kenton, Will Hawkins, Tom Stepleton, Abeba Birhane, Lisa Anne Hendricks, Laura Rimell, William Isaac, Julia Haas, Sean Legassick, Iason Gabriel)
