OpenAI Codex (2025 generation), Glossary, Textbook of AI

OpenAI Codex (2025 generation) is OpenAI's family of agentic software-engineering products launched through 2025. The name reuses an earlier brand: the original Codex (2021) was a fine-tuned GPT-3 model trained on public code that powered the first version of GitHub Copilot. The 2025 Codex is a different beast: a cloud-based agent product built on top of o3-class reasoning models, with dedicated infrastructure for long-horizon coding tasks.

Components. The 2025 Codex line comprises several deployment surfaces:

Codex (cloud): a hosted environment in chatgpt.com/codex where the user describes a task, Codex spins up a sandboxed container with the repository checked out, and the agent works for minutes to hours, returning a pull request.
Codex CLI: a terminal-based local agent, comparable to Anthropic's Claude Code, that runs against the user's own machine.
Codex IDE extensions for VS Code and JetBrains IDEs that delegate longer tasks to the cloud agent while handling short edits locally.

Underlying models. Codex agents run over reasoning-trained models in the o3, GPT-5, and successor families, often a Codex-specialised variant fine-tuned on software-engineering trajectories. The agent issues tool calls for shell, file editing, web browsing and Git operations, and reasons explicitly between steps using thinking tokens.

Performance. On SWE-Bench Verified Codex (cloud) is among the leaders, posting scores in the 70%+ band through 2025. On the OpenAI internal "engineering tasks" benchmark and on customer-defined tasks, Codex is positioned as competitive with Claude Code and Devin.

Distinction from 2021 Codex. The earlier model was a single-shot autoregressive code completer with no notion of tools, agency or planning. It was retired in March 2023 in favour of GPT-3.5 / GPT-4. The 2025 Codex resurrected the name to mark OpenAI's return to a software-engineering-focused product, but the underlying technology shares almost nothing with its namesake: it is a reasoning-model-driven agent, not a code-completion model.

Position in the field. As of early 2026, the competitive landscape for cloud coding agents has three major frontier-lab entrants: Anthropic Claude Code, OpenAI Codex (2025) and Google Jules, plus Devin and a long tail of open-source agents (OpenHands, Aider, Plandex). Each has converged on a similar architecture: reasoning model, sandbox, file/Git/shell tools, optional browser, MCP for extensibility, and pull-request-shaped output. Differentiation is mostly in reliability on multi-hour tasks, integration with company workflows, and pricing.

The Codex story illustrates a broader 2025 trend: products are increasingly defined by the scaffolding and harness around the model rather than the model alone, even though the model remains the primary determinant of capability.

Discussed in:

Chapter 16: Ethics & Safety, Agents

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).