What it is
Codex is OpenAI's agentic harness — the runtime that wraps GPT-5-class models with planning loops, tool use, browser automation, and computer use. Originally pitched as a coding agent, it has grown into the general-purpose harness for any mission that requires "use this software until the task is finished."
Models it runs
- GPT-5.5 — released April 23, 2026. Unified text/image/audio/video architecture in a single model.
- GPT-5.5 Pro — same model with extended inference-time reasoning. The choice when accuracy matters more than latency.
- GPT-5.3 Codex — the previous coding flagship from February 2026.
- GPT-5 — the original GPT-5 base model.
What makes it distinct
- Unified multimodal. One call processes text, screenshots, audio, and video — no stitching together GPT + Whisper + Sora behind the agent.
- Computer use. Operates software, fills forms, drives a browser end-to-end. Strongest of the three harnesses on this dimension.
- Knowledge work. Researching online, analyzing data, creating documents and spreadsheets — moving across tools until the task is done.
- Web search built in. Real-time fact-checking without a separate connector.
Capabilities at a glance
- Sub-agents: yes — parent agents call `spawn_agent`, message child agents, and await completion.
- Skills: yes — system skills and per-project skills with per-tool approval modes.
- MCP servers: yes — both as client (`config.toml`, parallel tool calls toggle) and as server (`codex mcp-server` for other MCP clients to consume Codex tools).
- Hooks: partial — only a notify hook fires on turn completion; no full lifecycle event system.
- Slash commands: yes — `/apps`, `/exec`, `/sandbox`, `/mcp`, `/debug`, plus user-defined via skills and AGENTS.md.
- Permissions / sandboxing: three modes (`read-only`, `workspace-write`, `danger-full-access`); OS-level sandboxing via seatbelt (macOS), landlock (Linux), AppContainer (Windows).
- Plugins: yes — marketplace system with OpenAI-curated and bundled marketplaces; installable from remote sources.
- Multi-model: OpenAI Responses API, Amazon Bedrock, Ollama, and any OpenAI-compatible endpoint via `config.toml`.
- Sessions: persisted to disk via SQLite; `codex exec --ephemeral` for stateless runs; rollout-trace bundle for diagnostics.
- Surfaces: Ratatui TUI, headless CLI, web app, desktop app, IDE extensions (VS Code/Cursor/Windsurf), and an HTTP App Server.
- Headless / SDK: yes — `codex exec`, the App Server protocol, plus TypeScript and Python client SDKs.
- License: Apache 2.0; full Rust source on github.com/openai/codex.
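The client-side MCP and multi-model settings above live in `config.toml`. A minimal sketch of the shape such a file can take — the server name, command line, and provider values here are illustrative placeholders, so check the Codex configuration reference for the exact keys your version supports:

```toml
# ~/.codex/config.toml — illustrative sketch, not a complete reference

# Default model and sandbox mode for new sessions
model = "gpt-5.5"
sandbox_mode = "workspace-write"

# An MCP server Codex launches and connects to as a client.
# "media" and the command/args shown are hypothetical examples.
[mcp_servers.media]
command = "npx"
args = ["-y", "media-mcp-server"]

# Route requests to any OpenAI-compatible endpoint
# (here: a local Ollama instance).
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
```

Headless runs pick the same configuration up, so a stateless one-shot like `codex exec --ephemeral "summarize the open pull requests"` uses the model, sandbox mode, and MCP servers defined here.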
How TeamDay uses it
Selecting Codex as a TeamDay agent's harness unlocks every GPT-5-family model. The day OpenAI ships a new variant, the dropdown updates — no integration release on our side.
- Open an agent → Settings → Harness → Codex.
- Pick the model. Default is GPT-5.5 for most missions; Pro for high-accuracy work.
- Attach MCP servers — including the media MCP for visual generation.
- Run a mission. Codex agents play well alongside Claude Code agents on the same workspace.
When to pick Codex
- Mixed-modality missions — process a customer call recording, transcribe it, extract action items, and draft the follow-up email in a single run.
- Computer-use heavy work — drive software, fill forms, navigate dashboards.
- Knowledge work that ranges across tools (research → spreadsheet → document → email).
- Anything where you want the agent to watch a screen and react.
How it's benchmarked
Codex is evaluated on Terminal-Bench (tbench.ai) — the standard suite for measuring how well a model-plus-harness combination completes real terminal tasks end-to-end. The leaderboard tracks how each new GPT-5 release moves Codex's score on Terminal-Bench Pro and Verified.
When to pick something else
- Claude Code — for long-horizon coding and missions requiring self-verification.
- Gemini CLI — when you need 2M context or Google-stack integration.