OpenAI Codex

Version: GPT-5.3-Codex era

OpenAI

Agent, coding-agent, cloud-sandbox, openai, parallel-tasks
Score: 82 (Strong)
About This Agent

OpenAI's coding agent, which spans a cloud agent that runs tasks in isolated containers and an open-source CLI. Users delegate parallel software tasks (features, fixes, PRs) to it. It is powered by GPT-5.3-Codex, and network access is disabled by default in the cloud.

Last Evaluated: June 10, 2026

Trust Vector Analysis

Dimension Breakdown

🚀 Performance & Reliability
task completion accuracy

Benchmark review and hands-on evaluation of PR-producing cloud tasks

Evidence
OpenAI: Introducing Codex
Cloud agent completes real-world engineering tasks ending in tested, citable PRs
GPT-5-Codex upgrade
GPT-5-Codex (now GPT-5.3-Codex) substantially improved long-task completion and code quality
high | Verified: 2026-06-10
tool use reliability

Testing of in-container command execution, editing, and test running across repositories

Evidence
Codex cloud documentation
Reliable shell, editor, and test-runner usage inside containers; setup scripts configure project dependencies
high | Verified: 2026-06-10
multi step planning

Long-horizon task evaluation from issue description to passing tests and PR

Evidence
OpenAI: Introducing Codex
Handles long-horizon tasks autonomously: reading codebases, implementing changes, running tests, and iterating until passing
high | Verified: 2026-06-10
memory persistence

Review of AGENTS.md guidance persistence and per-task container statelessness

Evidence
Codex documentation (AGENTS.md)
AGENTS.md files persist project conventions and instructions across tasks; cloud tasks are otherwise stateless per container
medium | Verified: 2026-06-10
error recovery

Observed recovery behavior from failing tests, build errors, and missing dependencies

Evidence
Codex cloud documentation
Iterates on failing tests and lint errors until passing; surfaces logs and citations when blocked
medium | Verified: 2026-06-10
agent collaboration

Assessment of parallel task fan-out and coordination model

Evidence
OpenAI: Introducing Codex
Many tasks run in parallel containers simultaneously, though without inter-agent handoff primitives
medium | Verified: 2026-06-10
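
The fan-out model described in this card can be pictured with a minimal sketch. It is an illustration only, not Codex's scheduler: `run_isolated_task`, `TaskResult`, and `fan_out` are hypothetical names, and a thread pool stands in for separate containers. The point it shows is that each task receives only its own description and nothing is passed between tasks.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    task: str
    status: str


def run_isolated_task(task: str) -> TaskResult:
    # Hypothetical stand-in for one task running in its own container.
    # It sees only its own description: no shared state, no handoff.
    return TaskResult(task=task, status="done")


def fan_out(tasks: list[str], max_parallel: int = 4) -> list[TaskResult]:
    # Submit every task independently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_isolated_task, tasks))


results = fan_out(["fix flaky test", "add pagination", "bump dependency"])
```

Because the tasks never communicate, any coordination between them (ordering, merging overlapping changes) stays with the person reviewing the resulting PRs.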
🛡️ Security
tool sandboxing

Review of container isolation, default-deny network policy, and CLI sandbox mechanisms

Evidence
Codex cloud documentation
Cloud tasks run in isolated containers with internet access disabled by default during execution; configurable domain allowlists
Codex CLI repository
CLI offers OS-level sandboxing (Seatbelt on macOS, Landlock/seccomp on Linux) with approval modes
high | Verified: 2026-06-10
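
As a rough illustration of what a default-deny network policy with a domain allowlist means, the sketch below checks an outbound URL against a set of permitted domains. This is a generic example, not Codex's implementation or configuration format: the function name `is_allowed` and the subdomain-matching rule are assumptions made for illustration.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowlist: frozenset[str] = frozenset()) -> bool:
    # Default-deny: with an empty allowlist, every outbound request is blocked.
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False
    # Assumed rule: permit an exact match or a subdomain of an allowlisted domain.
    return any(host == d or host.endswith("." + d) for d in allowlist)


# Hypothetical allowlist for a Python project that needs package downloads.
pypi_only = frozenset({"pypi.org", "pythonhosted.org"})
print(is_allowed("https://pypi.org/simple/requests/"))             # False
print(is_allowed("https://pypi.org/simple/requests/", pypi_only))  # True
```

Matching on a leading dot keeps a lookalike host such as `evilpypi.org` from passing as a subdomain of `pypi.org`.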
access control

Assessment of repository scoping, environment controls, and approval mode granularity

Evidence
Codex cloud documentation
Scoped GitHub repo access, per-environment configuration, and CLI approval modes (suggest/auto-edit/full-auto)
high | Verified: 2026-06-10
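
The three approval modes named above can be modeled as a small permission gate. The mode names come from this card; the mapping of modes to auto-approved action kinds (`read`, `edit`, `shell`) is an assumption for illustration and may not match the CLI's exact behavior, so treat this as a sketch rather than the CLI's logic.

```python
from enum import Enum


class ApprovalMode(Enum):
    SUGGEST = "suggest"
    AUTO_EDIT = "auto-edit"
    FULL_AUTO = "full-auto"


# Assumed mapping: action kinds each mode performs without asking the user.
AUTO_APPROVED = {
    ApprovalMode.SUGGEST: frozenset({"read"}),
    ApprovalMode.AUTO_EDIT: frozenset({"read", "edit"}),
    ApprovalMode.FULL_AUTO: frozenset({"read", "edit", "shell"}),
}


def needs_approval(mode: ApprovalMode, action: str) -> bool:
    # Fail closed: any action kind not listed for the mode requires approval.
    return action not in AUTO_APPROVED[mode]
```

Under this assumed mapping, `needs_approval(ApprovalMode.AUTO_EDIT, "shell")` is true while `needs_approval(ApprovalMode.AUTO_EDIT, "edit")` is false, which is the kind of granularity the assessment refers to.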
prompt injection defense

Review of network-isolation mitigations and model-level injection defenses

Evidence
Codex cloud documentation
Default-disabled internet during execution sharply limits exfiltration from injected instructions; agent-specific safety training applied
medium | Verified: 2026-06-10
data isolation

Architecture review of per-task container isolation and environment scoping

Evidence
Codex cloud documentation
Each task gets its own ephemeral container preloaded only with the target repository and configured environment
high | Verified: 2026-06-10
open source transparency

License and source availability review of CLI versus cloud service

Evidence
Codex CLI repository
Codex CLI is open source under Apache-2.0; the cloud agent service and models remain proprietary
high | Verified: 2026-06-10
🔒 Privacy & Compliance
data retention

Review of OpenAI retention and training policies across ChatGPT plan tiers

Evidence
OpenAI enterprise privacy
Business/Enterprise data excluded from training by default; consumer ChatGPT plan settings govern Codex task data
medium | Verified: 2026-06-10
gdpr compliance

Compliance certification and DPA availability review

Evidence
OpenAI Trust Portal
SOC 2 Type II and DPA available for business tiers covering Codex usage
medium | Verified: 2026-06-10
third party data sharing

Data flow analysis of repository access, GitHub integration, and network policy

Evidence
Codex cloud documentation
Repository code is processed by OpenAI; default-off internet prevents task-time data flows to other third parties
medium | Verified: 2026-06-10
local deployment option

Deployment options assessment of local CLI versus cloud-only agent and models

Evidence
Codex CLI repository
Open-source CLI runs locally and supports OpenAI-compatible endpoints, but flagship Codex models require OpenAI's cloud
high | Verified: 2026-06-10
👁️ Trust & Transparency
documentation quality

Documentation completeness review across cloud, CLI, and IDE surfaces

Evidence
Codex developer documentation
Dedicated developer docs covering cloud environments, CLI, IDE integration, AGENTS.md, and pricing
high | Verified: 2026-06-10
execution traceability

Review of task logs, test output citations, and diff provenance

Evidence
OpenAI: Introducing Codex
Tasks produce verifiable evidence: terminal logs, test outputs, and citations for every action taken
high | Verified: 2026-06-10
decision explainability

Assessment of task summaries, cited reasoning, and pre-merge review surfaces

Evidence
Codex cloud documentation
Agent explains its approach in task summaries with linked evidence before users accept changes
medium | Verified: 2026-06-10
open source code

Open source assessment weighting open CLI against proprietary cloud service

Evidence
Codex CLI repository
Apache-2.0 CLI with active public development; cloud agent and models closed
high | Verified: 2026-06-10
community activity

Community engagement analysis via GitHub activity and release cadence

Evidence
Codex CLI repository
Tens of thousands of stars, rapid release cadence, and a large contributor community since April 2025
high | Verified: 2026-06-10
⚙️ Operational Excellence
ease of integration

Setup and integration surface assessment across ChatGPT, CLI, IDE, and GitHub

Evidence
Codex developer documentation
Available in ChatGPT (web/mobile), as a CLI, as IDE extensions, and through GitHub integration with @codex mentions
high | Verified: 2026-06-10
scalability

Assessment of parallel container execution and plan-tier task throughput

Evidence
OpenAI: Introducing Codex
Cloud architecture runs many tasks in parallel isolated containers, enabling fleet-style delegation
high | Verified: 2026-06-10
cost predictability

Pricing model analysis of plan-based limits, Pro 5x tier, and typical-usage estimates

Evidence
Codex pricing documentation
Included in ChatGPT plans with a $100/mo Pro 5x tier added 2026-04-09; typical usage estimated ~$100-200/dev/month
high | Verified: 2026-06-10
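
To turn the per-developer estimate into a team budget, the arithmetic is a simple range multiplication. The sketch below uses the ~$100-200/dev/month figure quoted in this card as default bounds; the function name `monthly_cost_range` and the team size of 8 are illustrative assumptions, and actual spend depends on plan limits and usage.

```python
def monthly_cost_range(
    developers: int, low: float = 100.0, high: float = 200.0
) -> tuple[float, float]:
    # Scale the quoted per-developer monthly estimate to a whole team.
    if developers < 0:
        raise ValueError("developers must be non-negative")
    return developers * low, developers * high


# Hypothetical team of 8 developers.
low_total, high_total = monthly_cost_range(8)
print(f"${low_total:,.0f} to ${high_total:,.0f} per month")  # $800 to $1,600 per month
```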
monitoring capabilities

Review of task logs, usage visibility, and admin monitoring features

Evidence
Codex cloud documentation
Per-task logs, usage dashboards, and admin controls for business plans; deeper APM requires external tooling
medium | Verified: 2026-06-10
production readiness

Maturity assessment from rollout timeline, model upgrades, and enterprise availability

Evidence
Codex rollout history
Research preview 2025-05-16, ChatGPT Plus rollout 2025-06-03, GPT-5-Codex upgrades Sept 2025; now mature on GPT-5.3-Codex
high | Verified: 2026-06-10
Strengths
  • Strong isolation: per-task containers with internet disabled by default during execution
  • Runs many tasks in parallel for fleet-style software delegation
  • Verifiable outputs with terminal logs, test results, and action citations
  • Open-source Apache-2.0 CLI with local OS-level sandboxing
  • Deep GitHub integration from task to reviewed pull request
  • Continuously upgraded models, currently GPT-5.3-Codex
Limitations
  • Default network isolation can block tasks needing external dependencies unless allowlists are configured
  • Cloud agent and Codex models are proprietary with no self-hosted option
  • Plan-based limits are opaque; heavy users may need the $100/mo Pro 5x tier (~$100-200/dev/month typical)
  • Stateless per-task containers limit cross-task memory beyond AGENTS.md
  • Environment setup scripts add onboarding friction for complex monorepos
Metadata
license: Cloud agent proprietary; Codex CLI Apache-2.0 (github.com/openai/codex)
supported models:
  • GPT-5.3-Codex (current)
  • GPT-5-Codex (Sept 2025)
  • codex-1 (launch)
programming languages: Language-agnostic (any language in the repository)
deployment type: Managed cloud containers + local open-source CLI
tool support:
  • Container shell and editor
  • Test runners
  • GitHub PR integration
  • Configurable network allowlists
  • AGENTS.md project instructions
first release: CLI April 2025; cloud agent research preview 2025-05-16; ChatGPT Plus rollout 2025-06-03
pricing: Included in ChatGPT plans; $100/mo Pro 5x tier (added 2026-04-09); typical usage ~$100-200/dev/month
interfaces:
  • ChatGPT web and mobile
  • Codex CLI
  • IDE extensions
  • GitHub (@codex mentions)

Use Case Ratings

code generation

Core use case: parallel feature work, bug fixes, refactors, and PR generation with test evidence

data analysis

Capable of scripted analysis within containers, though network-off defaults limit live data access

research assistant

Strong at codebase Q&A and architecture exploration; not aimed at general web research

education

Cited logs and diffs make its work reviewable for learning, but it targets professional workflows