codex

Specify platform support in AGENTS.md (#27966 )

Codex sometimes does surprising things with `cfg` attributes, so it
seems worth giving it guidance about how broadly our Rust needs to
work.

This adds a very brief section to AGENTS.md explaining that we target
the major desktop OSes and that we want the vast majority of our logic
to be portable across them.
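
One way to read this guidance, sketched under assumptions: confine `cfg` to a small platform-specific leaf and keep the surrounding logic portable. The function names below are hypothetical and do not come from the repository.

```rust
/// Platform-specific leaf: the only item that needs a `cfg` attribute.
#[cfg(windows)]
fn shell_program() -> &'static str {
    "cmd.exe"
}

#[cfg(not(windows))]
fn shell_program() -> &'static str {
    "/bin/sh"
}

/// Portable logic: builds the same three-element argv on every desktop OS.
fn shell_argv(script: &str) -> Vec<String> {
    // `cfg!` evaluates to a compile-time boolean, so both branches type-check
    // on every platform.
    let flag = if cfg!(windows) { "/C" } else { "-c" };
    vec![
        shell_program().to_string(),
        flag.to_string(),
        script.to_string(),
    ]
}
```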

Adam Perry @ OpenAI · 2026-06-12 23:25:58 +00:00

fdd72e9cd9

[codex] Add crate API surface review rule (#27939 )

## Why

Review guidance should explicitly discourage widening crate APIs for
testing convenience. Keeping those boundaries narrow reduces accidental
coupling and prevents one-off test utilities from becoming durable
public surface area.

## What

- Add a crate API surface rule to `AGENTS.md`.
- Ask reviewers to keep crate APIs small and avoid proliferating
test-only helpers.

## Test plan

- Not run (documentation-only change).

pakrym-oai · 2026-06-12 13:00:21 -07:00

2ee854eaa5

lint: allow self-documenting builder arguments (#27507 )

Builder-style setters often repeat the setting name in both the method
and its sole argument. Calls such as `.enabled(false)` are already
self-documenting, so requiring `/*enabled*/` adds noise without
clarifying the call.

## What changed

- Exempt a method's sole non-self argument when its resolved parameter
name matches the method name.
- Continue validating any explicit argument comment against the resolved
parameter name.
- Continue requiring comments when method and parameter names differ or
when a method has multiple non-self arguments.
- Document the exception in `AGENTS.md` and the lint's own behavior
documentation.

## Examples

Before this change we'd need redundant comments like this:

```rust
builder.enabled(/*enabled*/ false);
builder.retry_count(/*retry_count*/ 3);
builder.base_url(/*base_url*/ None);
```

Now these can be written like this:

```rust
builder.enabled(false);
builder.retry_count(3);
builder.base_url(None);
```

Still disallowed:

```rust
client.set_flag(true); // Method name does not match parameter `enabled`.
options.enabled(false, /*retry_count*/ 3); // More than one non-self argument.
options.enabled(/*value*/ false); // Explicit comment does not match `enabled`.
```
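
The exemption can be modelled as a small predicate. This is a simplified sketch of the rule as described here, not the lint's implementation; both function names are hypothetical, and parameter names are passed in as plain strings rather than resolved from the compiler.

```rust
/// Returns true when an uncommented argument still needs a `/*param*/`
/// comment under the rule described above.
fn needs_argument_comment(method_name: &str, non_self_params: &[&str]) -> bool {
    match non_self_params {
        // No arguments: nothing to comment.
        [] => false,
        // Sole argument: exempt only when the parameter name matches the method name.
        [only] => *only != method_name,
        // Multiple non-self arguments: the exemption never applies.
        _ => true,
    }
}

/// An explicit comment is always validated against the resolved parameter name.
fn explicit_comment_is_valid(comment: &str, param_name: &str) -> bool {
    comment == param_name
}
```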

## Validation

Added UI coverage for boolean, numeric, and `None` builder arguments,
multi-argument methods, and explicit comment mismatches. Ran `rustup run
nightly-2025-09-18 cargo test` in `tools/argument-comment-lint`.

Adam Perry @ OpenAI · 2026-06-11 10:24:42 -07:00

52db447c77

Remove just bench-smoke from just test. (#26716 )

## Why

`just test` should run the test suite without also compiling and
executing benchmark smoke tests. Keeping benchmark validation explicit
avoids adding unrelated work to every project-specific test invocation.

## What changed

- Remove the `just bench-smoke` step from the Unix and Windows `test`
recipes.
- Document `just bench` and `just bench-smoke` as the explicit benchmark
commands in `AGENTS.md`.

## Validation

- `just test -p codex-arg0`
- `just --dry-run test`
- `just --dry-run bench-smoke`

Adam Perry @ OpenAI · 2026-06-05 18:53:12 -07:00

3ea9e98333

[codex] Add /usr/bin/bash shell fallback (#26538 )

## Why

Some Linux environments expose `bash` at `/usr/bin/bash` instead of
`/bin/bash`. The shell detection fallback list should cover both
standard locations once PATH/user-shell probing fails.

Stacked on #26480.

## What changed

- Add `/usr/bin/bash` to the bash fallback path list in
`codex-shell-command`.
- Extend shell type detection coverage for `/usr/bin/bash`.
- Add AGENTS.md testing guidance to avoid tests for statically defined
values and negative tests for removed logic.
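
A minimal sketch of an ordered fallback list, assuming the two paths are probed in the order shown. The constant and function names are hypothetical and are not the ones used in `codex-shell-command`.

```rust
/// Fallback locations named above. The ordering is an assumption.
const BASH_FALLBACK_PATHS: [&str; 2] = ["/bin/bash", "/usr/bin/bash"];

/// Returns the first candidate for which `exists` reports true.
/// The existence check is injected so the probing order can be exercised
/// without depending on the host filesystem.
fn first_existing<'a>(
    candidates: &[&'a str],
    exists: impl Fn(&str) -> bool,
) -> Option<&'a str> {
    candidates.iter().copied().find(|path| exists(*path))
}
```

In real use the predicate could be `|p| std::path::Path::new(p).exists()`.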

## Verification

- `just test -p codex-shell-command`

pakrym-oai · 2026-06-05 09:38:26 -07:00

9ddb1de633

Move code review rules into AGENTS (#25738 )

## Why
Codex Review now supports repository-specific review rules in AGENTS.md.
Adding the review prompts there makes the guidance available as
repository review rules next to the code it governs while keeping the
existing local review skills intact.

## What changed
- Added a `## Code Review Rules` section to `AGENTS.md` with the
existing review prompts for model context, breaking changes, test
authoring, and change size.
- Preserved the existing `.codex/skills/code-review*` skill files.

## Verification
- `git diff --check origin/main...HEAD`

pakrym-oai · 2026-06-02 01:41:04 +00:00

c955f73078

Add Python version compatibility guidance (#25690 )

## Why

Python contributions in this repository should target the declared
Python 3 runtime instead of carrying Python 2 compatibility patterns
forward. When compatibility across Python 3 point releases matters,
contributors need a consistent source of truth for the minimum supported
version.

## What changed

- Added Python development guidance to `AGENTS.md` stating that the
repository uses Python 3+ and should not use the `__future__` module.
- Documented that contributors should check the nearest `pyproject.toml`
`requires-python` field when evaluating Python 3 point-release
compatibility.

## Testing

Not run (guidance-only change).

Adam Perry @ OpenAI · 2026-06-01 14:05:54 -07:00

433ac84102

[codex] document out-of-line test module convention (#25682 )

## Why

New unit test modules should follow one consistent layout so
implementation files stay focused and test suites remain easy to locate,
without creating cleanup churn in existing inline test modules.

## What changed

- Added `AGENTS.md` guidance requiring new test modules to use separate
sibling `*_tests.rs` files with an explicit `#[path = "..._tests.rs"]`
attribute.
- Clarified that existing inline `#[cfg(test)] mod tests { ... }`
modules should not be moved solely to follow the new convention.

## Validation

- Ran `git diff --check`.

Adam Perry @ OpenAI · 2026-06-01 13:36:16 -07:00

a29a5b0861

Check root Python script formatting in CI (#25165 )

## Why

Python files under `scripts/` were not covered by the repository
formatting recipe or the CI formatting job, so formatting drift could
merge unnoticed.

## What

- Add a dedicated `scripts/pyproject.toml` and `scripts/uv.lock` so
root-script formatting uses a locked Ruff version.
- Extend `just fmt` to format root Python scripts and add
`fmt-scripts-check` for CI.
- Run `just fmt-scripts-check` from `.github/workflows/ci.yml`,
installing `uv` through SHA-pinned `astral-sh/setup-uv` while retaining
the `uv` `0.11.3` pin.
- Apply Ruff formatting to the root Python scripts, including
`scripts/just-shell.py`, and extend
`sdk/python/tests/test_artifact_workflow_and_binaries.py` to cover the
root formatting recipe.
- Update `AGENTS.md` so agents run `just fmt` after code changes
anywhere in the repository.

## Validation

- Extended the existing Python SDK workflow test to assert that `just
fmt` includes root Python scripts.

Adam Perry @ OpenAI · 2026-06-01 18:50:23 +00:00

281b416c44

[codex] Remove external client session reset plumbing (#24157 )

## Why

The turn loop no longer needs to decide when a `ModelClientSession`
should reset its websocket state after compaction. That reset behavior
belongs inside the model client, where the websocket cache and retry
state are owned. The repo guidance now calls this out explicitly so
future changes let the incremental request logic decide whether the
previous request can be reused.

## What Changed

- Removed the `reset_client_session` return value from pre-sampling and
auto-compact helpers in `core/src/session/turn.rs`.
- Changed compaction helpers to return `CodexResult<()>` so callers only
handle success or failure.
- Made `ModelClientSession::reset_websocket_session` private to
`core/src/client.rs`, leaving it callable only from model-client
internals.
- Added `AGENTS.md` guidance not to call `reset_client_session`
unnecessarily.

## Validation

- `just test -p codex-core session::turn`

pakrym-oai · 2026-05-22 16:46:25 -07:00

6ad3a83509

Prefer just test over cargo test in docs (#23910 )

`cargo test` for the core and other crates fails on a fresh macOS
checkout without the right stack size variable. This change encourages
using the just test command that sets the environment up correctly.

As a bonus, this should encourage agents to get more benefit out of
nextest's parallel execution.

anp-oai · 2026-05-22 16:58:14 +00:00

d53e68954a

Clarify docs folder guidance in AGENTS.md (#21772 )

## Summary

Codex keeps trying to add documentation to the `docs/` directory. With
the exception of app server API documentation, the docs for Codex should
not live in this repo. We don't want the local `docs/` folder to become
a stale shadow of the official docs.

This PR updates `AGENTS.md` to make that boundary explicit and scopes
the existing API documentation guidance to app-server docs/examples. It
also removes the extra `docs/config.md` sections that were recently
added.

Eric Traut · 2026-05-08 10:11:57 -07:00

0a0d09ad21

Ensure all mentions of cargo-install are --locked (#21592 )

There's already a preference for this in the codebase, but a few
mentions have drifted away. Generally `--locked` is preferred to reduce
exposure to supply-chain attacks (and just generally improve
reproducibility).

In an ideal world these dependencies would maybe even be pinned to
versions but Cargo is kinda bad at that for devtools. Still better to
use --locked than not.

Aria Desires · 2026-05-07 15:30:37 -07:00

80a8563e48

docs: discourage #[async_trait] and #[allow(async_fn_in_trait)] (#20242 )

## Why

We have run into two avoidable problems when introducing async trait
APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this
repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait
without spelling out whether the returned future is `Send`, which hides
an important part of the trait contract.

We already have a good example of the preferred alternative in
[#16630](https://github.com/openai/codex/pull/16630) /
[`3c7f013f9735`](https://github.com/openai/codex/commit/3c7f013f9735),
but that guidance currently lives only as prior art in the codebase.
This PR documents the rule in `AGENTS.md` so contributors are more
likely to follow the native RPITIT pattern before these two shortcuts
spread further.

## What Changed

- added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]`
and `#[allow(async_fn_in_trait)]`
- pointed contributors to the native RPITIT pattern with explicit `Send`
bounds on the returned future
- clarified that implementations may still use `async fn` when they
satisfy that trait contract
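
For illustration, here is a minimal sketch of the pattern described above: the trait spells out `Send` on the returned future, and the implementation still uses `async fn`. The trait and type names are hypothetical, and `block_on` is a toy busy-polling executor built only from the standard library so the sketch needs no dependencies.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// The trait states the `Send` bound on the returned future explicitly.
trait TokenSource {
    fn token(&self) -> impl Future<Output = String> + Send;
}

struct StaticToken(String);

impl TokenSource for StaticToken {
    // The impl may use `async fn` because its future satisfies the contract.
    async fn token(&self) -> String {
        self.0.clone()
    }
}

/// Compile-time check that a value is `Send`.
fn assert_send<T: Send>(value: T) -> T {
    value
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop until the future is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```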

## Verification

- docs-only change; no tests run

Michael Bolin · 2026-04-29 15:29:29 -07:00

b1546008fc

feat(tui): add clear-context plan implementation (#17499 )

## TL;DR

- Adds a second Plan Mode handoff: implement the approved plan after
clearing context.
- Keeps the existing same-thread `Yes, implement this plan` action
unchanged.
- Reuses the `/clear` thread-start path and submits the approved plan as
the fresh thread's first prompt.
- Covers the new popup option, event plumbing, initial-message behavior,
and disabled states in TUI tests.

## Problem

Plan Mode already asks whether to implement an approved plan, but the
only affirmative path continues in the same thread. That is useful when
the planning conversation itself is still valuable, but it does not
support the workflow where exploratory planning context is discarded and
implementation starts from the final approved plan as the only
model-visible handoff.

<img width="1253" height="869" alt="image"
src="https://github.com/user-attachments/assets/90023d75-c330-4919-bed8-518671c3474b"
/>

## Mental model

There are now two implementation choices after a proposed plan. The
existing choice, `Yes, implement this plan`, is unchanged: it switches
to Default mode and submits `Implement the plan.` in the current thread.
The new choice, `Yes, clear context and implement`, treats the proposed
plan as a handoff artifact. It clears the UI/session context through the
same thread-start source used by `/clear`, then submits an initial
prompt containing the approved plan after the fresh thread is
configured.

The important distinction is that the new path is not compaction. The
model receives a deliberate implementation prompt built from the
approved plan markdown, not a summary of the previous planning
transcript. Both implementation choices require the Default
collaboration preset to be available, so the popup does not offer a
coding handoff when the fresh thread would fall back to another mode.

## Non-goals

This change does not alter `/clear`, `/compact`, or the existing
same-context Plan Mode implementation option. It does not add protocol
surface area or app-server schema changes. It also does not carry the
previous transcript path or a generated planning summary into the new
model context.

## Tradeoffs

The fresh-context option relies on the approved plan being sufficiently
complete. That matches the Plan Mode contract, but it means vague plans
will produce weaker implementation starts than a compacted transcript
would. The upside is that rejected ideas, exploratory dead ends, and
planning corrections do not leak into the implementation turn.

The current implementation stores the latest proposed plan in
`ChatWidget` rather than deriving it from history cells at selection
time. This keeps the popup action simple and deterministic, but it makes
the cache lifecycle important: it must be reset when a new task starts
so an old plan cannot be submitted later.

## Architecture

The TUI stores the most recent completed proposed-plan markdown when a
plan item completes. The Plan Mode approval popup uses that cache to
enable the fresh-context option and to build a first-turn prompt that
instructs the model to implement the approved plan in a fresh context.
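
The cache lifecycle can be sketched as follows. This is an illustrative model only: `PlanCache`, its methods, and the prompt wording are hypothetical, and the real state lives in `ChatWidget`.

```rust
/// Holds the most recent completed proposed-plan markdown, if any.
#[derive(Default)]
struct PlanCache {
    latest_plan: Option<String>,
}

impl PlanCache {
    /// Called when a plan item completes.
    fn on_plan_completed(&mut self, markdown: &str) {
        self.latest_plan = Some(markdown.to_string());
    }

    /// Called when a new task starts, so a stale plan cannot be submitted later.
    fn on_task_started(&mut self) {
        self.latest_plan = None;
    }

    /// Returns the first-turn prompt for the fresh thread, or `None` when the
    /// clear-context option should be disabled.
    fn clear_context_prompt(&self, default_mode_available: bool) -> Option<String> {
        if !default_mode_available {
            return None;
        }
        self.latest_plan
            .as_ref()
            .map(|plan| format!("Implement the approved plan below.\n\n{plan}"))
    }
}
```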

Selecting the new option emits a TUI-internal
`ClearUiAndSubmitUserMessage` event. `App` handles that event by reusing
the existing clear flow: clear terminal state, reset app UI state, start
a new app-server thread with `ThreadStartSource::Clear`, and attach a
replacement `ChatWidget` with an initial user message. The existing
initial-message suppression in `enqueue_primary_thread_session` ensures
the prompt is submitted only after the new session is configured and any
startup replay is rendered.

## Observability

The previous thread remains resumable through the existing clear-session
summary hint. There is no new telemetry or protocol event for this path,
so debugging should start at the TUI event boundary: confirm the popup
emitted `ClearUiAndSubmitUserMessage`, confirm the app-server thread
start used `ThreadStartSource::Clear`, then confirm the fresh widget
submitted the initial user message after `SessionConfigured`.

## Tests

The Plan Mode popup snapshots cover the new option and preserve the
original option as the first/default action. Unit coverage verifies the
original same-context option still emits `SubmitUserMessageWithMode`,
the new option emits `ClearUiAndSubmitUserMessage` with the approved
plan embedded verbatim, and the clear-context option is disabled when
Default mode is unavailable or no approved plan exists. The broader
`codex-tui` test package passes with the updated fresh-thread
initial-message plumbing.

Felipe Coury · 2026-04-17 14:30:09 -03:00

d3692b14c9

feat: add Codex Apps sediment file remapping (#15197 )

## Summary
- bridge Codex Apps tools that declare `_meta["openai/fileParams"]`
through the OpenAI file upload flow
- mask those file params in model-visible tool schemas so the model
provides absolute local file paths instead of raw file payload objects
- rewrite those local file path arguments client-side into
`ProvidedFilePayload`-shaped objects before the normal MCP tool call

## Details
- applies to scalar and array file params declared in
`openai/fileParams`
- Codex uploads local files directly to the backend and uses the
uploaded file metadata to build the MCP tool arguments locally
- this PR is input-only

## Verification
- `just fmt`
- `cargo test -p codex-core mcp_tool_call -- --nocapture`

---------

Co-authored-by: Codex <noreply@openai.com>

Casey Chow · 2026-04-09 14:10:44 -04:00

244b15c95d

[codex] reduce module visibility (#16978 )

## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md
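
A minimal sketch of the shape this guideline describes, using hypothetical module and item names: the module itself stays private, and the crate root re-exports only the intended public API.

```rust
// Private module: external callers cannot reach through `parser::...`.
mod parser {
    pub struct Token {
        pub text: String,
    }

    pub fn tokenize(input: &str) -> Vec<Token> {
        input
            .split_whitespace()
            .map(|word| Token { text: normalize(word) })
            .collect()
    }

    // Crate-private helper: visible inside this crate, never exported.
    pub(crate) fn normalize(word: &str) -> String {
        word.to_lowercase()
    }
}

// Explicit crate-root exports define the public surface.
pub use parser::{tokenize, Token};
```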

## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed

pakrym-oai · 2026-04-07 08:03:35 -07:00

413c1e1fdf

docs: update argument_comment_lint instructions in AGENTS.md (#16375 )

I noticed that Codex was spending more time running this lint check
locally than I would like. Now that we have the linter running
cross-platform using Bazel in CI, I find it's better to update the PR
ASAP to get CI going than to wait for `just argument-comment-lint` to
finish locally before updating the PR.

Michael Bolin · 2026-04-01 15:44:34 +00:00

5cca5c0093

Refactor external auth to use a single trait (#16356 )

## Summary
- Replace the separate external auth enum and refresher trait with a
single `ExternalAuth` trait in login auth flow
- Move bearer token auth behind `BearerTokenRefresher` and update
`AuthManager` and app-server wiring to use the generic external auth API

Eric Traut · 2026-03-31 14:54:18 -06:00

103acdfb06

Rename tui_app_server to tui (#16104 )

This is a follow-up to https://github.com/openai/codex/pull/15922. That
previous PR deleted the old `tui` directory and left the new
`tui_app_server` directory in place. This PR renames `tui_app_server` to
`tui` and fixes up all references.

Eric Traut · 2026-03-28 11:23:07 -06:00

61429a6c10

Remove the legacy TUI split (#15922 )

This is part 1 of 2 PRs that will delete the `tui` /
`tui_app_server` split. This part simply deletes the existing `tui`
directory and marks the `tui_app_server` feature flag as removed. I left
the `tui_app_server` feature flag in place for now so its presence
doesn't result in an error. It is simply ignored.

Part 2 will rename the `tui_app_server` directory to `tui`. I did this
as two parts to reduce visible code churn.

Eric Traut · 2026-03-27 22:56:44 +00:00

d65deec617

docs: update AGENTS.md to discourage adding code to codex-core (#15910 )

## Why

`codex-core` is already the largest crate in `codex-rs`, so defaulting
to it for new functionality makes it harder to keep the workspace
modular. The repo guidance should make it explicit that contributors are
expected to look for an existing non-`codex-core` crate, or introduce a
new crate, before growing `codex-core` further.

## What Changed

- Added a dedicated `The \`codex-core\` crate` section to `AGENTS.md`.
- Documented why `codex-core` should be treated as a last resort for new
functionality.
- Added concrete guidance for both implementation and review: prefer an
existing non-`codex-core` crate when possible, introduce a new workspace
crate when that is the cleaner boundary, and push back on PRs that grow
`codex-core` unnecessarily.

Michael Bolin · 2026-03-26 14:56:43 -07:00

609019c6e5

chore: ask agents md not to play with PIDs (#15877 )

Ask Codex to be patient with Rust

jif-oai · 2026-03-26 15:43:19 +00:00

a5824e37db

Use released DotSlash package for argument-comment lint (#15199 )

## Why
The argument-comment lint now has a packaged DotSlash artifact from
[#15198](https://github.com/openai/codex/pull/15198), so the normal repo
lint path should use that released payload instead of rebuilding the
lint from source every time.

That keeps `just clippy` and CI aligned with the shipped artifact while
preserving a separate source-build path for people actively hacking on
the lint crate.

The current alpha package also exposed two integration wrinkles that the
repo-side prebuilt wrapper needs to smooth over:
- the bundled Dylint library filename includes the host triple, for
example `@nightly-2025-09-18-aarch64-apple-darwin`, and Dylint derives
`RUSTUP_TOOLCHAIN` from that filename
- on Windows, Dylint's driver path also expects `RUSTUP_HOME` to be
present in the environment

Without those adjustments, the prebuilt CI jobs fail during `cargo
metadata` or driver setup. This change makes the checked-in prebuilt
wrapper normalize the packaged library name to the plain
`nightly-2025-09-18` channel before invoking `cargo-dylint`, and it
teaches both the wrapper and the packaged runner source to infer
`RUSTUP_HOME` from `rustup show home` when the environment does not
already provide it.

After the prebuilt Windows lint job started running successfully, it
also surfaced a handful of existing anonymous literal callsites in
`windows-sandbox-rs`. This PR now annotates those callsites so the new
cross-platform lint job is green on the current tree.

## What Changed
- checked in the current
`tools/argument-comment-lint/argument-comment-lint` DotSlash manifest
- kept `tools/argument-comment-lint/run.sh` as the source-build wrapper
for lint development
- added `tools/argument-comment-lint/run-prebuilt-linter.sh` as the
normal enforcement path, using the checked-in DotSlash package and
bundled `cargo-dylint`
- updated `just clippy` and `just argument-comment-lint` to use the
prebuilt wrapper
- split `.github/workflows/rust-ci.yml` so source-package checks live in
a dedicated `argument_comment_lint_package` job, while the released lint
runs in an `argument_comment_lint_prebuilt` matrix on Linux, macOS, and
Windows
- kept the pinned `nightly-2025-09-18` toolchain install in the prebuilt
CI matrix, since the prebuilt package still relies on rustup-provided
toolchain components
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` to
normalize host-qualified nightly library filenames, keep the `rustup`
shim directory ahead of direct toolchain `cargo` binaries, and export
`RUSTUP_HOME` when needed for Windows Dylint driver setup
- updated `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
so future published DotSlash artifacts apply the same nightly-filename
normalization and `RUSTUP_HOME` inference internally
- fixed the remaining Windows lint violations in
`codex-rs/windows-sandbox-rs` by adding the required `/*param*/`
comments at the reported callsites
- documented the checked-in DotSlash file, wrapper split, archive
layout, nightly prerequisite, and Windows `RUSTUP_HOME` requirement in
`tools/argument-comment-lint/README.md`

Michael Bolin · 2026-03-20 03:19:22 +00:00

fa2a2f0be9

Move TUI on top of app server (parallel code) (#14717 )

This PR replicates the `tui` code directory and creates a temporary
parallel `tui_app_server` directory. It also implements a new feature
flag `tui_app_server` to select between the two tui implementations.

Once the new app-server-based TUI is stabilized, we'll delete the old
`tui` directory and feature flag.

Eric Traut · 2026-03-16 10:49:19 -06:00

db89b73a9c

Add argument-comment Dylint runner (#14651 )

Michael Bolin · 2026-03-14 08:18:04 -07:00

4b31848f5b

client: extend custom CA handling across HTTPS and websocket clients (#14239 )

## Stacked PRs

This work is now effectively split across two steps:

- #14178: add custom CA support for browser and device-code login flows,
docs, and hermetic subprocess tests
- #14239: extend that shared custom CA handling across Codex HTTPS
clients and secure websocket TLS

Note: #14240 was merged into this branch while it was stacked on top of
this PR. This PR now subsumes that websocket follow-up and should be
treated as the combined change.

Builds on top of #14178.

## Problem

Custom CA support landed first in the login path, but the real
requirement is broader. Codex constructs outbound TLS clients in
multiple places, and both HTTPS and secure websocket paths can fail
behind enterprise TLS interception if they do not honor
`CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.

This PR broadens the shared custom-CA logic beyond login and applies the
same policy to websocket TLS, so the enterprise-proxy story is no longer
split between “HTTPS works” and “websockets still fail”.

## What This Delivers

Custom CA support is no longer limited to login. Codex outbound HTTPS
clients and secure websocket connections can now honor the same
`CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
proxy/intercept setups work more consistently end-to-end.

For users and operators, nothing new needs to be configured beyond the
same CA env vars introduced in #14178. The change is that more of Codex
now respects them, including websocket-backed flows that were previously
still using default trust roots.

I also manually validated the proxy path locally with mitmproxy using:
`CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
HTTPS_PROXY=http://127.0.0.1:8080 just codex`
with mitmproxy installed via `brew install mitmproxy` and configured as
the macOS system proxy.

## Mental model

`codex-client` is now the owner of shared custom-CA policy for outbound
TLS client construction. Reqwest callers start from the builder
configuration they already need, then pass that builder through
`build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
same module for a rustls client config when a custom CA bundle is
configured.

The env precedence is the same everywhere:
- `CODEX_CA_CERTIFICATE` wins
- otherwise fall back to `SSL_CERT_FILE`
- otherwise use system roots
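
The precedence above can be sketched as a pure function. This is an illustration under assumptions, not the `codex-client` implementation: `CaSource` and `select_ca_source` are hypothetical names, and how empty values are treated is not specified here, so the sketch does not special-case them.

```rust
use std::path::PathBuf;

/// Where the trust roots come from, in the precedence order listed above.
#[derive(Debug, PartialEq)]
enum CaSource {
    CodexCaCertificate(PathBuf),
    SslCertFile(PathBuf),
    SystemRoots,
}

/// `lookup` abstracts environment access so the precedence can be exercised
/// without mutating the process environment.
fn select_ca_source(lookup: impl Fn(&str) -> Option<String>) -> CaSource {
    if let Some(path) = lookup("CODEX_CA_CERTIFICATE") {
        return CaSource::CodexCaCertificate(PathBuf::from(path));
    }
    if let Some(path) = lookup("SSL_CERT_FILE") {
        return CaSource::SslCertFile(PathBuf::from(path));
    }
    CaSource::SystemRoots
}
```

Against the real environment this would be called as `select_ca_source(|name| std::env::var(name).ok())`.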

The helper is intentionally narrow. It loads every usable certificate
from the configured PEM bundle into the appropriate root store and
returns either a configured transport or a typed error that explains
what went wrong.

## Non-goals

This does not add handshake-level integration tests against a live TLS
endpoint. It does not validate that the configured bundle forms a
meaningful certificate chain. It also does not try to force every
transport in the repo through one abstraction; it extends the shared CA
policy across the reqwest and websocket paths that actually needed it.

## Tradeoffs

The main tradeoff is centralizing CA behavior in `codex-client` while
still leaving adoption up to call sites. That keeps the implementation
additive and reviewable, but it means the rule "outbound Codex TLS that
should honor enterprise roots must use the shared helper" is still
partly enforced socially rather than by types.

For websockets, the shared helper only builds an explicit rustls config
when a custom CA bundle is configured. When no override env var is set,
websocket callers still use their ordinary default connector path.

## Architecture

`codex-client::custom_ca` now owns CA bundle selection, PEM
normalization, mixed-section parsing, certificate extraction, typed
CA-loading errors, and optional rustls client-config construction for
websocket TLS.

The affected consumers now call into that shared helper directly rather
than carrying login-local CA behavior:
- backend-client
- cloud-tasks
- RMCP client paths that use `reqwest`
- TUI voice HTTP paths
- `codex-core` default reqwest client construction
- `codex-api` websocket clients for both responses and realtime
websocket connections

The subprocess CA probe, env-sensitive integration tests, and shared PEM
fixtures also live in `codex-client`, which is now the actual owner of
the behavior they exercise.

## Observability

The shared CA path logs:
- which environment variable selected the bundle
- which path was loaded
- how many certificates were accepted
- when `TRUSTED CERTIFICATE` labels were normalized
- when CRLs were ignored
- where client construction failed

Returned errors remain user-facing and include the relevant env var,
path, and remediation hint. That same error model now applies whether
the failure surfaced while building a reqwest client or websocket TLS
configuration.

## Tests

Pure unit tests in `codex-client` cover env precedence and PEM
normalization behavior. Real client construction remains in subprocess
tests so the suite can control process env and avoid the macOS seatbelt
panic path that motivated the hermetic test split.

The subprocess coverage verifies:
- `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
- fallback to `SSL_CERT_FILE`
- single-cert and multi-cert bundles
- malformed and empty-file errors
- OpenSSL `TRUSTED CERTIFICATE` handling
- CRL tolerance for well-formed CRL sections

The websocket side is covered by the existing `codex-api` / `codex-core`
websocket test suites plus the manual mitmproxy validation above.

---------

Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
Co-authored-by: Codex <noreply@openai.com>

Josh McKinney · 2026-03-13 00:59:26 +00:00

6912da84a8

Add keyboard based fast switching between agents in TUI (#13923 )

gabec-openai · 2026-03-11 12:33:10 -07:00

180a5820fc

feat: discourage the use of the --all-features flag (#12429 )

## Why

Developers are frequently running low on disk space, and routine use of
`--all-features` contributes to larger Cargo build caches in `target/`
by compiling additional feature combinations.

This change updates local workflow guidance to avoid `--all-features` by
default and reserve it for cases where full feature coverage is
specifically needed.

## What Changed

- Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test`
/ `just test` for full-suite local runs, and to call out the disk-usage
cost of routine `--all-features` usage.
- Updated the root `justfile` so `just fix` and `just clippy` no longer
pass `--all-features` by default.
- Updated `docs/install.md` to explicitly describe `cargo test
--all-features` as an optional heavier-weight run (more build time and
`target/` disk usage).

## Verification

- Confirmed the `justfile` parses and the recipes list successfully with
`just --list`.

Michael Bolin · 2026-02-20 23:02:24 -08:00

264fc444b6

bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790 )

## Why this change

When Cargo dependencies change, it is easy to end up with an unexpected
local diff in
`MODULE.bazel.lock` after running Bazel. That creates noisy working
copies and pushes lockfile fixes
later in the cycle. This change addresses that pain point directly.

## What this change enforces

The expected invariant is: after dependency updates, `MODULE.bazel.lock`
is already in sync with
Cargo resolution. In practice, running `bazel mod deps` should not
mutate the lockfile in a clean
state. If it does, the dependency update is incomplete.

## How this is enforced

This change adds a single lockfile check script that snapshots
`MODULE.bazel.lock`, runs
`bazel mod deps`, and fails if the file changes. The same check is wired
into local workflow
commands (`just bazel-lock-update` and `just bazel-lock-check`) and into
Bazel CI (Linux x86_64 job)
so drift is caught early and consistently. The developer documentation
is updated in
`codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow
explicit.
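
The snapshot-and-compare idea can be sketched in a few lines. The actual check is a script that runs `bazel mod deps`; the function below is a hypothetical Rust model in which the regeneration step is passed in as a closure.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Snapshots the file, runs `regenerate`, and reports whether the contents changed.
fn lockfile_drifted(
    path: &Path,
    regenerate: impl FnOnce() -> io::Result<()>,
) -> io::Result<bool> {
    let before = fs::read(path)?;
    regenerate()?;
    let after = fs::read(path)?;
    Ok(before != after)
}
```

A `true` result corresponds to the failure case described above: the dependency update was incomplete.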

`MODULE.bazel.lock` is also refreshed in this PR to match the current
Cargo dependency resolution.

## Expected developer workflow

After changing `Cargo.toml` or `Cargo.lock`, run `just
bazel-lock-update`, then run
`just bazel-lock-check`, and include any resulting `MODULE.bazel.lock`
update in the same change.

## Testing

Ran `just bazel-lock-check` locally.

Josh McKinney · 2026-02-14 02:11:19 +00:00

de93cef5b7

docs: require insta snapshot coverage for UI changes (#10669 )

Adds an explicit requirement in AGENTS.md that any user-visible UI
change includes corresponding insta snapshot coverage and that snapshots
are reviewed/accepted in the PR.

Tests: N/A (docs only)

Josh McKinney · 2026-02-12 22:47:09 +00:00

75e79cf09a

feat(app-server): experimental flag to persist extended history (#11227 )

This PR adds an experimental `persist_extended_history` bool flag to
app-server thread APIs so rollout logs can retain a richer set of
EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
on `thread/resume`).

### Motivation
Today, our rollout recorder only persists a small subset (e.g. user
message, reasoning, assistant message) of `EventMsg` types, dropping a
good number (like command exec, file change, etc.) that are important
for reconstructing full item history for `thread/resume`, `thread/read`,
and `thread/fork`.

Some clients want to be able to resume a thread without lossiness. This
lossiness is primarily a UI thing, since what the model sees are
`ResponseItem` and not `EventMsg`.

### Approach
This change introduces an opt-in `persist_extended_history` flag to
preserve those events when you start/resume/fork a thread (defaults to
`false`).

This is done by adding an `EventPersistenceMode` to the rollout
recorder:
- `Limited` (existing behavior, default)
- `Extended` (new opt-in behavior)

In `Extended` mode, persist additional `EventMsg` variants needed for
non-lossy app-server `ThreadItem` reconstruction. We now store the
following ThreadItems that we didn't before:
- web search
- command execution
- patch/file changes
- MCP tool calls
- image view calls
- collab tool outcomes
- context compaction
- review mode enter/exit
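
As a rough illustration of how the two modes could gate persistence, here is a simplified sketch. `EventPersistenceMode` and its variants are named in this PR; `EventKind` and `should_persist` are hypothetical stand-ins for the real `EventMsg` type and recorder logic, which are not shown here.

```rust
/// Persistence mode for the rollout recorder; `Limited` is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum EventPersistenceMode {
    #[default]
    Limited,
    Extended,
}

/// Hypothetical, simplified stand-in for the kinds of `EventMsg`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    UserMessage,
    Reasoning,
    AssistantMessage,
    WebSearch,
    CommandExecution,
    PatchApply,
    McpToolCall,
    ImageView,
    CollabToolOutcome,
    ContextCompaction,
    ReviewMode,
    /// Assumption: some events are never written in either mode.
    TransientUpdate,
}

/// Decide whether an event of `kind` is written to the rollout file.
pub fn should_persist(kind: EventKind, mode: EventPersistenceMode) -> bool {
    use EventKind::*;
    match kind {
        // The small subset persisted in both modes.
        UserMessage | Reasoning | AssistantMessage => true,
        // The additional items listed above, stored only when opted in.
        WebSearch | CommandExecution | PatchApply | McpToolCall | ImageView
        | CollabToolOutcome | ContextCompaction | ReviewMode => {
            mode == EventPersistenceMode::Extended
        }
        TransientUpdate => false,
    }
}
```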

For **command executions** in particular, we truncate the output using
the existing `truncate_text` from core to store an upper bound of 10,000
bytes, which is also the default value for truncating tool outputs shown
to the model. This keeps the size of the rollout file and command
execution items returned over the wire reasonable.
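
To make the byte bound concrete, here is a sketch of capping a string at 10,000 bytes without splitting a UTF-8 character. This is not the real `truncate_text` from core, which may behave differently (for example, by adding a truncation marker); `truncate_to_bytes` and `MAX_OUTPUT_BYTES` are hypothetical names.

```rust
/// Upper bound on stored command output, per the description above.
pub const MAX_OUTPUT_BYTES: usize = 10_000;

/// Return the longest prefix of `text` that fits in `max_bytes` and ends on
/// a UTF-8 character boundary.
pub fn truncate_to_bytes(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```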

We also persist `EventMsg::Error`, which we can now map back to the
Turn's status and use to populate the Turn's error metadata.

#### Updates to EventMsgs
To truly make `thread/resume` non-lossy, we also needed to persist the
`status` on `EventMsg::CommandExecutionEndEvent` and
`EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
command failed or was declined (and similarly for `apply_patch`). These
EventMsgs were never persisted before, so I made `status` a required
field.

Owen Lin · 2026-02-12 19:34:22 +00:00

efc8d45750

Try to stop small helper methods (#11203 )

pakrym-oai · 2026-02-09 20:01:30 +00:00

086d02fb14

chore(app-server): update AGENTS.md for config + optional collection guidance (#10914 )

Based on recent app-server PRs

Owen Lin · 2026-02-06 12:45:27 -08:00

731f0f384a

fix(app-server): fix TS annotations for optional fields on requests (#10412 )

This updates our generated TypeScript types to be more correct with how
the server actually behaves, **specifically for JSON-RPC requests**.

Before this PR, we'd generate `field: T | null`. After this PR, we will
have `field?: T | null`. The latter matches how the server actually
works, in that if an optional field is omitted, the server will treat it
as null. This also makes it less annoying in theory for clients to
upgrade to newer versions of Codex, since adding a new optional field to
a JSON-RPC request should not require a client change.

NOTE: This only applies to JSON-RPC requests. All other payloads (i.e.
responses, notifications) will return `field: T | null` as usual.

Owen Lin · 2026-02-03 11:51:37 -08:00

efd96c46c7

Reject request_user_input outside Plan/Pair (#9955 )

## Context

Previous work in https://github.com/openai/codex/pull/9560 only rejected
`request_user_input` in Execute and Custom modes. Since then, additional
modes
(e.g., Code) were added, so the guard should be mode-agnostic.

## What changed

- Switch the handler to an allowlist: only Plan and PairProgramming are
allowed
- Return the same error for any other mode (including Code)
- Add a Code-mode rejection test alongside the existing Execute/Custom
tests
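
The allowlist approach can be sketched as follows. The `Mode` enum, the function name, and the error text are hypothetical; only the mode names and the rule that just Plan and PairProgramming are allowed come from the description.

```rust
/// Hypothetical stand-in for the collaboration modes mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Plan,
    PairProgramming,
    Execute,
    Custom,
    Code,
}

/// Allow `request_user_input` only in allowlisted modes. The wildcard arm
/// means any mode added later is rejected unless explicitly allowed.
pub fn check_request_user_input(mode: Mode) -> Result<(), String> {
    match mode {
        Mode::Plan | Mode::PairProgramming => Ok(()),
        _ => Err("request_user_input is only available in Plan and PairProgramming modes"
            .to_string()),
    }
}
```

Compared with a denylist of Execute and Custom, this shape does not need to be revisited each time a new mode is introduced.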

## Why

This prevents `request_user_input` from being used in modes where it is
not
intended, even as new modes are introduced.

Charley Cunningham · 2026-01-26 17:12:17 -08:00

47aa1f3b6a

chore: tweak AGENTS.md (#9650 )

## Summary
Update AGENTS.md to improve testing flow

## Testing
- [x] Tested locally, much faster

Dylan Hurd · 2026-01-21 20:20:45 -08:00

e520592bcf

don't ask for approval for just fix (#9586 )

It blocks all my skills from executing because it asks for approval to
run `just fmt`. It's a quick command that doesn't need approval.


<img width="967" height="120" alt="image"
src="https://github.com/user-attachments/assets/f8e6ca76-a650-49e9-beb2-ce98ba48d310"
/>

Ahmed Ibrahim · 2026-01-21 04:56:11 +00:00

ebc88f29f8

add generated jsonschema for config.toml (#8956 )

### What
Add JSON Schema generation for `config.toml`, with checked‑in
`docs/config.schema.json`. We can move the schema elsewhere if preferred
(and host it if there's demand).

Add fixture test to prevent drift and `just write-config-schema` to
regenerate on schema changes.

Generate MCP config schema from `RawMcpServerConfig` instead of
`McpServerConfig` because that is the runtime type used for
deserialization.

Populate feature flag values into generated schema so they can be
autocompleted.

### Tests
Added tests + regenerate script to prevent drift. Tested autocompletions
using generated jsonschema locally with Even Better TOML.



https://github.com/user-attachments/assets/5aa7cd39-520c-4a63-96fb-63798183d0bc

sayan-oai · 2026-01-13 10:22:51 -08:00

40e2405998

feat: introduce find_resource! macro that works with Cargo or Bazel (#8879 )

To support Bazelification in https://github.com/openai/codex/pull/8875,
this PR introduces a new `find_resource!` macro that we use in place of
our existing logic in tests that looks for resources relative to the
compile-time `CARGO_MANIFEST_DIR` env var.

To make this work, we plan to add the following to all `rust_library()`
and `rust_test()` Bazel rules in the project:

```
rustc_env = {
    "BAZEL_PACKAGE": native.package_name(),
},
```

Our new `find_resource!` macro reads this value via
`option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code
using `find_resource!`_ is injected into the code expanded from the
macro. (If `find_resource()` were a function, then
`option_env!("BAZEL_PACKAGE")` would always be
`codex-rs/utils/cargo-bin`, which is not what we want.)

Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR`
environment variable is set at runtime, indicating that the test is
being run by Bazel. In this case, we have to concatenate the runtime
`RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the
path to the resource.
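
The reason this has to be a macro can be sketched roughly as below. This is not the real `find_resource!`: the names `resolve_resource` and `find_resource_sketch!` are hypothetical, and the real runfiles layout may include extra path components that this sketch omits.

```rust
use std::path::PathBuf;

/// Pure path logic, kept separate from the macro so it can be tested.
/// The Bazel branch is taken only when `RUNFILES_DIR` is set at runtime.
pub fn resolve_resource(
    runfiles_dir: Option<&str>,
    bazel_package: Option<&str>,
    manifest_dir: Option<&str>,
    relative: &str,
) -> Option<PathBuf> {
    match (runfiles_dir, bazel_package) {
        // Runtime RUNFILES_DIR joined with the compile-time BAZEL_PACKAGE.
        (Some(runfiles), Some(package)) => {
            Some(PathBuf::from(runfiles).join(package).join(relative))
        }
        // Otherwise fall back to the Cargo manifest directory.
        _ => manifest_dir.map(|dir| PathBuf::from(dir).join(relative)),
    }
}

/// Because this is a macro, `option_env!` is evaluated in the crate that
/// invokes it, so each caller gets its own package, not this helper's.
#[allow(unused_macros)]
macro_rules! find_resource_sketch {
    ($relative:expr) => {
        resolve_resource(
            std::env::var("RUNFILES_DIR").ok().as_deref(),
            option_env!("BAZEL_PACKAGE"),
            option_env!("CARGO_MANIFEST_DIR"),
            $relative,
        )
    };
}
```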

In testing this change, I discovered one funky edge case in
`codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_
(but not canonicalize!) the result from `find_resource!` because the
path contains a `common/..` component that does not exist on disk when
the test is run under Bazel, so it must be semantically normalized using
the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate
before it is passed to `dotslash fetch`.
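
To illustrate the difference between normalizing and canonicalizing, here is a std-only sketch of lexical normalization. The PR itself uses the `path-absolutize` crate; `normalize_lexically` is a hypothetical substitute that resolves `..` purely from the path text. Unlike `std::fs::canonicalize`, it never touches the filesystem, so it works even when an intermediate directory such as `common` does not exist on disk.

```rust
use std::path::{Component, Path, PathBuf};

/// Remove `.` and resolve `..` components textually, without any I/O.
pub fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                let at_root = matches!(
                    out.components().next_back(),
                    Some(Component::RootDir | Component::Prefix(_))
                );
                if last_is_normal {
                    // `a/b/..` collapses to `a`.
                    out.pop();
                } else if !at_root {
                    // Keep leading `..` in relative paths.
                    out.push("..");
                }
                // At the root, `..` is dropped: there is nothing above it.
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```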

Because this new behavior may be non-obvious, this PR also updates
`AGENTS.md` to make humans/Codex aware that this API is preferred.

Michael Bolin · 2026-01-07 18:06:08 -08:00

f6b563ec64

feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496 )

This PR introduces a `codex-utils-cargo-bin` utility crate that
wraps/replaces our use of `assert_cmd::Command` and
`escargot::CargoBuild`.

As you can infer from the introduction of `buck_project_root()` in this
PR, I am attempting to make it possible to build Codex under
[Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
achieve faster incremental local builds (largely due to Buck2's
[dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
build strategy, as well as benefits from its local build daemon) as well
as faster CI builds if we invest in remote execution and caching.

See
https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
for more details about the performance advantages of Buck2.

Buck2 enforces stronger requirements in terms of build and test
isolation. It discourages assumptions about absolute paths (which is key
to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
variables that Cargo provides are absolute paths (which
`assert_cmd::Command` reads), this is a problem for Buck2, which is why
we need this `codex-utils-cargo-bin` utility.

My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
passed to a `rust_test()` build rule as relative paths.
`codex-utils-cargo-bin` will resolve these values to absolute paths,
when necessary.
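
A minimal sketch of the "resolve to absolute when necessary" step, assuming the relative value is interpreted against some known base directory. The function name and the choice of base are hypothetical; the crate's actual resolution logic is not shown in this description.

```rust
use std::path::{Path, PathBuf};

/// Leave absolute paths (as Cargo provides) untouched; join relative ones
/// (as a Buck2-style setup might provide) onto `base`.
pub fn resolve_bin_path(raw: &str, base: &Path) -> PathBuf {
    let path = Path::new(raw);
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        base.join(path)
    }
}
```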


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
* #8498
* __->__ #8496

Michael Bolin · 2025-12-23 19:29:32 -08:00

e61bae12e3

Fixed resume matching to respect case insensitivity when using WSL mount points (#8000 )

This fixes #7995

Eric Traut · 2025-12-16 16:27:38 -08:00

42b8f28ee8

docs: remove blanket ban on unsigned integers (#7957 )

Drop the AGENTS.md rule that forbids unsigned ints. The blanket guidance
causes unnecessary complexity in cases where values are naturally
unsigned, leading to extra clamping/conversion code instead of using
checked or saturating arithmetic where needed.
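
For illustration, these are the two standard-library options the description refers to, applied to a naturally unsigned quantity. The function names are hypothetical.

```rust
/// Clamp at zero instead of underflowing when `used` exceeds `limit`.
pub fn remaining_saturating(limit: u64, used: u64) -> u64 {
    limit.saturating_sub(used)
}

/// Surface the underflow to the caller as `None` instead of hiding it.
pub fn remaining_checked(limit: u64, used: u64) -> Option<u64> {
    limit.checked_sub(used)
}
```

Either form avoids converting to a signed type and clamping by hand.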

Josh McKinney · 2025-12-12 17:01:56 -08:00

596fcd040f

feat(core) Add login to shell_command tool (#6846 )

## Summary
Adds the `login` parameter to the `shell_command` tool - optional,
defaults to true.

## Testing
- [x] Tested locally

Dylan Hurd · 2025-12-05 11:03:25 -08:00

a8cbbdbc6e

tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )

pakrym-oai · 2025-11-13 18:04:05 -08:00

6c384eb9c6

Prefer wait_for_event over wait_for_event_with_timeout. (#6346 )

No need to specify the timeout in most cases.

pakrym-oai · 2025-11-06 16:14:43 -08:00

f8b30af6dc

Auto compact at ~90% (#5292 )

Users currently hit a context-window-exceeded limit and usually don't
know what to do. This starts auto compact at ~90% of the window.
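
A sketch of what a ~90% trigger could look like, assuming usage is tracked as a token count against a known window size. The function name, the token-based units, and the exact 9/10 cutoff are assumptions; the description only says "~90%".

```rust
/// Return true once usage reaches 90% of the context window.
/// Integer arithmetic avoids floating-point rounding; `u128` avoids overflow.
pub fn should_auto_compact(used_tokens: u64, context_window: u64) -> bool {
    context_window > 0 && (used_tokens as u128) * 10 >= (context_window as u128) * 9
}
```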

Ahmed Ibrahim · 2025-10-20 11:29:49 -07:00

049a61bcfc

[MCP] Add support for resources (#5239 )

This PR adds support for [MCP
resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources)
by adding three new tools for the model:
1. `list_resources`
2. `list_resource_templates`
3. `read_resource`

These 3 tools correspond to the [three primary MCP resource protocol
messages](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#protocol-messages).

Example of listing and reading a GitHub resource template
<img width="2984" height="804" alt="CleanShot 2025-10-15 at 17 31 10"
src="https://github.com/user-attachments/assets/89b7f215-2e2a-41c5-90dd-b932ac84a585"
/>

`/mcp` with Figma configured
<img width="2984" height="442" alt="CleanShot 2025-10-15 at 18 29 35"
src="https://github.com/user-attachments/assets/a7578080-2ed2-4c59-b9b4-d8461f90d8ee"
/>

Fixes #4956

Gabriel Peal · 2025-10-17 01:05:15 -04:00

40fba1bb4c

Simplify request body assertions (#4845 )

We'll have a lot more tests like these.

pakrym-oai · 2025-10-07 09:56:39 +01:00

35a770e871

chore: subject docs/*.md to Prettier checks (#4645 )

Apparently we were not running our `pnpm run prettier` check in CI, so
many files that were covered by the existing Prettier check were not
well-formatted.

This updates CI and formats the files.

Michael Bolin · 2025-10-03 11:35:48 -07:00

c32e9cfe86

69 Commits