Mirror of https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
221 Commits
-
[codex] Remove child AGENTS.md prompt experiment (#28993)
## Why

`child_agents_md` is a disabled, under-development experiment that adds a second model-visible explanation of hierarchical `AGENTS.md` behavior. Keeping it leaves unused prompt, configuration, documentation, and test surface.

## What changed

- remove the `ChildAgentsMd` feature and `child_agents_md` config schema entry
- remove the hierarchical prompt asset, export, and instruction injection
- remove feature-specific tests and documentation
- keep the generic unstable-feature warning coverage using `apply_patch_streaming_events`

Normal project `AGENTS.md` discovery and composition are unchanged.

## Testing

- `just test -p codex-features`
- `just test -p codex-prompts`
- `just test -p codex-core agents_md`
- `just test -p codex-core unstable_features_warning`
pakrym-oai ·
2026-06-18 16:13:07 -07:00 -
build: run buildifier from just fmt (#28125)
## Intent

Keep Bazel and Starlark files consistently formatted without requiring contributors to install or version buildifier themselves.

## Implementation

- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver, with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.
Adam Perry @ OpenAI ·
2026-06-13 21:43:39 -07:00 -
tui: make `codex-tui.log` opt-in (#24081)

## Why

The TUI currently creates a shared plaintext `codex-tui.log` under the default log directory. That append-only file can keep growing across runs even though the TUI already records diagnostics in bounded local stores. Make the plaintext file log an explicit troubleshooting choice instead of a default side effect. This is possible because logs are also stored in the DB with proper rotation.

## What changed

- Only install the TUI file logging layer when `log_dir` is explicitly set.
- Remove the prior `codex-tui.log` at startup before an opt-in file layer is created.
- Clarify the `log_dir` config/schema text and `docs/install.md` example so users opt in with `codex -c log_dir=...` when they need a plaintext log.
jif-oai ·
2026-05-22 17:19:51 +00:00 -
Prefer `just test` over `cargo test` in docs (#23910)

`cargo test` for the core and other crates fails on a fresh macOS checkout without the right stack size variable. This change encourages using the `just test` command, which sets the environment up correctly. As a bonus, this should encourage agents to get more benefit out of nextest's parallel execution.
anp-oai ·
2026-05-22 16:58:14 +00:00 -
Add allow_managed_hooks_only hook requirement (#20319)
## Why

Enterprise-managed hook policy needs a narrow way to require Codex to ignore user-controlled lifecycle hooks without adopting the broader trust-precedence model from earlier hook work. This keeps the policy anchored in `requirements.toml`, so admins can opt into managed hooks only while normal `config.toml` files cannot enable the restriction themselves.

## What changed

- Added `allow_managed_hooks_only` to the requirements data flow and preserved explicit `false` values.
- Also adds it to /debug-config
- Marked MDM, system, and legacy managed config layers as managed for hook discovery.
- Updated hook discovery so `allow_managed_hooks_only = true`:
  - keeps managed requirements hooks and managed config-layer hooks,
  - skips user/project/session `hooks.json` and `[hooks]` entries with concise startup warnings,
  - skips current unmanaged plugin hooks,
  - ignores any `allow_managed_hooks_only` key placed in ordinary `config.toml` layers.
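
The filtering rules above can be sketched roughly as follows. This is a minimal illustration in Python, not the Codex implementation: the `Hook` type, the source labels in `MANAGED_SOURCES`, the warning text, and both function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for managed layers; the real layer names may differ.
MANAGED_SOURCES = {"requirements", "mdm", "system", "legacy_managed"}


@dataclass(frozen=True)
class Hook:
    name: str
    source: str  # e.g. "requirements", "mdm", "user", "project", "session", "plugin"


def effective_policy(requirements: dict, config: dict) -> bool:
    """Only the requirements layer can enable the restriction.

    The same key in an ordinary config layer is deliberately ignored.
    """
    return bool(requirements.get("allow_managed_hooks_only", False))


def discover_hooks(hooks, allow_managed_hooks_only):
    """Return (kept_hooks, warnings) for the given policy."""
    if not allow_managed_hooks_only:
        return list(hooks), []
    kept, warnings = [], []
    for hook in hooks:
        if hook.source in MANAGED_SOURCES:
            kept.append(hook)
        else:
            # Unmanaged hooks are skipped with a short warning.
            warnings.append(f"skipping {hook.source} hook '{hook.name}'")
    return kept, warnings
```

With the policy off, every hook is kept and no warnings are produced; with it on, only hooks from the managed sources survive.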
Andrei Eternal ·
2026-05-12 19:05:25 -07:00 -
Clarify docs folder guidance in AGENTS.md (#21772)
## Summary

Codex keeps trying to add documentation to the `docs/` directory. With the exception of app server API documentation, the docs for Codex should not live in this repo. We don't want the local `docs/` folder to become a stale shadow of the official docs.

This PR updates `AGENTS.md` to make that boundary explicit and scopes the existing API documentation guidance to app-server docs/examples. It also removes the extra `docs/config.md` sections that were recently added.
Eric Traut ·
2026-05-08 10:11:57 -07:00 -
codex-otel: add configurable trace metadata (#21556)
Add Codex config for static trace span attributes and structured W3C tracestate field upserts. The config flows through OtelSettings so callers can attach trace metadata without touching every span call site. Apply span attributes with an SDK span processor so every exported trace span carries the configured metadata. Model tracestate as nested member fields so configured keys can be upserted while unrelated propagated state in the same member is preserved. Validate configured tracestate before installing provider-global state, including header-unsafe values the SDK does not reject by itself. This keeps Codex from propagating malformed trace context from config. Update the config schema, public docs, and OTLP loopback coverage for config parsing, span export, propagation, and invalid-header rejection.
bbrown-oai ·
2026-05-07 16:06:57 -07:00 -
Ensure all mentions of cargo-install are --locked (#21592)
There's already a preference for this in the codebase, but a few of them have drifted away. Generally `--locked` is preferred to reduce exposure to supply-chain attacks (and just generally improve reproducibility). In an ideal world these dependencies would maybe even be pinned to versions but Cargo is kinda bad at that for devtools. Still better to use --locked than not.
Aria Desires ·
2026-05-07 15:30:37 -07:00 -
Document Codex git commit attribution config (#21379)
## Summary

- document that commit attribution for generated git commit messages is gated by the `codex_git_commit` feature flag
- add an example `config.toml` snippet showing `commit_attribution` with `[features].codex_git_commit = true`
- update the config schema description so the reference docs explain that `commit_attribution` only takes effect when the feature is enabled

Fixes #19799.

## Validation

- `cargo run -p codex-core --bin codex-write-config-schema`
- `cargo test -p codex-config`
- `cargo test -p codex-features`
- `cargo fmt --check`
- `git diff --check`

## Notes

- `cargo test -p codex-core config_schema_matches_fixture` currently fails before reaching the schema test because `core_test_support` imports `similar` without a linked crate in this checkout. The narrower package checks above avoid that unrelated test-support build failure.
Brian Henzelmann ·
2026-05-06 16:14:50 -05:00 -
Remove local docs and specs (#20896)
## Summary

We should not check local-only docs or planning specs into this repository. Keeping those files here duplicates the canonical Codex documentation surface and makes transient implementation notes look like supported docs.

This PR removes the local-only docs/spec files from `docs/` and trims `docs/config.md` back to links for the maintained configuration documentation on developers.openai.com.
Eric Traut ·
2026-05-03 10:23:09 -07:00 -
deprecate legacy notify (#20524)
# Why

`notify` is the remaining compatibility surface from the legacy hook implementation. The newer lifecycle hook engine now owns the active hook system, so we should start steering users away from adding new `notify` configs before removing the old path entirely.

This also adds a lightweight watchpoint for the deprecation so we can see how much legacy usage remains before the clean drop.

# What

- emit a startup deprecation notice when a non-empty `notify` command is configured
- emit `codex.notify.configured` when a session starts with legacy `notify` configured
- emit `codex.notify.run` when the legacy notify path fires after a completed turn
- mark `notify` as deprecated in the config schema and repo docs
- remove the orphaned `codex-rs/hooks/src/user_notification.rs` file that is no longer compiled
- add regression coverage for the new deprecation notice

# Next steps

A follow-up PR can remove the legacy notify path entirely once we are ready for the clean drop. Before then, we can watch `codex.notify.configured` and `codex.notify.run` to understand the deprecation impact and remaining active usage.

The cleanup PR should then delete the `notify` config field, the `legacy_notify` implementation, the old compatibility dispatch types and callsites that only exist for the legacy path, and the remaining compatibility docs/tests.

# Testing

- `cargo test -p codex-hooks`
- `cargo test -p codex-config`
- `cargo test -p codex-core emits_deprecation_notice_for_notify`
Abhinav ·
2026-05-01 17:35:21 +00:00 -
Support disabling tool suggest for specific tools. (#20072)
## Summary

- Add `disable_tool_suggest` to app and plugin config, schema, and TypeScript output
- Exclude disabled connectors and plugins from tool suggestion discovery
- Persist "never show again" tool-suggestion choices back into `config.toml`
- Update config docs and add coverage for connector and plugin suppression

## Testing

- Added and updated unit tests for config persistence and tool-suggest filtering
- Not run (not requested)
Matthew Zeng ·
2026-04-29 00:19:34 +00:00 -
Curtis 'Fjord' Hawthorne ·
2026-04-24 17:49:29 -07:00 -
Add server-level approval defaults for custom MCP servers (#17843)
## Summary

- Add `default_tools_approval_mode` support for custom MCP server configs, matching the existing `codex_apps` behavior
- Apply approval precedence as per-tool override, then server default, then `auto`
- Update config serialization, CLI display, schema generation, docs, and tests

## Testing

- `cargo check -p codex-config`
- `cargo check -p codex-core`
- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-config`
- Targeted `codex-core` tests for config parsing, config writes, and MCP approval precedence
- `just fix -p codex-config -p codex-core`
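
The precedence rule (per-tool override, then server default, then `auto`) can be illustrated with a small Python sketch. The function name and the `"prompt"` and `"approve"` mode strings are assumptions for illustration; only `auto` is named in the description above.

```python
from typing import Optional


def resolve_approval_mode(tool_override: Optional[str],
                          server_default: Optional[str]) -> str:
    """Pick the approval mode for one MCP tool call.

    Precedence: per-tool override, then server-level default, then "auto".
    """
    if tool_override is not None:
        return tool_override
    if server_default is not None:
        return server_default
    return "auto"
```

A tool with no override on a server with no default resolves to `auto`, so existing configs would keep their behavior under this reading.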
Matthew Zeng ·
2026-04-16 18:18:07 +00:00 -
Support original-detail metadata on MCP image outputs (#17714)
## Summary

- honor `_meta["codex/imageDetail"] == "original"` on MCP image content and map it to `detail: "original"` where supported
- strip that detail back out when the active model does not support original-detail image inputs
- update code-mode `image(...)` to accept individual MCP image blocks
- teach `js_repl` / `codex.emitImage(...)` to preserve the same hint from raw MCP image outputs
- document the new `_meta` contract and add generic RMCP-backed coverage across protocol, core, code-mode, and js_repl paths
Curtis 'Fjord' Hawthorne ·
2026-04-15 14:43:33 -07:00 -
Add `supports_parallel_tool_calls` flag to included mcps (#17667)

## Why

For more advanced MCP usage, we want the model to be able to emit parallel MCP tool calls and have Codex execute eligible ones concurrently, instead of forcing all MCP calls through the serial block.

The main design choice was where to thread the config. I made this server-level because parallel safety depends on the MCP server implementation. Codex reads the flag from `mcp_servers`, threads the opted-in server names into `ToolRouter`, and checks the parsed `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying on model-visible tool names, which can be incomplete in deferred/search-tool paths or ambiguous for similarly named servers/tools.

## What was added

Added `supports_parallel_tool_calls` for MCP servers.

Before:

```toml
[mcp_servers.docs]
command = "docs-server"
```

After:

```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```

MCP calls remain serial by default. Only tools from opted-in servers are eligible to run in parallel. Docs also now warn to enable this only when the server’s tools are safe to run concurrently, especially around shared state or read/write races.

## Testing

Tested with a local stdio MCP server exposing real delay tools. The model/Responses side was mocked only to deterministically emit two MCP calls in the same turn.

Each test called `query_with_delay` and `query_with_delay_2` with `{ "seconds": 25 }`.

| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |

PR with flag enabled showed both tools start before either completed; main and PR-without-flag completed the first delay before starting the second.

Also added an integration test.

Additional checks:

- `cargo test -p codex-tools` passed
- `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed

josiah-openai ·
2026-04-13 15:16:34 -07:00 -
feat(tui): add reverse history search to composer (#17550)
## Problem

The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not provide the reverse incremental search workflow users expect from shells. Users needed a way to search older prompts without immediately replacing the current draft, and the interaction needed to handle async persistent history, repeated navigation keys, duplicate prompt text, footer hints, and preview highlighting without making the main composer file even harder to review.

https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b

## Mental model

`Ctrl+R` opens a temporary search session owned by the composer. The footer line becomes the search input, the composer body previews the current match only after the query has text, and `Enter` accepts that preview as an editable draft while `Esc` restores the draft that existed before search started. The history layer provides a combined offset space over persistent and local history, but search navigation exposes unique prompt text rather than every physical history row.

## Non-goals

This change does not rewrite stored history, change normal Up/Down browsing semantics, add fuzzy matching, or add persistent metadata for attachments in cross-session history. Search deduplication is deliberately scoped to the active Ctrl+R search session and uses exact prompt text, so case, whitespace, punctuation, and attachment-only differences are not normalized.

## Tradeoffs

The implementation keeps search state in the existing composer and history state machines instead of adding a new cross-module controller. That keeps ownership local and testable, but it means the composer still coordinates visible search status, draft restoration, footer rendering, cursor placement, and match highlighting while `ChatComposerHistory` owns traversal, async fetch continuation, boundary clamping, and unique-result caching.

Unique-result caching stores cloned `HistoryEntry` values so known matches can be revisited without cache lookups; this is simple and robust for interactive search sizes, but it is not a global history index.

## Architecture

`ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches the footer to `FooterMode::HistorySearch`, and routes search-mode keys before normal editing. Query edits call `ChatComposerHistory::search` with `restart = true`, which starts from the newest combined-history offset. Repeated `Ctrl+R` or Up searches older; Down searches newer through already discovered unique matches or continues the scan. Persistent history entries still arrive asynchronously through `on_entry_response`, where a pending search either accepts the response, skips a duplicate, or requests the next offset.

The composer-facing pieces now live in `codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving `chat_composer.rs` responsible for routing and rendering integration instead of owning every search helper inline. `codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the owner of stored history, combined offsets, async fetch state, boundary semantics, and duplicate suppression. Match highlighting is computed from the current composer text while search is active and disappears when the match is accepted.

## Observability

There are no new logs or telemetry. The practical debug path is state inspection: `ChatComposer.history_search` tells whether the footer query is idle, searching, matched, or unmatched; `ChatComposerHistory.search` tracks selected raw offsets, pending persistent fetches, exhausted directions, and unique match cache state.

If a user reports skipped or repeated results, first inspect the exact stored prompt text, the selected offset, whether an async persistent response is still pending, and whether a query edit restarted the search session.

## Tests

The change is covered by focused `codex-tui` unit tests for opening search without previewing the latest entry, accepting and canceling search, no-match restoration, boundary clamping, footer hints, case-insensitive highlighting, local duplicate skipping, and persistent duplicate skipping through async responses. Snapshot coverage captures the footer-mode visual changes.

Local verification used `just fmt`, `cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and `just fix -p codex-tui`.
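
The "search older, skip duplicates by exact prompt text" behavior might be pictured with the Python sketch below. It ignores the async persistent-history path and the combined offset space entirely; `search_older` and its parameters are hypothetical, and case-insensitive substring matching is an assumption (the description only states that highlighting is case-insensitive).

```python
def search_older(history, query, start, seen):
    """Scan `history` (ordered oldest to newest) from index `start` downward.

    Returns (index, text) for the next match whose exact text has not been
    shown in this search session, or None when the scan is exhausted.
    `seen` is the per-session set of prompt texts already surfaced.
    """
    if not query:
        # No preview until the query has text.
        return None
    needle = query.lower()  # assumption: matching ignores case
    for i in range(start, -1, -1):
        text = history[i]
        # Deduplication uses exact text, so "Git status" != "git status".
        if needle in text.lower() and text not in seen:
            seen.add(text)
            return i, text
    return None
```

A caller would start at the newest index, then continue from one below the last returned index on each repeated `Ctrl+R` or Up, resetting `seen` whenever the query is edited.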
Felipe Coury ·
2026-04-12 19:32:19 -03:00 -
Remove remaining custom prompt support (#16115)
## Summary

- remove protocol and core support for discovering and listing custom prompts
- simplify the TUI slash-command flow and command popup to built-in commands only
- delete obsolete custom prompt tests, helpers, and docs references
- clean up downstream event handling for the removed protocol events
Eric Traut ·
2026-03-28 13:49:37 -06:00 -
Rename tui_app_server to tui (#16104)
This is a follow-up to https://github.com/openai/codex/pull/15922. That previous PR deleted the old `tui` directory and left the new `tui_app_server` directory in place. This PR renames `tui_app_server` to `tui` and fixes up all references.
Eric Traut ·
2026-03-28 11:23:07 -06:00 -
Remove the legacy TUI split (#15922)
This is part 1 of 2 PRs that will delete the `tui` / `tui_app_server` split. This part simply deletes the existing `tui` directory and marks the `tui_app_server` feature flag as removed. I left the `tui_app_server` feature flag in place for now so its presence doesn't result in an error; it is simply ignored. Part 2 will rename the `tui_app_server` directory to `tui`. I did this in two parts to reduce visible code churn.
Eric Traut ·
2026-03-27 22:56:44 +00:00 -
[mcp] Improve custom MCP elicitation (#15800)
- [x] Support don't ask again for custom MCP tool calls.
- [x] Don't run arc in yolo mode.
- [x] Run arc for custom MCP tools in always allow mode.
Matthew Zeng ·
2026-03-26 01:02:37 +00:00 -
client: extend custom CA handling across HTTPS and websocket clients (#14239)
## Stacked PRs This work is now effectively split across two steps: - #14178: add custom CA support for browser and device-code login flows, docs, and hermetic subprocess tests - #14239: extend that shared custom CA handling across Codex HTTPS clients and secure websocket TLS Note: #14240 was merged into this branch while it was stacked on top of this PR. This PR now subsumes that websocket follow-up and should be treated as the combined change. Builds on top of #14178. ## Problem Custom CA support landed first in the login path, but the real requirement is broader. Codex constructs outbound TLS clients in multiple places, and both HTTPS and secure websocket paths can fail behind enterprise TLS interception if they do not honor `CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently. This PR broadens the shared custom-CA logic beyond login and applies the same policy to websocket TLS, so the enterprise-proxy story is no longer split between “HTTPS works” and “websockets still fail”. ## What This Delivers Custom CA support is no longer limited to login. Codex outbound HTTPS clients and secure websocket connections can now honor the same `CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise proxy/intercept setups work more consistently end-to-end. For users and operators, nothing new needs to be configured beyond the same CA env vars introduced in #14178. The change is that more of Codex now respects them, including websocket-backed flows that were previously still using default trust roots. I also manually validated the proxy path locally with mitmproxy using: `CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem HTTPS_PROXY=http://127.0.0.1:8080 just codex` with mitmproxy installed via `brew install mitmproxy` and configured as the macOS system proxy. ## Mental model `codex-client` is now the owner of shared custom-CA policy for outbound TLS client construction. 
Reqwest callers start from the builder configuration they already need, then pass that builder through `build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the same module for a rustls client config when a custom CA bundle is configured. The env precedence is the same everywhere: - `CODEX_CA_CERTIFICATE` wins - otherwise fall back to `SSL_CERT_FILE` - otherwise use system roots The helper is intentionally narrow. It loads every usable certificate from the configured PEM bundle into the appropriate root store and returns either a configured transport or a typed error that explains what went wrong. ## Non-goals This does not add handshake-level integration tests against a live TLS endpoint. It does not validate that the configured bundle forms a meaningful certificate chain. It also does not try to force every transport in the repo through one abstraction; it extends the shared CA policy across the reqwest and websocket paths that actually needed it. ## Tradeoffs The main tradeoff is centralizing CA behavior in `codex-client` while still leaving adoption up to call sites. That keeps the implementation additive and reviewable, but it means the rule "outbound Codex TLS that should honor enterprise roots must use the shared helper" is still partly enforced socially rather than by types. For websockets, the shared helper only builds an explicit rustls config when a custom CA bundle is configured. When no override env var is set, websocket callers still use their ordinary default connector path. ## Architecture `codex-client::custom_ca` now owns CA bundle selection, PEM normalization, mixed-section parsing, certificate extraction, typed CA-loading errors, and optional rustls client-config construction for websocket TLS. 
The affected consumers now call into that shared helper directly rather than carrying login-local CA behavior: - backend-client - cloud-tasks - RMCP client paths that use `reqwest` - TUI voice HTTP paths - `codex-core` default reqwest client construction - `codex-api` websocket clients for both responses and realtime websocket connections The subprocess CA probe, env-sensitive integration tests, and shared PEM fixtures also live in `codex-client`, which is now the actual owner of the behavior they exercise. ## Observability The shared CA path logs: - which environment variable selected the bundle - which path was loaded - how many certificates were accepted - when `TRUSTED CERTIFICATE` labels were normalized - when CRLs were ignored - where client construction failed Returned errors remain user-facing and include the relevant env var, path, and remediation hint. That same error model now applies whether the failure surfaced while building a reqwest client or websocket TLS configuration. ## Tests Pure unit tests in `codex-client` cover env precedence and PEM normalization behavior. Real client construction remains in subprocess tests so the suite can control process env and avoid the macOS seatbelt panic path that motivated the hermetic test split. The subprocess coverage verifies: - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE` - fallback to `SSL_CERT_FILE` - single-cert and multi-cert bundles - malformed and empty-file errors - OpenSSL `TRUSTED CERTIFICATE` handling - CRL tolerance for well-formed CRL sections The websocket side is covered by the existing `codex-api` / `codex-core` websocket test suites plus the manual mitmproxy validation above. --------- Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com> Co-authored-by: Codex <noreply@openai.com>
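
The environment precedence described in this PR (`CODEX_CA_CERTIFICATE`, then `SSL_CERT_FILE`, then system roots) could be sketched in Python as below. This is an illustration only, not the `codex-client` helper: `select_ca_bundle` is a hypothetical name, and treating an empty value as unset is an assumption.

```python
import os


def select_ca_bundle(env=None):
    """Return (variable_name, path) for the CA bundle to load, or None.

    None means no override is configured and system roots should be used.
    """
    env = os.environ if env is None else env
    for var in ("CODEX_CA_CERTIFICATE", "SSL_CERT_FILE"):
        value = env.get(var)
        if value:  # assumption: an empty string counts as unset
            return var, value
    return None
```

Returning the variable name alongside the path mirrors the observability note above, where logs and errors report which environment variable selected the bundle.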
Josh McKinney ·
2026-03-13 00:59:26 +00:00 -
login: add custom CA support for login flows (#14178)
## Stacked PRs This work is split across three stacked PRs: - #14178: add custom CA support for browser and device-code login flows, docs, and hermetic subprocess tests - #14239: broaden the shared custom CA path from login to other outbound `reqwest` clients across Codex - #14240: extend that shared custom CA handling to secure websocket TLS so websocket connections honor the same CA env vars Review order: #14178, then #14239, then #14240. Supersedes #6864. Thanks to @3axap4eHko for the original implementation and investigation here. Although this version rearranges the code and history significantly, the majority of the credit for this work belongs to them. ## Problem Login flows need to work in enterprise environments where outbound TLS is intercepted by an internal proxy or gateway. In those setups, system root certificates alone are often insufficient to validate the OAuth and device-code endpoints used during login. The change adds a login-specific custom CA loading path, but the important contracts around env precedence, PEM compatibility, test boundaries, and probe-only workarounds need to be explicit so reviewers can understand what behavior is intentional. For users and operators, the behavior is simple: if login needs to trust a custom root CA, set `CODEX_CA_CERTIFICATE` to a PEM file containing one or more certificates. If that variable is unset, login falls back to `SSL_CERT_FILE`. If neither is set, login uses system roots. Invalid or empty PEM files now fail with an error that points back to those environment variables and explains how to recover. ## What This Delivers Users can now make Codex login work behind enterprise TLS interception by pointing `CODEX_CA_CERTIFICATE` at a PEM bundle containing the relevant root certificates. If that variable is unset, login falls back to `SSL_CERT_FILE`, then to system roots. This PR applies that behavior to both browser-based and device-code login flows. 
It also makes login tolerant of the PEM shapes operators actually have in hand: multi-certificate bundles, OpenSSL `TRUSTED CERTIFICATE` labels, and bundles that include well-formed CRLs. ## Mental model `codex-login` is the place where the login flows construct ad hoc outbound HTTP clients. That makes it the right boundary for a narrow CA policy: look for `CODEX_CA_CERTIFICATE`, fall back to `SSL_CERT_FILE`, load every parseable certificate block in that bundle into a `reqwest::Client`, and fail early with a clear user-facing error if the bundle is unreadable or malformed. The implementation is intentionally pragmatic about PEM input shape. It accepts ordinary certificate bundles, multi-certificate bundles, OpenSSL `TRUSTED CERTIFICATE` labels, and bundles that also contain CRLs. It does not validate a certificate chain or prove a handshake; it only constructs the root store used by login. ## Non-goals This change does not introduce a general-purpose transport abstraction for the rest of the product. It does not validate whether the provided bundle forms a real chain, and it does not add handshake-level integration tests against a live TLS server. It also does not change login state management or OAuth semantics beyond ensuring the existing flows share the same CA-loading rules. ## Tradeoffs The main tradeoff is keeping this logic scoped to login-specific client construction rather than lifting it into a broader shared HTTP layer. That keeps the review surface smaller, but it also means future login-adjacent code must continue to use `build_login_http_client()` or it can silently bypass enterprise CA overrides. The `TRUSTED CERTIFICATE` handling is also intentionally a local compatibility shim. The rustls ecosystem does not currently accept that PEM label upstream, so the code normalizes it locally and trims the OpenSSL `X509_AUX` trailer bytes down to the certificate DER that `reqwest` can consume. 
## Architecture `custom_ca.rs` is now the single place that owns login CA behavior. It selects the CA file from the environment, reads it, normalizes PEM label shape where needed, iterates mixed PEM sections with `rustls-pki-types`, ignores CRLs, trims OpenSSL trust metadata when necessary, and returns either a configured `reqwest::Client` or a typed error. The browser login server and the device-code flow both call `build_login_http_client()`, so they share the same trust-store policy. Environment-sensitive tests run through the `login_ca_probe` helper binary because those tests must control process-wide env vars and cannot reliably build a real reqwest client in-process on macOS seatbelt runs. ## Observability The custom CA path logs which environment variable selected the bundle, which file path was loaded, how many certificates were accepted, when `TRUSTED CERTIFICATE` labels were normalized, when CRLs were ignored, and where client construction failed. Returned errors remain user-facing and include the relevant path, env var, and remediation hint. This gives enough signal for three audiences: - users can see why login failed and which env/file caused it - sysadmins can confirm which override actually won - developers can tell whether the failure happened during file read, PEM parsing, certificate registration, or final reqwest client construction ## Tests Pure unit tests stay limited to env precedence and empty-value handling. Real client construction lives in subprocess tests so the suite remains hermetic with respect to process env and macOS sandbox behavior. The subprocess tests verify: - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE` - fallback to `SSL_CERT_FILE` - single-certificate and multi-certificate bundles - malformed and empty-bundle errors - OpenSSL `TRUSTED CERTIFICATE` handling - CRL tolerance for well-formed CRL sections The named PEM fixtures under `login/tests/fixtures/` are shared by the tests so their purpose stays reviewable. 
--------- Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com> Co-authored-by: Codex <noreply@openai.com>
Josh McKinney ·
2026-03-13 00:14:54 +00:00 -
Persist js_repl codex helpers across cells (#14503)
## Summary

This changes `js_repl` so saved references to `codex.tool(...)` and `codex.emitImage(...)` keep working across cells.

Previously, those helpers were recreated per exec and captured that exec's `message.id`. If a persisted object or saved closure reused an old helper in a later cell, the nested tool/image call could fail with `js_repl exec context not found`.

This patch:

- keeps stable `codex.tool` and `codex.emitImage` helper identities in the kernel
- resolves the current exec dynamically at call time using `AsyncLocalStorage`
- adds regression coverage for persisted helper references across cells
- updates the js_repl docs and project-doc instructions to describe the new behavior and its limits

## Why

We already support persistent top-level bindings across `js_repl` cells, so persisted objects should be able to reuse `codex` helpers in later active cells. The bug was that helper identity was exec-scoped, not kernel-scoped.

Using `AsyncLocalStorage` fixes the cross-cell reuse case without falling back to a single global active exec that could accidentally attribute stale background callbacks to the wrong cell.
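
The idea of a stable helper that looks up the active exec at call time can be illustrated with Python's `contextvars`, which plays a role similar to Node's `AsyncLocalStorage`. This is an analogy, not the kernel code: `tool`, `run_cell`, the exec ids, and the error message are all hypothetical.

```python
import asyncio
import contextvars

# Holds the id of the exec (cell) that is currently running.
_current_exec = contextvars.ContextVar("current_exec")


def tool(name):
    """Stable, kernel-scoped helper: resolves the active exec when called."""
    try:
        exec_id = _current_exec.get()
    except LookupError:
        raise RuntimeError("no active exec context") from None
    return f"{exec_id}:{name}"


async def run_cell(exec_id, body):
    # The value set here is visible only inside this task's context.
    _current_exec.set(exec_id)
    return await body()


async def main():
    saved = tool  # a reference persisted by an earlier cell

    async def body():
        await asyncio.sleep(0)  # yield so the two cells interleave
        return saved("shell")

    # gather wraps each coroutine in its own task, and each task runs in
    # a copy of the current context, so the cells do not see each other.
    return await asyncio.gather(run_cell("exec-1", body),
                                run_cell("exec-2", body))


results = asyncio.run(main())
```

The same `saved` reference is attributed to whichever cell calls it, which is the property a single global "active exec" variable would not give once callbacks from different cells interleave.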
Curtis 'Fjord' Hawthorne ·
2026-03-12 15:41:54 -07:00 -
Let models opt into original image detail (#14175)
## Summary

This PR narrows original image detail handling to a single opt-in feature:

- `image_detail_original` lets the model request `detail: "original"` on supported models
- Omitting `detail` preserves the default resized behavior

The model only sees `detail: "original"` guidance when the active model supports it:

- JS REPL instructions include the guidance and examples only on supported models
- `view_image` only exposes a `detail` parameter when the feature and model can use it

The image detail API is intentionally narrow and consistent across both paths:

- `view_image.detail` supports only `"original"`; otherwise omit the field
- `codex.emitImage(..., detail)` supports only `"original"`; otherwise omit the field
- Unsupported explicit values fail clearly at the API boundary instead of being silently reinterpreted
- Unsupported explicit `detail: "original"` requests fall back to normal behavior when the feature is disabled or the model does not support original detail
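
The narrow `detail` contract listed above can be sketched as a single validation function. This Python version is illustrative only; `resolve_image_detail` and its parameters are hypothetical names, and `None` stands in for "field omitted".

```python
def resolve_image_detail(detail, feature_enabled, model_supports_original):
    """Return the detail value to send with an image, or None for the default.

    - omitted detail keeps the default resized behavior
    - any explicit value other than "original" is rejected
    - "original" is honored only when the feature is on and the model supports it
    """
    if detail is None:
        return None
    if detail != "original":
        raise ValueError(f"unsupported image detail: {detail!r}")
    if feature_enabled and model_supports_original:
        return "original"
    # Fall back to normal behavior rather than failing.
    return None
```

Rejecting unknown values while quietly downgrading an unsupported `"original"` request keeps the two failure modes distinct: a caller error versus a capability gap.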
Curtis 'Fjord' Hawthorne ·
2026-03-11 15:25:07 -07:00 -
Add js_repl cwd and homeDir helpers (#14385)
## Summary This PR adds two read-only path helpers to `js_repl`: - `codex.cwd` - `codex.homeDir` They are exposed alongside the existing `codex.tmpDir` helper so the REPL can reference basic host path context without reopening direct `process` access. ## Implementation - expose `codex.cwd` and `codex.homeDir` from the js_repl kernel - make `codex.homeDir` come from the kernel process environment - pass session dependency env through js_repl kernel startup so `codex.homeDir` matches the env a shell-launched process would see - keep existing shell `HOME` population behavior unchanged - update js_repl prompt/docs and add runtime/integration coverage for the new helpers
Curtis 'Fjord' Hawthorne ·
2026-03-11 14:44:44 -07:00 -
Add realtime start instructions config override (#14270)
- add `realtime_start_instructions` config support - thread it into realtime context updates, schema, docs, and tests
Ahmed Ibrahim ·
2026-03-11 12:33:09 -07:00 -
docs: remove auth login logging plan (#13810)
## Summary Remove `docs/auth-login-logging-plan.md`. ## Why The document was a temporary planning artifact. The durable rationale for the auth-login diagnostics work now lives in the code comments, tests, PR context, and existing implementation notes, so keeping the standalone plan doc adds duplicate maintenance surface. ## Testing - not run (docs-only deletion) Co-authored-by: Codex <noreply@openai.com>
Josh McKinney ·
2026-03-06 23:32:53 +00:00 -
feat: add auth login diagnostics (#13797)
## Problem Browser login failures historically leave support with an incomplete picture. HARs can show that the browser completed OAuth and reached the localhost callback, but they do not explain why the native client failed on the final `/oauth/token` exchange. Direct `codex login` also relied mostly on terminal stderr and the browser error page, so even when the login crate emitted better sign-in diagnostics through TUI or app-server flows, the one-shot CLI path still did not leave behind an easy artifact to collect. ## Mental model This implementation treats the browser page, the returned `io::Error`, and the normal structured log as separate surfaces with different safety requirements. The browser page and returned error preserve the detail that operators need to diagnose failures. The structured log stays narrower: it records reviewed lifecycle events, parsed safe fields, and redacted transport errors without becoming a sink for secrets or arbitrary backend bodies. Direct `codex login` now adds a fourth support surface: a small file-backed log at `codex-login.log` under the configured `log_dir`. That artifact carries the same login-target events as the other entrypoints without changing the existing stderr/browser UX. ## Non-goals This does not add auth logging to normal runtime requests, and it does not try to infer precise transport root causes from brittle string matching. The scope remains the browser-login callback flow in the `login` crate plus a direct-CLI wrapper that persists those events to disk. This also does not try to reuse the TUI logging stack wholesale. The TUI path initializes feedback, OpenTelemetry, and other session-oriented layers that are useful for an interactive app but unnecessary for a one-shot login command. ## Tradeoffs The implementation favors fidelity for caller-visible errors and restraint for persistent logs. Parsed JSON token-endpoint errors are logged safely by field. 
Non-JSON token-endpoint bodies remain available to the returned error so CLI and browser surfaces still show backend detail. Transport errors keep their real `reqwest` message, but attached URLs are surgically redacted. Custom issuer URLs are sanitized before logging. On the CLI side, the code intentionally duplicates a narrow slice of the TUI file-logging setup instead of sharing the full initializer. That keeps `codex login` easy to reason about and avoids coupling it to interactive-session layers that the command does not need. ## Architecture The core auth behavior lives in `codex-rs/login/src/server.rs`. The callback path now logs callback receipt, callback validation, token-exchange start, token-exchange success, token-endpoint non-2xx responses, and transport failures. App-server consumers still use this same login-server path via `run_login_server(...)`, so the same instrumentation benefits TUI, Electron, and VS Code extension flows. The direct CLI path in `codex-rs/cli/src/login.rs` now installs a small file-backed tracing layer for login commands only. That writes `codex-login.log` under `log_dir` with login-specific targets such as `codex_cli::login` and `codex_login::server`. ## Observability The main signals come from the `login` crate target and are intentionally scoped to sign-in. Structured logs include redacted issuer URLs, redacted transport errors, HTTP status, and parsed token-endpoint fields when available. The callback-layer log intentionally avoids `%err` on token-endpoint failures so arbitrary backend bodies do not get copied into the normal log file. Direct `codex login` now leaves a durable artifact for both failure and success cases. 
Example output from the new file-backed CLI path: Failing callback: ```text 2026-03-06T22:08:54.143612Z INFO codex_cli::login: starting browser login flow 2026-03-06T22:09:03.431699Z INFO codex_login::server: received login callback path=/auth/callback has_code=false has_state=true has_error=true state_valid=true 2026-03-06T22:09:03.431745Z WARN codex_login::server: oauth callback returned error error_code="access_denied" has_error_description=true ``` Succeeded callback and token exchange: ```text 2026-03-06T22:09:14.065559Z INFO codex_cli::login: starting browser login flow 2026-03-06T22:09:36.431678Z INFO codex_login::server: received login callback path=/auth/callback has_code=true has_state=true has_error=false state_valid=true 2026-03-06T22:09:36.436977Z INFO codex_login::server: starting oauth token exchange issuer=https://auth.openai.com/ redirect_uri=http://localhost:1455/auth/callback 2026-03-06T22:09:36.685438Z INFO codex_login::server: oauth token exchange succeeded status=200 OK ``` ## Tests - `cargo test -p codex-login` - `cargo clippy -p codex-login --tests -- -D warnings` - `cargo test -p codex-cli` - `just bazel-lock-update` - `just bazel-lock-check` - manual direct `codex login` smoke tests for both a failing callback and a successful browser login --------- Co-authored-by: Codex <noreply@openai.com>
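To illustrate the kind of URL redaction described under Tradeoffs, here is a minimal JavaScript sketch. It is not the `login` crate's Rust implementation, and the choice of what to keep (scheme, host, port, path) is an assumption; `redactUrlForLog` is a hypothetical name.

```javascript
// Hypothetical sanitizer: keep scheme, host, port, and path; drop everything else.
function redactUrlForLog(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return "<invalid url>"; // never echo unparseable input into the log
  }
  url.username = "";
  url.password = "";
  url.search = ""; // query strings may carry codes or tokens
  url.hash = "";
  return url.toString();
}

console.log(redactUrlForLog("http://localhost:1455/auth/callback?code=abc&state=xyz"));
// prints "http://localhost:1455/auth/callback"
```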
Josh McKinney ·
2026-03-06 15:00:37 -08:00 -
Clarify js_repl image emission and encoding guidance (#13639)
## Summary This updates the `js_repl` prompt and docs to make the image guidance less confusing. ## What changed - Clarified that `codex.emitImage(...)` adds one image per call and can be called multiple times to emit multiple images. - Reworded the image-encoding guidance to be general `js_repl` advice instead of `ImageDetailOriginal`-specific behavior. - Updated the guidance to recommend JPEG at about quality 85 when lossy compression is acceptable, and PNG when transparency or lossless detail matters. - Mirrored the same wording in the public `js_repl` docs.
Curtis 'Fjord' Hawthorne ·
2026-03-05 16:02:37 -08:00 -
Harden js_repl emitImage to accept only data: URLs (#13507)
### Motivation - Prevent untrusted js_repl code from supplying arbitrary external URLs that the host would forward into model input and cause external fetches / data exfiltration. This change narrows the emitImage contract to safe, self-contained data URLs. ### Description - Kernel: added `normalizeEmitImageUrl` and enforce that string-valued `codex.emitImage(...)` inputs and `input_image`/content-item paths only accept non-empty `data:` URLs; byte-based paths still produce data URLs as before (`kernel.js`). - Host: added `validate_emitted_image_url` and check `EmitImage` requests before creating `FunctionCallOutputContentItem::InputImage`, returning an error to the kernel if the URL is not a `data:` URL (`mod.rs`). - Tests/docs: added a runtime test `js_repl_emit_image_rejects_non_data_url` to assert rejection of non-data URLs and updated user-facing docs/instruction text to state `data URL` support instead of generic direct image URLs (`mod.rs`, `docs/js_repl.md`, `project_doc.rs`). ### Testing - Ran `just fmt` in `codex-rs`; it completed successfully. - Added a runtime test (`cargo test -p codex-core js_repl_emit_image_rejects_non_data_url`) but executing the test in this environment failed due to a missing system dependency required by `codex-linux-sandbox` (the vendored `bubblewrap` build requires `libcap.pc` via `pkg-config`), so the test could not be run here. - Attempted a focused `cargo test` invocation with and without default features; both compile/test attempts were blocked by the same missing system `libcap` dependency in this environment. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)
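A rough sketch of the narrowed contract: string inputs must already be `data:` URLs, while byte inputs are converted into one. The function names and error messages below are hypothetical; the real `normalizeEmitImageUrl` in `kernel.js` may behave differently.

```javascript
// Hypothetical check for string-valued inputs: only non-empty data: URLs pass.
function checkEmitImageUrl(value) {
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error("image URL must be a non-empty string");
  }
  const url = value.trim();
  if (!url.toLowerCase().startsWith("data:")) {
    throw new Error("codex.emitImage only accepts data: URLs");
  }
  return url;
}

// Hypothetical byte path: `{ bytes, mimeType }` becomes a self-contained data URL.
function toDataUrl({ bytes, mimeType }) {
  if (!(bytes instanceof Uint8Array) || bytes.length === 0) {
    throw new Error("bytes must be a non-empty Uint8Array");
  }
  if (typeof mimeType !== "string" || !mimeType.startsWith("image/")) {
    throw new Error("mimeType must be an image/* type");
  }
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}

console.log(toDataUrl({ bytes: Uint8Array.from([1, 2, 3]), mimeType: "image/png" }));
// prints "data:image/png;base64,AQID"
```

Validating in both the kernel and the host, as the PR does, means a modified kernel still cannot smuggle an external URL into model input.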
Curtis 'Fjord' Hawthorne ·
2026-03-05 12:12:32 -08:00 -
Persist initialized js_repl bindings after failed cells (#13482)
## Summary
- Change `js_repl` failed-cell persistence so later cells keep prior bindings plus only the current-cell bindings whose initialization definitely completed before the throw.
- Preserve initialized lexical bindings across failed cells via module-namespace readability, including top-level destructuring that partially succeeds before a later throw.
- Preserve hoisted `var` and `function` bindings only when execution clearly reached their declaration site, and preserve direct top-level pre-declaration `var` writes and updates through explicit write-site markers.
- Preserve top-level `for...in` / `for...of` `var` bindings when the loop body executes at least once, using a first-iteration guard to avoid per-iteration bookkeeping overhead.
- Keep prior module state intact across link-time failures and evaluation failures before the prelude runs, while still allowing failed cells that already recreated prior bindings to persist updates to those existing bindings.
- Hide internal commit hooks from user `js_repl` code after the prelude aliases them, so snippets cannot spoof committed bindings by calling the raw `import.meta` hooks directly.
- Add focused regression coverage for the supported failed-cell behaviors and the intentionally unsupported boundaries.
- Update `js_repl` docs and generated instructions to describe the new, narrower failed-cell persistence model.

## Motivation
We saw `js_repl` drop bindings that had already been initialized successfully when a later statement in the same cell threw, for example:

    const { context: liveContext, session } = await initializeGoogleSheetsLiveForTab(tab);
    // later statement throws

That was surprising in practice because successful earlier work disappeared from the next cell. This change makes failed-cell persistence more useful without trying to model every possible partially executed JavaScript edge case.

The resulting behavior is narrower and easier to reason about:
- prior bindings are always preserved
- lexical bindings persist when their initialization completed before the throw
- hoisted `var` / `function` bindings persist only when execution clearly reached their declaration or a supported top-level `var` write site
- failed cells that already recreated prior bindings can persist writes to those existing bindings even if they introduce no new bindings

The detailed edge-case matrix stays in `docs/js_repl.md`. The model-facing `project_doc` guidance is intentionally shorter and focused on generation-relevant behavior.

## Supported Failed-Cell Behavior
- Prior bindings remain available after a failed cell.
- Initialized lexical bindings remain available after a failed cell.
- Top-level destructuring like `const { a, b } = ...` preserves names whose initialization completed before a later throw.
- Hoisted `function` bindings persist when execution reached the declaration statement before the throw.
- Direct top-level pre-declaration `var` writes and updates persist, for example:
  - `x = 1`
  - `x += 1`
  - `x++`
  - short-circuiting logical assignments only persist when the write branch actually runs
- Non-empty top-level `for...in` / `for...of` `var` loops persist their loop bindings.
- Failed cells can persist updates to existing carried bindings after the prelude has run, even when the cell commits no new bindings.
- Link failures and eval failures before the prelude do not poison `@prev`.

## Intentionally Unsupported Failed-Cell Cases
- Hoisted function reads before the declaration, such as `foo(); ...; function foo() {}`
- Aliasing or inference-based recovery from reads before declaration
- Nested writes inside already-instrumented assignment RHS expressions
- Destructuring-assignment recovery for hoisted `var`
- Partial `var` destructuring recovery
- Pre-declaration `undefined` reads for hoisted `var`
- Empty top-level `for...in` / `for...of` loop vars
- Nested or scope-sensitive pre-declaration `var` writes outside direct top-level expression statements

Curtis 'Fjord' Hawthorne ·
2026-03-05 11:01:46 -08:00 -
[js_repl] Support local ESM file imports (#13437)
## Summary
- add `js_repl` support for dynamic imports of relative and absolute local ESM `.js` / `.mjs` files
- keep bare package imports on the native Node path and resolved from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`), even when they originate from imported local files
- restrict static imports inside imported local files to other local relative/absolute `.js` / `.mjs` files, and surface a clear error for unsupported top-level static imports in the REPL cell
- run imported local files inside the REPL VM context so they can access `codex.tmpDir`, `codex.tool`, captured `console`, and Node-like `import.meta` helpers
- reload local files between execs so later `await import("./file.js")` calls pick up edits and fixed failures, while preserving package/builtin caching and persistent top-level REPL bindings
- make `import.meta.resolve()` self-consistent by allowing the returned `file://...` URLs to round-trip through `await import(...)`
- update both public and injected `js_repl` docs to clarify the narrowed contract, including global bare-import resolution behavior for local absolute files

## Testing
- `cargo test -p codex-core js_repl_`
- built codex binary and verified behavior

---------

Co-authored-by: Codex <noreply@openai.com>

aaronl-openai ·
2026-03-04 22:40:31 -08:00 -
Make js_repl image output controllable (#13331)
## Summary
Instead of always adding inner function call outputs to the model context, let js code decide which ones to return.
- Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the outer `js_repl` function output.
- Keep `codex.tool(...)` return values unchanged as structured JS objects.
- Add `codex.emitImage(...)` as the explicit path for attaching an image to the outer `js_repl` function output.
- Support emitting from a direct image URL, a single `input_image` item, an explicit `{ bytes, mimeType }` object, or a raw tool response object containing exactly one image.
- Preserve existing `view_image` original-resolution behavior when JS emits the raw `view_image` tool result.
- Suppress the special `ViewImageToolCall` event for `js_repl`-sourced `view_image` calls so nested inspection stays side-effect free until JS explicitly emits.
- Update the `js_repl` docs and generated project instructions with both recommended patterns:
  - `await codex.emitImage(codex.tool("view_image", { path }))`
  - `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg", quality: 85 }), mimeType: "image/jpeg" })`

#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/13050
- 👉 `2` https://github.com/openai/codex/pull/13331
- ⏳ `3` https://github.com/openai/codex/pull/13049

Curtis 'Fjord' Hawthorne ·
2026-03-03 16:25:59 -08:00 -
tui: preserve kill buffer across submit and slash-command clears (#12006)
## Problem Before this change, composer paths that cleared the textarea after submit or slash-command dispatch also cleared the textarea kill buffer. That meant a user could `Ctrl+K` part of a draft, trigger a composer action that cleared the visible draft, and then lose the ability to `Ctrl+Y` the killed text back. This was especially awkward for workflows where the user wants to temporarily remove text, run a composer action such as changing reasoning level or dispatching a slash command, and then restore the killed text into the now-empty draft. ## Mental model This change separates visible draft state from editing-history state. The visible draft includes the current textarea contents and text elements that should be cleared when the composer submits or dispatches a command. The kill buffer is different: it represents the most recent killed text and should survive those composer-driven clears so the user can still yank it back afterward. After this change, submit and slash-command dispatch still clear the visible textarea contents, but they no longer erase the most recent kill. ## Non-goals This does not implement a multi-entry kill ring or change the semantics of `Ctrl+K` and `Ctrl+Y` beyond preserving the existing yank target across these clears. It also does not change how submit, slash-command parsing, prompt expansion, or attachment handling work, except that those flows no longer discard the textarea kill buffer as a side effect of clearing the draft. ## Tradeoffs The main tradeoff is that clearing the visible textarea is no longer equivalent to fully resetting all editing state. That is intentional here, because submit and slash-command dispatch are composer actions, not requests to forget the user's most recent kill. The benefit is better editing continuity. The cost is that callers must understand that full-buffer replacement resets visible draft state but not the kill buffer. 
## Architecture The behavioral change is in `TextArea`: full-buffer replacement now rebuilds text and elements without clearing `kill_buffer`. `ChatComposer` already clears the textarea after successful submit and slash-command dispatch by calling into those textarea replacement paths. With this change, those existing composer flows inherit the new behavior automatically: the visible draft is cleared, but the last killed text remains available for `Ctrl+Y`. The tests cover both layers: - `TextArea` verifies that the kill buffer survives full-buffer replacement. - `ChatComposer` verifies that it survives submit. - `ChatComposer` also verifies that it survives slash-command dispatch. ## Observability There is no dedicated logging for kill-buffer preservation. The most direct way to reason about the behavior is to inspect textarea-wide replacement paths and confirm whether they treat the kill buffer as visible-buffer state or as editing-history state. If this regresses in the future, the likely failure mode is simple and user-visible: `Ctrl+Y` stops restoring text after submit or slash-command clears even though ordinary kill/yank still works within a single uninterrupted draft. ## Tests Added focused regression coverage for the new contract: - `kill_buffer_persists_across_set_text` - `kill_buffer_persists_after_submit` - `kill_buffer_persists_after_slash_command_dispatch` Local verification: - `just fmt` - `cargo test -p codex-tui` --------- Co-authored-by: Josh McKinney <joshka@openai.com>
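The contract can be modeled in a few lines. The sketch below is a simplified JavaScript illustration, not the Rust `TextArea`: the class, its fields, and `killToEnd` are hypothetical, and the kill runs to the end of the buffer rather than the end of the line.

```javascript
// Hypothetical, simplified model of a textarea with a single-entry kill buffer.
class TextArea {
  constructor() {
    this.text = "";
    this.cursor = 0;
    this.killBuffer = "";
  }

  // Full-buffer replacement: resets the visible draft, leaves the kill buffer alone.
  setText(text) {
    this.text = text;
    this.cursor = text.length;
  }

  // Ctrl+K: kill from the cursor to the end of the buffer.
  killToEnd() {
    const killed = this.text.slice(this.cursor);
    if (killed !== "") {
      this.killBuffer = killed;
      this.text = this.text.slice(0, this.cursor);
    }
  }

  // Ctrl+Y: insert the most recent kill at the cursor.
  yank() {
    this.text =
      this.text.slice(0, this.cursor) + this.killBuffer + this.text.slice(this.cursor);
    this.cursor += this.killBuffer.length;
  }
}

const ta = new TextArea();
ta.setText("hello world");
ta.cursor = 5;
ta.killToEnd(); // visible text is now "hello"
ta.setText(""); // submit or slash-command dispatch clears the draft
ta.yank(); // the killed text is still available
console.log(JSON.stringify(ta.text)); // prints " world"
```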
rakan-oai ·
2026-03-03 02:06:08 +00:00 -
notify: include client in legacy hook payload (#12968)
## Why The `notify` hook payload did not identify which Codex client started the turn. That meant downstream notification hooks could not distinguish between completions coming from the TUI and completions coming from app-server clients such as VS Code or Xcode. Now that the Codex App provides its own desktop notifications, it would be nice to be able to filter those out. This change adds that context without changing the existing payload shape for callers that do not know the client name, and keeps the new end-to-end test cross-platform. ## What changed - added an optional top-level `client` field to the legacy `notify` JSON payload - threaded that value through `core` and `hooks`; the internal session and turn state now carries it as `app_server_client_name` - set the field to `codex-tui` for TUI turns - captured `initialize.clientInfo.name` in the app server and applied it to subsequent turns before dispatching hooks - replaced the notify integration test hook with a `python3` script so the test does not rely on Unix shell permissions or `bash` - documented the new field in `docs/config.md` ## Testing - `cargo test -p codex-hooks` - `cargo test -p codex-tui` - `cargo test -p codex-app-server suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name -- --exact --nocapture` - `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs` still has unrelated existing failures in this environment) ## Docs The public config reference on `developers.openai.com/codex` should mention that the legacy `notify` payload may include a top-level `client` field. The TUI reports `codex-tui`, and the app server reports `initialize.clientInfo.name` when it is available.
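A notification hook could use the new field along these lines. This is a sketch that assumes the hook already has the JSON payload as a string; only the optional top-level `client` field and the `codex-tui` value come from the description above, and the suppressed client name is a made-up placeholder.

```javascript
// Hypothetical list of clients that already show their own notifications.
const SUPPRESSED_CLIENTS = new Set(["example-desktop-app"]); // placeholder name

// Decide whether this hook should raise a notification for the payload.
function shouldNotify(payloadJson) {
  const payload = JSON.parse(payloadJson);
  // `client` is optional: callers that do not know the client name omit it.
  const client = typeof payload.client === "string" ? payload.client : null;
  return client === null || !SUPPRESSED_CLIENTS.has(client);
}

console.log(shouldNotify('{"client":"codex-tui"}')); // prints true
```

Treating a missing `client` as "notify" keeps the hook compatible with payloads from callers that predate the field.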
Michael Bolin ·
2026-02-26 22:27:34 -08:00 -
Log js_repl nested tool responses in rollout history (#12837)
## Summary - add tracing-based diagnostics for nested `codex.tool(...)` calls made from `js_repl` - emit a bounded, sanitized summary at `info!` - emit the exact raw serialized response object or error string seen by JavaScript at `trace!` - document how to enable these logs and where to find them, especially for `codex app-server` ## Why Nested `codex.tool(...)` calls inside `js_repl` are a debugging boundary: JavaScript sees the tool result, but that result is otherwise hard to inspect from outside the kernel. This change adds explicit tracing for that path using the repo’s normal observability pattern: - `info` for compact summaries - `trace` for exact raw payloads when deep debugging is needed ## What changed - `js_repl` now summarizes nested tool-call results across the response shapes it can receive: - message content - function-call outputs - custom tool outputs - MCP tool results and MCP error results - direct error strings - each nested `codex.tool(...)` completion logs: - `exec_id` - `tool_call_id` - `tool_name` - `ok` - a bounded summary struct describing the payload shape - at `trace`, the same path also logs the exact serialized response object or error string that JavaScript received - docs now include concrete logging examples for `codex app-server` - unit coverage was added for multimodal function output summaries and error summaries ## How to use it ### Summary-only logging Set: ```sh RUST_LOG=codex_core::tools::js_repl=info ``` For `codex app-server`, tracing output is written to the server process `stderr`. Example: ```sh RUST_LOG=codex_core::tools::js_repl=info \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` This emits bounded summary lines for nested `codex.tool(...)` calls. 
### Full raw debugging Set: ```sh RUST_LOG=codex_core::tools::js_repl=trace ``` Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` At `trace`, you get: - the same `info` summary line - a `trace` line with the exact serialized response object seen by JavaScript - or the exact error string if the nested tool call failed ### Where the logs go For `codex app-server`, these logs go to process `stderr`, so redirect or capture `stderr` to inspect them. Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ /Users/fjord/code/codex/codex-rs/target/debug/codex app-server \ 2> /tmp/codex-app-server.log ``` Then inspect: ```sh rg "js_repl nested tool call" /tmp/codex-app-server.log ``` Without an explicit `RUST_LOG` override, these `js_repl` nested tool-call logs are typically not visible.
Curtis 'Fjord' Hawthorne ·
2026-02-26 10:12:28 -08:00 -
Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
## Summary - Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite. - Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call. - Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec. ## Why Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring. ## Demo (progress bar) ``` ./codex-rs/target/debug/codex exec \ --enable collab \ --enable sqlite \ --full-auto \ --progress-cursor \ -c agents.max_threads=16 \ -C /Users/daveaitel/code/codex \ - <<'PROMPT' Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows: path = item-01..item-30, area = test. Then call spawn_agents_on_csv with: - csv_path: /tmp/agent_job_progress_demo.csv - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1." - output_csv_path: /tmp/agent_job_progress_demo_out.csv PROMPT ``` ## Review feedback addressed - Auto-start jobs on spawn; removed run/resume/status/export tools. - Auto-export on success. - More descriptive tool spec + clearer prompts. - Avoid deadlocks on spawn failure; pending/running handled safely. - Progress bar no longer scrolls; stable single-line redraw. ## Tests - `cd codex-rs && cargo test -p codex-exec` - `cd codex-rs && cargo build -p codex-cli`
daveaitel-openai ·
2026-02-24 21:00:19 +00:00 -
feat: discourage the use of the --all-features flag (#12429)
## Why Developers are frequently running low on disk space, and routine use of `--all-features` contributes to larger Cargo build caches in `target/` by compiling additional feature combinations. This change updates local workflow guidance to avoid `--all-features` by default and reserve it for cases where full feature coverage is specifically needed. ## What Changed - Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test` / `just test` for full-suite local runs, and to call out the disk-usage cost of routine `--all-features` usage. - Updated the root `justfile` so `just fix` and `just clippy` no longer pass `--all-features` by default. - Updated `docs/install.md` to explicitly describe `cargo test --all-features` as an optional heavier-weight run (more build time and `target/` disk usage). ## Verification - Confirmed the `justfile` parses and the recipes list successfully with `just --list`.
Michael Bolin ·
2026-02-20 23:02:24 -08:00 -
Improve Plan mode reasoning selection flow (#12303)
Addresses https://github.com/openai/codex/issues/11013 ## Summary - add a Plan implementation path in the TUI that lets users choose reasoning before switching to Default mode and implementing - add Plan-mode reasoning scope handling (Plan-only override vs all-modes default), including config/schema/docs plumbing for `plan_mode_reasoning_effort` - remove the hardcoded Plan preset medium default and make the reasoning popup reflect the active Plan override as `(current)` - split the collaboration-mode switch notification UI hint into #12307 to keep this diff focused If I have `plan_mode_reasoning_effort = "medium"` set in my `config.toml`: <img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM" src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369" /> If I don't have `plan_mode_reasoning_effort` set in my `config.toml`: <img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM" src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d" /> ## Codex author `codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
Charley Cunningham ·
2026-02-20 20:08:56 -08:00 -
docs: use --locked when installing cargo-nextest (#12377)
## What Updates the optional `cargo-nextest` install command in `docs/install.md`: - `cargo install cargo-nextest` -> `cargo install --locked cargo-nextest` ## Why The current docs command can fail during source install because recent `cargo-nextest` releases intentionally require `--locked`. Repro (macOS, but likely not platform-specific): - `cargo install cargo-nextest` - Fails with a compile error from `locked-tripwire` indicating: - `Nextest does not support being installed without --locked` - suggests `cargo install --locked cargo-nextest` Using the locked command succeeds: - `cargo install --locked cargo-nextest` ## How Single-line docs change in `docs/install.md` to match current `cargo-nextest` install requirements. ## Validation - Reproduced failure locally using a temporary `CARGO_HOME` directory (clean Cargo home) - Example command used: `CARGO_HOME=/tmp/cargo-home-test cargo install cargo-nextest` - Confirmed success with `cargo install --locked cargo-nextest`
derekf-oai ·
2026-02-20 14:12:13 -08:00 -
js_repl: remove codex.state helper references (#12275)
## Summary This PR removes `codex.state` from the `js_repl` helper surface and removes all corresponding documentation/instruction references. ## Motivation Top-level bindings in `js_repl` now persist across cells, so the extra `codex.state` helper is redundant and adds unnecessary API/docs surface. ## Changes - Removed the long-lived `state` object from the Node kernel helper wiring. - Stopped exposing `codex.state` (and `context.state`) during `js_repl` execution. - Updated user-facing `js_repl` docs to remove `codex.state`. - Updated generated instruction text and related test expectations to list only: - `codex.tmpDir` - `codex.tool(name, args?)` #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12300 - 👉 `2` https://github.com/openai/codex/pull/12275 - ⏳ `3` https://github.com/openai/codex/pull/12205 - ⏳ `4` https://github.com/openai/codex/pull/12185 - ⏳ `5` https://github.com/openai/codex/pull/10673
Curtis 'Fjord' Hawthorne ·
2026-02-20 11:20:45 -08:00 -
[js_repl] paths for node module resolution can be specified for js_repl (#11944)
In `js_repl` mode, module resolution currently starts from `js_repl_kernel.js`, which is written to a per-kernel temp dir. This effectively means that bare imports will not resolve.

This PR adds a new config option, `js_repl_node_module_dirs`, which is a list of dirs that are used (in order) to resolve a bare import. If none of those work, the current working directory of the thread is used. For example:

```toml
js_repl_node_module_dirs = [
  "/path/to/node_modules/",
  "/other/path/to/node_modules/",
]
```

aaronl-openai ·
2026-02-17 23:29:49 -08:00 -
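The lookup order that `js_repl_node_module_dirs` (#11944) describes can be sketched roughly as below. This is a deliberately simplified model, not the kernel's resolver: `findPackageDir` is a hypothetical name, a package counts as found when `<dir>/<name>/package.json` exists, and the `cwd` fallback is assumed to mean `<cwd>/node_modules`.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical lookup: configured dirs are tried in order, then the cwd fallback.
function findPackageDir(name, moduleDirs, cwd) {
  const roots = [...moduleDirs, path.join(cwd, "node_modules")];
  for (const root of roots) {
    const candidate = path.join(root, name);
    if (fs.existsSync(path.join(candidate, "package.json"))) {
      return candidate; // first match wins
    }
  }
  return null; // the bare import cannot be resolved
}
```

For instance, `findPackageDir("lodash", ["/path/to/node_modules/"], process.cwd())` would check the configured directory first and only then look under the current working directory.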
tui: preserve remote image attachments across resume/backtrack (#10590)
## Summary This PR makes app-server-provided image URLs first-class attachments in TUI, so they survive resume/backtrack/history recall and are resubmitted correctly. <img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM" src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927" /> Can delete the attached image upon backtracking: <img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM" src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27" /> In both history and composer, remote images are rendered as normal `[Image #N]` placeholders, with numbering unified with local images. ## What changed - Plumb remote image URLs through TUI message state: - `UserHistoryCell` - `BacktrackSelection` - `ChatComposerHistory::HistoryEntry` - `ChatWidget::UserMessage` - Show remote images as placeholder rows inside the composer box (above textarea), and in history cells. - Support keyboard selection/deletion for remote image rows in composer (`Up`/`Down`, `Delete`/`Backspace`). - Preserve remote-image-only turns in local composer history (Up/Down recall), including restore after backtrack. - Ensure submit/queue/backtrack resubmit include remote images in model input (`UserInput::Image`), and keep request shape stable for remote-image-only turns. - Keep image numbering contiguous across remote + local images: - remote images occupy `[Image #1]..[Image #M]` - local images start at `[Image #M+1]` - deletion renumbers consistently. - In protocol conversion, increment shared image index for remote images too, so mixed remote/local image tags stay in a single sequence. - Simplify restore logic to trust in-memory attachment order (no placeholder-number parsing path). - Backtrack/replay rollback handling now queues trims through `AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred lines after trims, so overlay/transcript state stays consistent. 
- Trim trailing blank rendered lines from user history rendering to avoid oversized blank padding. ## Docs + tests - Updated: `docs/tui-chat-composer.md` (remote image flow, selection/deletion, numbering offsets) - Added/updated tests across `tui/src/chatwidget/tests.rs`, `tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`, and `tui/src/bottom_pane/chat_composer.rs` - Added snapshot coverage for remote image composer states, including deleting the first of two remote images. ## Validation - `just fmt` - `cargo test -p codex-tui` ## Codex author `codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
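The contiguous numbering rule described above (remote images first, local images continuing from `M+1`, deletion renumbering everything) can be sketched as follows. This is a minimal illustration, not the TUI's actual code: the `Attachment` enum and `number_attachments` function are hypothetical names introduced here, and labels are simply recomputed from the current attachment order.

```rust
/// Hypothetical attachment type for illustration; the real TUI types are not shown in the PR text.
#[derive(Debug, Clone, PartialEq)]
enum Attachment {
    /// Image referenced by URL (provided by the app server).
    Remote(String),
    /// Image referenced by a local file path.
    Local(String),
}

/// Assigns `[Image #N]` labels in a single sequence: remote images take
/// `#1..#M`, local images start at `#M+1`. Because labels are derived from
/// position, calling this again after a deletion renumbers consistently.
fn number_attachments(remote_urls: &[String], local_paths: &[String]) -> Vec<(String, Attachment)> {
    let remote = remote_urls.iter().cloned().map(Attachment::Remote);
    let local = local_paths.iter().cloned().map(Attachment::Local);
    remote
        .chain(local)
        .enumerate()
        .map(|(i, attachment)| (format!("[Image #{}]", i + 1), attachment))
        .collect()
}
```

Deriving labels from the in-memory order, rather than parsing numbers back out of placeholder text, matches the PR's stated choice to trust attachment order during restore.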
Charley Cunningham ·
2026-02-13 14:54:06 -08:00 -
Add js_repl_tools_only model and routing restrictions (#10671)
#### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/10674 - ✅ `2` https://github.com/openai/codex/pull/10672 - 👉 `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670
Curtis 'Fjord' Hawthorne ·
2026-02-12 15:41:05 -08:00 -
Add js_repl host helpers and exec end events (#10672)
## Summary This PR adds host-integrated helper APIs for `js_repl` and updates model guidance so the agent can use them reliably. ### What’s included - Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call normal Codex tools. - Keep persistent JS state and scratch-path helpers available: - `codex.state` - `codex.tmpDir` - Wire `js_repl` tool calls through the standard tool router path. - Add/align `js_repl` execution completion/end event behavior with existing tool logging patterns. - Update dynamic prompt injection (`project_doc`) to document: - how to call `codex.tool(...)` - raw output behavior - image flow via `view_image` (`codex.tmpDir` + `codex.tool("view_image", ...)`) - stdio safety guidance (`console.log` / `codex.tool`, avoid direct `process.std*`) ## Why - Standardize JS-side tool usage on `codex.tool(...)` - Make `js_repl` behavior more consistent with existing tool execution and event/logging patterns. - Give the model enough runtime guidance to use `js_repl` safely and effectively. ## Testing - Added/updated unit and runtime tests for: - `codex.tool` calls from `js_repl` (including shell/MCP paths) - image handoff flow via `view_image` - prompt-injection text for `js_repl` guidance - execution/end event behavior and related regression coverage #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/10674 - 👉 `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670
Curtis 'Fjord' Hawthorne ·
2026-02-12 12:10:25 -08:00 -
Add feature-gated freeform js_repl core runtime (#10674)
## Summary This PR adds an **experimental, feature-gated `js_repl` core runtime** so models can execute JavaScript in a persistent REPL context across tool calls. The implementation integrates with existing feature gating, tool registration, prompt composition, config/schema docs, and tests. ## What changed - Added new experimental feature flag: `features.js_repl`. - Added freeform `js_repl` tool and companion `js_repl_reset` tool. - Gated tool availability behind `Feature::JsRepl`. - Added conditional prompt-section injection for JS REPL instructions via marker-based prompt processing. - Implemented JS REPL handlers, including freeform parsing and pragma support (timeout/reset controls). - Added runtime resolution order for Node: 1. `CODEX_JS_REPL_NODE_PATH` 2. `js_repl_node_path` in config 3. `PATH` - Added JS runtime assets/version files and updated docs/schema. ## Why This enables richer agent workflows that require incremental JavaScript execution with preserved state, while keeping rollout safe behind an explicit feature flag. ## Testing Coverage includes: - Feature-flag gating behavior for tool exposure. - Freeform parser/pragma handling edge cases. - Runtime behavior (state persistence across calls and top-level `await` support). ## Usage ```toml [features] js_repl = true ``` Optional runtime override: - `CODEX_JS_REPL_NODE_PATH`, or - `js_repl_node_path` in config. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/10674 - ⏳ `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670
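The Node resolution order listed above (environment variable, then config, then `PATH`) can be sketched as a small precedence function. This is an illustrative sketch only: `NodeSource` and `resolve_node` are hypothetical names, and treating blank values as unset is an assumption not stated in the PR.

```rust
use std::path::PathBuf;

/// Where the Node binary for the JS REPL was resolved from (hypothetical type).
#[derive(Debug, PartialEq)]
enum NodeSource {
    /// Value of the `CODEX_JS_REPL_NODE_PATH` environment variable.
    EnvOverride(PathBuf),
    /// Value of `js_repl_node_path` from config.
    Config(PathBuf),
    /// No explicit override; fall back to searching `PATH` for `node`.
    PathLookup,
}

/// Applies the documented precedence: env var, then config, then `PATH`.
/// Assumption: empty or whitespace-only values count as "not set".
fn resolve_node(env_value: Option<String>, config_value: Option<String>) -> NodeSource {
    let non_empty =
        |v: Option<String>| v.filter(|s| !s.trim().is_empty()).map(PathBuf::from);

    if let Some(path) = non_empty(env_value) {
        return NodeSource::EnvOverride(path);
    }
    if let Some(path) = non_empty(config_value) {
        return NodeSource::Config(path);
    }
    NodeSource::PathLookup
}
```

A caller would pass `std::env::var("CODEX_JS_REPL_NODE_PATH").ok()` as the first argument and the parsed config value as the second; keeping the function pure makes the precedence easy to test without touching the process environment.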
Curtis 'Fjord' Hawthorne ·
2026-02-11 12:05:02 -08:00 -
tui: keep history recall cursor at line end (#11295)
## Summary - keep cursor at end-of-line after Up/Down history recall - allow continued history navigation when recalled text cursor is at start or end boundary - add regression tests and document the history cursor contract in composer docs ## Testing - just fmt - cargo test -p codex-tui --lib history_navigation_leaves_cursor_at_end_of_line - cargo test -p codex-tui --lib should_handle_navigation_when_cursor_is_at_line_boundaries - cargo test -p codex-tui *(fails in existing integration test `suite::no_panic_on_startup::malformed_rules_should_not_panic` because `target/debug/codex` is not present in this environment)*
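The cursor contract described above can be illustrated with a small sketch. The `Composer` struct and its methods are hypothetical stand-ins for the real composer state, and the cursor is modeled as a byte offset into a single-line string for simplicity.

```rust
/// Hypothetical, simplified composer state: text plus a byte-offset cursor.
struct Composer {
    text: String,
    cursor: usize,
}

impl Composer {
    /// Replaces the composer text with a recalled history entry and leaves
    /// the cursor at the end of the line, as the PR describes.
    fn recall(&mut self, entry: &str) {
        self.text = entry.to_string();
        self.cursor = self.text.len();
    }

    /// Up/Down keeps navigating history only when the cursor sits at a
    /// boundary (start or end) or the composer is empty; elsewhere the keys
    /// are assumed to move the cursor within the text instead.
    fn should_handle_history_navigation(&self) -> bool {
        self.text.is_empty() || self.cursor == 0 || self.cursor == self.text.len()
    }
}
```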
Josh McKinney ·
2026-02-10 17:21:46 +00:00 -
fix(tui): tab submits when no task running in steer mode (#10035)
When steer mode is enabled, Tab used to only queue while a task was running and otherwise did nothing. Treat Tab as an immediate submit when no task is running so input isn't dropped when the inflight turn ends mid-typing. Adds a regression test and updates docs/tooltips.
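The fixed Tab behavior in steer mode reduces to a two-way decision, sketched below. `TabAction` and `steer_mode_tab_action` are hypothetical names; behavior outside steer mode is not described in the commit and is deliberately left out.

```rust
/// What Tab does while steer mode is enabled (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum TabAction {
    /// A task is running: queue the input for later.
    Queue,
    /// No task is running: submit immediately so the input is not dropped.
    Submit,
}

/// Decides the Tab action in steer mode based on whether a task is in flight.
fn steer_mode_tab_action(task_running: bool) -> TabAction {
    if task_running {
        TabAction::Queue
    } else {
        TabAction::Submit
    }
}
```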
Josh McKinney ·
2026-02-10 00:39:09 +00:00 -
fix(tui): rehydrate drafts and restore image placeholders (#9040)
Fixes #9050 When a draft is stashed with Ctrl+C, we now persist the full draft state (text elements, local image paths, and pending paste payloads) in local history. Up/Down recall rehydrates placeholder elements and attachments so styling remains correct and large pastes still expand on submit. Persistent (cross‑session) history remains text‑only. Backtrack prefills now reuse the selected user message’s text elements and local image paths, so image placeholders/attachments rehydrate when rolling back. External editor replacements keep only attachments whose placeholders remain and then normalize image placeholders to `[Image #1]..[Image #N]` to keep the attachment mapping consistent. Docs: - docs/tui-chat-composer.md Testing: - just fix -p codex-tui - cargo test -p codex-tui Co-authored-by: Eric Traut <etraut@openai.com>
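The external-editor step described above (drop attachments whose placeholders were deleted, then renumber the survivors to `[Image #1]..[Image #N]`) can be sketched as follows. This is an assumption-laden illustration, not the actual implementation: `normalize_after_edit` is a hypothetical function, attachments are modeled as `(placeholder, path)` pairs, and the input list is assumed to be ordered by ascending placeholder number so that in-place renaming cannot collide.

```rust
/// Keeps only attachments whose placeholder still appears in the edited text,
/// then renumbers placeholders contiguously from 1.
///
/// Assumption: `attachments` is ordered by ascending placeholder number, so a
/// new label (always <= the old number) never matches a later old label.
fn normalize_after_edit(
    edited_text: &str,
    attachments: &[(String, String)],
) -> (String, Vec<(String, String)>) {
    let mut text = edited_text.to_string();
    let mut kept: Vec<(String, String)> = Vec::new();

    for (old_label, path) in attachments {
        // The placeholder was deleted in the editor: drop the attachment.
        if !edited_text.contains(old_label.as_str()) {
            continue;
        }
        let new_label = format!("[Image #{}]", kept.len() + 1);
        text = text.replace(old_label.as_str(), &new_label);
        kept.push((new_label, path.clone()));
    }

    (text, kept)
}
```

For example, if the user deletes `[Image #1]` in the editor and keeps `[Image #2]` and `[Image #3]`, the result refers to `[Image #1]` and `[Image #2]`, and the attachment list shrinks to the two surviving paths in the same order.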
Chriss4123 ·
2026-02-07 20:08:45 -08:00