mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
342 Commits
-
[codex] Inject agent graph store into ThreadManager (#29736)
Pick up the AgentGraphStore migration. - Inject an explicit optional agent graph store into `ThreadManager` - Move all calls to spawn, close, recursive resume, and subtree/archive/delete/feedback traversal through it - Keep using `LocalAgentGraphStore` when SQLite is available This required some changes to the interface to deal with futures: - The interface now matches `ThreadStore`'s object-safe pattern by returning a boxed `AgentGraphStoreFuture` directly, allowing `ThreadManager` to hold `Arc<dyn AgentGraphStore>` *Slight behavior change!* Unfiltered subtree enumeration now performs a single all-status breadth-first traversal, so a closed grandchild beneath an open edge is included; the previous Open-then-Closed traversals could not cross mixed-status paths and silently omitted it.
Tom ·
2026-06-24 13:24:10 -07:00 -
chore(core) rm AskForApproval::OnFailure (#28418)
## Summary Deletes the OnFailure variant of the `AskForApproval` enum. This option has been deprecated since #11631. ## Testing - [x] Tests pass
Dylan Hurd ·
2026-06-23 12:13:54 -07:00 -
Propagate safety buffering events to app-server clients (#29371)
Responses API safety buffering metadata currently stops at the transport boundary, so app-server clients cannot render the in-progress safety review state. This change: - decodes and deduplicates `safety_buffering` metadata from Responses API SSE and WebSocket events without suppressing the original response event - emits a typed core event containing the requested model plus backend use cases and reasons - forwards that event as `turn/safetyBuffering/updated` through app-server v2 and updates generated protocol schemas - keeps the side-channel event out of persisted rollouts and turn timing This supports the Codex Apps buffering UX and depends on the Responses API backend work in https://github.com/openai/openai/pull/1044569 and https://github.com/openai/openai/pull/1044571. Validation: - focused `codex-core` safety-buffering integration test passes - `cargo check -p codex-core -p codex-app-server -p codex-app-server-protocol` - `just fix -p codex-api -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-rollout -p codex-rollout-trace -p codex-otel` - `just fmt` - broad package test run: 4,430/4,492 passed; 62 unrelated local-environment/concurrency failures involved unavailable test binaries, MCP subprocess setup, and app-server timeouts
Francis Chalissery ·
2026-06-22 03:39:14 +00:00 -
current time reminders impl for system clock (varlatency 2/n) (#28824)
Stacked on #28822. ## Summary - add a host-injectable current-time provider with a built-in system implementation - record UTC developer reminders in history immediately before due model requests - keep cadence state per session and force a refresh after compaction This does NOT include the app server client <-> server clock logic. This PR is only for the reminder message & system clock that will be used in prod. ## Testing - `just test -p codex-core varlatency_` - `just clippy -p codex-core -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample` - `just fmt`
rka-oai ·
2026-06-18 19:18:42 +00:00 -
Scope command approvals by execution environment (#28738)
## Why Command approval cache keys included the command and working directory, but not the execution environment. An approval for `/workspace` locally could therefore be reused for the same command and path on an executor. ## What changed - Include the selected environment ID in shell and unified-exec approval cache keys. - Carry that ID through the normal command approval request so clients can show which environment is being approved. - Expose the environment through app-server as a required nullable `environmentId` and show it in the inline TUI approval prompt. - Keep older recorded approval events compatible when the environment is absent. For example, `echo ok` in local `/workspace` and `echo ok` in executor `/workspace` now produce different approval keys and separate prompts. ## Scope This PR does not change network approvals, Guardian review actions, MCP elicitation, full-screen TUI rendering, or environment-ID validation. Remote `shell_command` execution itself remains in #28722; this PR only makes its approval key environment-aware.
jif ·
2026-06-17 19:52:43 +02:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
[codex] Load user instructions through an injected provider (#27101)
## Why We want to remove implicit use of `$CODEX_HOME` from `codex-core` and make embedders responsible for supplying user-level instructions. This also ensures user instructions load when no primary environment is selected. ## What changed Stacked on #27415, which makes `codex exec` surface thread-scoped runtime warnings. - Added `UserInstructionsProvider` to `codex-extension-api`, with absolute source attribution and recoverable loading warnings. - Added `codex-home` with the filesystem-backed provider for `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback, trimming, lossy UTF-8 handling, and the existing uncapped global instruction size. - Removed global instruction loading from `Config` and require `ThreadManager` callers to inject a provider. - Load provider instructions once for each fresh root runtime, including runtimes without a primary environment. Running sessions retain their snapshot, while child agents inherit the parent snapshot without invoking the provider. - Keep provider instructions separate while loading project `AGENTS.md`, then assemble the model-visible instructions with the existing ordering, source attribution, warning, and turn-context behavior. - Wired the Codex home provider through the CLI, app server, MCP server, core facade, and thread-manager sample. ## Validation - `just test -p codex-home -p codex-extension-api` - `just test -p codex-core agents_md` - `just test -p codex-core guardian` - `just test -p codex-app-server thread_start_without_selected_environment_includes_only_global_instruction_source` - `just test -p codex-exec warning` - `just bazel-lock-check`
Adam Perry @ OpenAI ·
2026-06-11 19:28:47 +00:00 -
multi-agent: add path-based v2 activity tracking (#27007)
## Why Multi-agent v2 identifies agents by canonical paths, but its tool handlers still emitted the larger legacy collaboration begin/end events built around nickname and role metadata. App-server, rollout-trace, analytics, and TUI consumers therefore lacked one compact path-based completion signal that behaved consistently across live events and replay. The TUI also needs a bounded `/agent` status surface for v2 agents. It should use recent local activity for previews, refresh liveness without loading full histories, and keep the legacy picker available when no path-backed v2 agent is known. ## What changed - Replace the v2 `spawn_agent`, `send_message`, `followup_task`, and `interrupt_agent` legacy lifecycle emissions with a success-only `SubAgentActivity` event. The event records the tool call ID, occurrence time, affected thread, canonical agent path, and `started`, `interacted`, or `interrupted` kind. - Expose the activity as a completion-only app-server v2 `subAgentActivity` thread item in live notifications and reconstructed history, regenerate the protocol schemas, and count it in sub-agent tool analytics. - Track canonical paths from live activity and loaded-thread metadata in the TUI, and render the activity in live and replayed transcripts. - Make `/agent` list running path-backed agents with summaries from bounded local event buffers. Each summary is capped at 240 graphemes, the scan is capped at six recent items, only the last three wrapped lines are shown, and command output is omitted. Liveness falls back to metadata-only `thread/read` when local turn state is unavailable. - Persist the activity as a terminal rollout-trace runtime payload and reduce it to the corresponding spawn, send, follow-up, or close interaction edge. `interrupt_agent` is classified as a close-edge operation. - Preserve the legacy picker when no path-backed v2 agent is known. ## Compatibility App-server v2 clients that consumed `collabAgentToolCall` begin/end pairs for these tools must handle the new completion-only `subAgentActivity` item. Legacy v1 collaboration behavior is unchanged. ## Screenshot <img width="684" height="288" alt="Screenshot 2026-06-08 at 15 40 47" src="https://github.com/user-attachments/assets/194b3cd0-619d-45fb-b587-cf3e2b1b8a1d" /> ## Testing - `just test -p codex-app-server-protocol` - `just test -p codex-rollout-trace` - Added focused coverage for activity analytics, terminal trace serialization, spawn-edge reduction, `interrupt_agent` classification, TUI status rendering without aggregated command output, and clearing stale running state after a completed turn.
jif ·
2026-06-09 12:14:48 +02:00 -
Pair thread environment settings (#26687)
## Why Thread cwd and environment selections are a single logical setting in core: updating one without the other can silently desynchronize the next-turn execution context. This change makes that relationship explicit in the internal thread settings flow while preserving the existing app-server public API shape. ## What changed - Moved the cwd/environment pair through internal `ThreadSettingsOverrides.environment_settings` instead of a top-level internal `cwd` field. - Kept `thread/settings/update` public params unchanged, with app-server translating top-level `cwd` into the paired internal settings shape. - Moved `Op::UserInput` environment overrides into thread settings so user turns and settings updates use the same core path. - Updated core, app-server, MCP, memories, sample, and test callsites to construct the paired settings shape. ## Verification - `git diff --check` - Local test run starting after PR creation.
pakrym-oai ·
2026-06-08 13:55:15 -07:00 -
[codex] Forward turn moderation metadata through app-server (#25710)
## Why First-party backends can supply turn-scoped moderation metadata that app-server clients need for client-side presentation. Exposing this as an experimental typed notification lets opted-in clients consume it without interpreting raw Responses API events. ## What changed - forward `response.metadata.openai_chatgpt_moderation_metadata` from Responses API SSE and WebSocket streams as turn-scoped moderation metadata - emit the experimental app-server v2 `turn/moderationMetadata` notification with `{ threadId, turnId, metadata }` - add app-server integration coverage for the typed moderation metadata notification ## Testing - `just test -p codex-core build_ws_client_metadata_includes_window_lineage_and_turn_metadata` - `just test -p codex-core` (fails locally: 46 failures and 1 timeout, primarily missing `test_stdio_server` and shell snapshot timeouts) - `just test -p codex-app-server-protocol` - `just test -p codex-app-server turn_moderation_metadata_emits_typed_notification_v2` - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed, and 5 timed out; failures are in existing environment-sensitive tests, primarily because nested macOS `sandbox-exec` is not permitted) - `just write-app-server-schema --experimental --schema-root /tmp/codex-app-server-schema-experimental`carlc-oai ·
2026-06-05 02:41:06 -07:00 -
store and expose parent_thread_id on Threads (#25113)
## Why This PR https://github.com/openai/codex/pull/24161#discussion_r3325692763 revealed a subagent data modeling issue, where we overloaded `forked_from_id` to also mean `parent_thread_id`. That's incorrect since guardian and review subagents can be a subagent and NOT fork the main thread's history. The solution here is to explicitly store a new `parent_thread_id` on `SessionMeta`, alongside `forked_from_id` which already exists. While we're at it, also expose it in the app-server protocol on the `Thread` object. A thread->subagent relationship and a fork of thread history are orthogonal concepts. ## What Changed - Added top-level `parent_thread_id` persistence on `SessionMeta` and runtime/session plumbing through `SessionConfiguredEvent`, `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`, `TurnContext`, and `ModelClient`. - Made turn metadata, request headers, analytics, and subagent-start events read the separate runtime/top-level parent field instead of deriving general parent lineage from `SessionSource` or `forked_from_thread_id`. - Passed parent lineage separately at delegated subagent, review, guardian, agent-job, and multi-agent spawn construction sites; copied-history fork lineage remains derived only from `InitialHistory`. - Persisted and exposed parent lineage through rollout/thread-store projections and app-server v2 `Thread.parentThreadId`. - Updated app-server README text and regenerated app-server schema fixtures for the additive `parentThreadId` response field.
Owen Lin ·
2026-06-01 04:33:20 +00:00 -
[codex] Add user input client ids (#24653)
## Summary Adds an optional `clientId` field to app-server v2 `UserInput` and carries it through the core `UserInput` model so clients can correlate echoed user input items without relying on payload equality. ## Details - Adds `client_id: Option<String>` to core `UserInput` variants. - Exposes the v2 app-server field as `clientId` on the wire and in generated TypeScript. - Preserves the id when converting between app-server v2 and core protocol types. - Regenerates app-server schema fixtures. ## Validation - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-protocol` - `git diff --check`
Alexi Christakis ·
2026-05-28 14:54:39 -07:00 -
Update rmcp to 1.7.0 (#24763)
WIll make it easier to uprev when the new draft spec is supported. Also updates reqwest where needed for compatibility but doesn't update it everywhere since this is already a large diff. The new version of rmcp handles certain kinds of authentication failures differently, this patch includes support for identifying the failing scope in a WWW-Authenticate header.
Adam Perry @ OpenAI ·
2026-05-27 14:52:06 -07:00 -
Add experimental turn additional context (#24154)
## Summary Adds experimental `additionalContext` support to `turn/start` and `turn/steer` so clients can provide ephemeral external context, such as browser or automation state, without turning that plumbing into a visible user prompt or triggering user-prompt lifecycle behavior. ## API Shape The parameter shape is: ```ts additionalContext?: Record<string, { value: string kind: "untrusted" | "application" }> | null ``` Example: ```json { "additionalContext": { "browser_info": { "value": "Active tab is CI failures.", "kind": "untrusted" }, "automation_info": { "value": "CI rerun is in progress.", "kind": "application" } } } ``` The keys are opaque and caller-defined. ## Context Injection When provided, accepted entries are inserted into model context as hidden contextual message items, not as visible thread user-message items. `kind: "untrusted"` entries are inserted with role `user`: ```text <external_${key}>${value}</external_${key}> ``` `kind: "application"` entries are inserted with role `developer`: ```text <${key}>${value}</${key}> ``` Values are not escaped. Each value is truncated to 1k approximate tokens before wrapping. For `turn/start`, accepted additional context is inserted before normal user input. For `turn/steer`, additional context is merged only when the steer includes non-empty user input; context-only steers still reject as empty input. ## Dedupe Strategy `AdditionalContextStore` lives on session state and stores the latest complete additional-context map. Each `turn/start` or non-empty `turn/steer` treats its `additionalContext` as the current complete set of values. Entries are injected only when the key is new or the exact entry for that key changed, including `value` or `kind`. After merging, the store is replaced with the provided map, so omitted keys are removed from the retained set and can be injected again later if reintroduced. Omitting `additionalContext`, passing `null`, or passing an empty object resets the store to empty and injects nothing. ## What Changed - Threads experimental v2 `additionalContext` through app-server into core turn start and steer handling. - Adds separate contextual fragment types for untrusted user-role context and application developer-role context. - Uses pending response input items so additional context can be combined with normal user input without treating it as prompt text. - Adds integration coverage for start/steer flow, role routing, dedupe/reset behavior, deletion/re-add behavior, hook-blocked input behavior, empty context-only steer rejection, external-fragment marker matching, and truncation.pakrym-oai ·
2026-05-26 13:02:34 -07:00 -
fix: reject legacy profile selectors (#24059)
## Why `--profile` now selects `<name>.config.toml`, so the legacy `profile` selector should not be reintroduced through config write or MCP tool paths. A matching legacy selector in base user config also needs the same migration guard as a matching legacy `[profiles.<name>]` table so profile loading fails with one clear migration error instead of mixing the old and new profile models. ## What - reject non-null app-server config writes to the top-level legacy `profile` selector - make `--profile <name>` reject base user config that still selects the same legacy `profile = "<name>"` value, alongside the existing matching legacy profile-table guard - reject removed MCP `codex` tool fields such as `profile` by denying unknown tool-call parameters and exposing that restriction in the generated schema - add regression coverage for the app-server write paths, config loader guard, and MCP tool input/schema behavior ## Verification - targeted regression tests cover the new app-server, config loader, and MCP rejection paths
jif-oai ·
2026-05-22 13:19:47 +02:00 -
config: remove legacy profile v1 resolution (#24051)
## Why [#23883](https://github.com/openai/codex/pull/23883) moved user-facing `--profile` selection onto profile v2, and [#23886](https://github.com/openai/codex/pull/23886) removed the old CLI `config_profile` override path. Core still had a second legacy path: `profile = "..."` could select `[profiles.*]` values while runtime config was built. Keeping that resolver alive preserves the old precedence model and profile-carrying surfaces even though profile selection now points at `$CODEX_HOME/<name>.config.toml`. ## What - Reject legacy top-level `profile = "..."` config while loading runtime config, with an error that points callers at `--profile <name>` and `<name>.config.toml` in the [core load path](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/mod.rs#L2524-L2531). - Remove the remaining profile-v1 merge points from runtime config resolution, including features, permissions, model/provider selection, web search, Windows sandbox settings, TUI settings, role reloads, and OSS provider lookup. - Drop the leftover profile override surface from [`ConfigOverrides`](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/mod.rs#L2118-L2148) and from the MCP server `codex` tool schema. - Prune profile-precedence tests that only exercised the removed resolver and replace them with rejection coverage for the legacy selector. ## Testing - Not run in this metadata pass. - Added [`legacy_profile_selection_is_rejected`](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/config_tests.rs#L7942-L7965) coverage for the new runtime guard.
jif-oai ·
2026-05-22 12:13:52 +02:00 -
Make local environment optional in EnvironmentManager (#23369)
## Summary - make `EnvironmentManager` local environment/runtime paths optional - simplify constructor surface around snapshot materialization - rename local env accessors to `require_local_environment` / `try_local_environment` ## Validation - devbox Bazel build for touched crate surfaces - `//codex-rs/exec-server:exec-server-unit-tests` - `//codex-rs/app-server-client:app-server-client-unit-tests` - filtered touched `//codex-rs/core:core-unit-tests` cases
starr-openai ·
2026-05-19 12:55:34 -07:00 -
[5 of 7] Replace OverrideTurnContext with ThreadSettings (#22508)
**Stack position:** [5 of 7] ## Summary This PR adds `Op::ThreadSettings`, a queued settings-only update mechanism for changing stored thread settings without starting a new turn. It also removes the legacy `Op::OverrideTurnContext` in the same layer, so reviewers can see the replacement and deletion together. ## Changes - Add `Op::ThreadSettings` for settings-only queued updates. - Emit `ThreadSettingsApplied` with the effective thread settings snapshot after core applies an update. - Route settings-only updates through the same submission queue as user input. - Migrate remaining `OverrideTurnContext` tests and callers to the queued `Op::ThreadSettings` path. - Delete `Op::OverrideTurnContext` from the core protocol and submission loop. This stack addresses #20656 and #22090. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) (this PR) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 21:03:51 -07:00 -
[1 of 7] Add thread settings to UserInput (#23080)
**Stack position:** [1 of 7] ## Summary The first three PRs in this stack are a cleanup pass before the actual thread settings API work. Today, core has several overlapping "user input" ops: `UserInput`, `UserInputWithTurnContext`, and `UserTurn`. They differ mostly in how much next-turn state they carry, which makes the later queued thread settings update harder to reason about and review. This PR starts that cleanup by adding the shared `ThreadSettingsOverrides` payload and allowing `Op::UserInput` to carry it. Existing variants remain in place here, so this layer is mostly a behavior-preserving API shape change plus mechanical constructor updates. ## End State After PR3 By the end of PR3, `Op::UserInput` is the only "user input" core op. It can carry optional thread settings overrides for callers that need to update stored defaults with a turn, while callers without updates use empty settings. `Op::UserInputWithTurnContext` and `Op::UserTurn` are deleted. ## End State After PR5 By the end of PR5, core will have only two ops for this area: - `Op::UserInput` for user-input-bearing submissions. - `Op::ThreadSettings` for settings-only updates. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) (this PR) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 18:48:35 -07:00 -
config: add strict config parsing (#20559)
## Why Codex intentionally ignores unknown `config.toml` fields by default so older and newer config files keep working across versions. That leniency also makes typo detection hard because misspelled or misplaced keys disappear silently. This change adds an opt-in strict config mode so users and tooling can fail fast on unrecognized config fields without changing the default permissive behavior. This feature is possible because `serde_ignored` exposes the exact signal Codex needs: it lets Codex run ordinary Serde deserialization while recording fields Serde would otherwise ignore. That avoids requiring `#[serde(deny_unknown_fields)]` across every config type and keeps strict validation opt-in around the existing config model. ## What Changed ### Added strict config validation - Added `serde_ignored`-based validation for `ConfigToml` in `codex-rs/config/src/strict_config.rs`. - Combined `serde_ignored` with `serde_path_to_error` so strict mode preserves typed config error paths while also collecting fields Serde would otherwise ignore. - Added strict-mode validation for unknown `[features]` keys, including keys that would otherwise be accepted by `FeaturesToml`'s flattened boolean map. - Kept typed config errors ahead of ignored-field reporting, so malformed known fields are reported before unknown-field diagnostics. - Added source-range diagnostics for top-level and nested unknown config fields, including non-file managed preference source names. ### Kept parsing single-pass per source - Reworked file and managed-config loading so strict validation reuses the already parsed `TomlValue` for that source. - For actual config files and managed config strings, the loader now reads once, parses once, and validates that same parsed value instead of deserializing multiple times. - Validated `-c` / `--config` override layers with the same base-directory context used for normal relative-path resolution, so unknown override keys are still reported when another override contains a relative path. ### Scoped `--strict-config` to config-heavy entry points - Added support for `--strict-config` on the main config-loading entry points where it is most useful: - `codex` - `codex resume` - `codex fork` - `codex exec` - `codex review` - `codex mcp-server` - `codex app-server` when running the server itself - the standalone `codex-app-server` binary - the standalone `codex-exec` binary - Commands outside that set now reject `--strict-config` early with targeted errors instead of accepting it everywhere through shared CLI plumbing. - `codex app-server` subcommands such as `proxy`, `daemon`, and `generate-*` are intentionally excluded from the first rollout. - When app-server strict mode sees invalid config, app-server exits with the config error instead of logging a warning and continuing with defaults. - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli` instead of extending shared `ReviewArgs`, so `--strict-config` stays on the outer config-loading command surface and does not become part of the reusable review payload used by `codex exec review`. ### Coverage - Added tests for top-level and nested unknown config fields, unknown `[features]` keys, typed-error precedence, source-location reporting, and non-file managed preference source names. - Added CLI coverage showing invalid `--enable`, invalid `--disable`, and unknown `-c` overrides still error when `--strict-config` is present, including compound-looking feature names such as `multi_agent_v2.subagent_usage_hint_text`. - Added integration coverage showing both `codex app-server --strict-config` and standalone `codex-app-server --strict-config` exit with an error for unknown config fields instead of starting with fallback defaults. - Added coverage showing unsupported command surfaces reject `--strict-config` with explicit errors. ## Example Usage Run Codex with strict config validation enabled: ```shell codex --strict-config ``` Strict config mode is also available on the supported config-heavy subcommands: ```shell codex --strict-config exec "explain this repository" codex review --strict-config --uncommitted codex mcp-server --strict-config codex app-server --strict-config --listen off codex-app-server --strict-config --listen off ``` For example, if `~/.codex/config.toml` contains a typo in a key name: ```toml model = "gpt-5" approval_polic = "on-request" ``` then `codex --strict-config` reports the misspelled key instead of silently ignoring it. The path is shortened to `~` here for readability: ```text $ codex --strict-config Error loading config.toml: ~/.codex/config.toml:2:1: unknown configuration field `approval_polic` | 2 | approval_polic = "on-request" | ^^^^^^^^^^^^^^ ``` Without `--strict-config`, Codex keeps the existing permissive behavior and ignores the unknown key. Strict config mode also validates ad-hoc `-c` / `--config` overrides: ```text $ codex --strict-config -c foo=bar Error: unknown configuration field `foo` in -c/--config override $ codex --strict-config -c features.foo=true Error: unknown configuration field `features.foo` in -c/--config override ``` Invalid feature toggles are rejected too, including values that look like nested config paths: ```text $ codex --strict-config --enable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --disable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text ``` Unsupported commands reject the flag explicitly: ```text $ codex --strict-config cloud list Error: `--strict-config` is not supported for `codex cloud` ``` ## Verification The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text` case, unknown `-c` overrides, app-server strict startup failure through `codex app-server`, and rejection for unsupported commands such as `codex cloud`, `codex mcp`, `codex remote-control`, and `codex app-server proxy`. The config and config-loader tests cover unknown top-level fields, unknown nested fields, unknown `[features]` keys, source-location reporting, non-file managed config sources, and `-c` validation for keys such as `features.foo`. The app-server test suite covers standalone `codex-app-server --strict-config` startup failure for an unknown config field. ## Documentation The Codex CLI docs on developers.openai.com/codex should mention `--strict-config` as an opt-in validation mode for supported config-heavy entry points once this ships.
Michael Bolin ·
2026-05-13 16:08:05 +00:00 -
Support multi-environment apply_patch selection (#21617)
## Summary - add multi-environment apply_patch routing for both freeform and function-call tool flows - parse and reconcile the optional environment selector in the main apply_patch parser, then verify against the selected environment in the handler - carry environment_id through runtime and approval surfaces so remote-targeted patches stay explicit end to end ## Testing - just fmt - remote exec-server e2e: `cargo test -p codex-core --test all apply_patch_multi_environment_uses_remote_executor -- --nocapture` on dev via `scripts/test-remote-env.sh` --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-05-11 16:33:44 -07:00 -
Add process-scoped SQLite telemetry (#22154)
## Summary - add SQLite init, backfill-gate, and fallback telemetry without introducing a cross-cutting state-db access wrapper - install one process-scoped telemetry sink after OTEL startup and let low-level state/rollout paths emit through it directly - add process-start metrics for the process owners that initialize SQLite --------- Co-authored-by: Owen Lin <owen@openai.com>
jif-oai ·
2026-05-11 11:32:40 -07:00 -
extension: wire extension registries into sessions (#21737)
## Why [#21736](https://github.com/openai/codex/pull/21736) introduces the typed extension API, but the runtime does not yet carry a registry through thread/session startup or give contributors host-owned stores to read from. This PR wires that host-side path so later feature migrations can move product-specific behavior behind typed contributions without adding another bespoke seam directly to `codex-core`. ## What changed - Thread `ExtensionRegistry<Config>` through `ThreadManager`, `CodexSpawnArgs`, `Session`, and sub-agent spawn paths. - Wire `ThreadStartContributor` and `ContextContributor` - Expose the small supporting surface needed by non-core callers that construct threads directly, including `empty_extension_registry()` through `codex-core-api`. This PR lands the host plumbing only: the app-server registry is still empty, and concrete feature migrations are intended to follow separately.
jif-oai ·
2026-05-11 11:38:18 +02:00 -
Reapply "Move skills watcher to app-server" (#21652)
## Why PR #21460 reverted the earlier move of skills change watching from `codex-core` into app-server. This reapplies that boundary change so app-server owns client-facing `skills/changed` notifications and core no longer carries the watcher. ## What - Restore the app-server `SkillsWatcher` and register it from thread listener setup. - Remove the core-owned skills watcher and its core live-reload integration surface. - Restore app-server coverage for `skills/changed` notifications after a watched skill file changes. ## Validation - `cargo test -p codex-app-server --test all suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change -- --exact --nocapture` - `cargo test -p codex-core --lib --no-run`
pakrym-oai ·
2026-05-08 17:41:15 -07:00 -
Enable
--deny-warningsforcargo shear(#21616)## Summary In https://github.com/openai/codex/pull/21584, we disabled doctests for crates that lack any doctests. We can enforce that property via `cargo shear --deny-warnings`: crates that lack doctests will be flagged if doctests are enabled, and crates with doctests will be flagged if doctests are disabled. A few additional notes: - By adding `--deny-warnings`, `cargo shear` also flagged a number of modules that were not reachable at all. Some of those have been removed. - This PR removes a usage of `windows_modules!` (since `cargo shear` and `rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os = "windows")]` macros. As a consequence, many of these files exhibit churn in this PR, since they weren't being formatted by `rustfmt` at all on main. - Again, to make the code more analyzable, this PR also removes some usages of `#[path = "cwd_junction.rs"]` in favor of a more standard module structure. The bin sidecar structure is still retained, but, e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to `windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on. --------- Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-08 20:29:00 +00:00 -
[codex] request desktop attestation from app (#20619)
## Summary TL;DR: teaches `codex-rs` / app-server to request a desktop-provided attestation token and attach it as `x-oai-attestation` on the scoped ChatGPT Codex request paths.  ## Details This PR teaches the Codex app-server runtime how to request and attach an attestation token. It does not generate DeviceCheck tokens directly; instead, it relies on the connected desktop app to advertise that it can generate attestation and then asks that app for a fresh header value when needed. The flow is: 1. The Codex desktop app connects to app-server. 2. During `initialize`, the app can advertise that it supports `requestAttestation`. 3. Before app-server calls selected ChatGPT Codex endpoints, it sends the internal server request `attestation/generate` to the app. 4. app-server receives a pre-encoded header value back. 5. app-server forwards that value as `x-oai-attestation` on the scoped outbound requests. The code in this repo is mostly protocol and runtime plumbing: it adds the app-server request/response shape, introduces an attestation provider in core, wires that provider into Responses / compaction / realtime setup paths, and covers the intended scoping with tests. The signed macOS DeviceCheck generation remains owned by the desktop app PR. ## Related PR - Codex desktop app implementation: https://github.com/openai/openai/pull/878649 ## Validation <details> <summary>Tests run</summary> ```sh cargo test -p codex-app-server-protocol cargo test -p codex-core attestation --lib cargo test -p codex-app-server --lib attestation ``` Also ran: ```sh just fix -p codex-core just fix -p codex-app-server just fix -p codex-app-server-protocol just fmt just write-app-server-schema ``` </details> <details> <summary>E2E DeviceCheck validation</summary> First validated the signed desktop app boundary directly: launched a packaged signed `Codex.app`, sent `attestation/generate`, decoded the returned `v1.` attestation header, and validated the extracted DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using bundle ID `com.openai.codex`. Apple returned `status_code: 200` and `is_ok: true`. Then ran the fuller app + app-server flow. The packaged `Codex.app` launched a current-branch app-server via `CODEX_CLI_PATH`, and a local MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server requested `attestation/generate` from the real Electron app process, and the intercepted `/backend-api/codex/responses` traffic included `x-oai-attestation` on both routes: ```text GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present ``` The captured header decoded to a DeviceCheck token that also validated with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`, team `2DC432GLL2`). </details> --------- Co-authored-by: Codex <noreply@openai.com>
Jiaming Zhang ·
2026-05-08 12:36:02 -07:00 -
Load configured environments from CODEX_HOME (#20667)
## Why The earlier PRs add stdio transport support and the config-backed environment provider, but the feature remains inert until normal Codex entrypoints construct `EnvironmentManager` with enough context to discover `CODEX_HOME/environments.toml`. This final stack PR activates the provider while preserving the legacy `CODEX_EXEC_SERVER_URL` fallback when no environments file exists. **Stack position:** this is PR 5 of 5. It is the product wiring PR that activates the configured environment provider added in PR 4. ## What Changed - Thread `codex_home` into `EnvironmentManagerArgs`. - Change `EnvironmentManager::new(...)` to load the provider from `CODEX_HOME`. - Preserve legacy behavior by falling back to `DefaultEnvironmentProvider::from_env()` when `environments.toml` is absent. - Make `environments.toml`-backed managers start new threads with all configured environments, default first, while keeping the legacy env-var path single-default. - Update the app-server, TUI, exec, MCP server, connector, prompt-debug, and thread-manager-sample callsites to pass `codex_home` and handle provider-loading errors. ## Self-Review Notes - The multi-environment startup path is intentionally tied to the `environments.toml` provider. Using `>1` configured environment as the only signal would also expand the legacy `CODEX_EXEC_SERVER_URL` provider because it keeps `local` addressable alongside `remote`. - The startup environment list is still derived inside `EnvironmentManager`; the provider only says whether its snapshot should start new threads with all configured environments. - The thread-manager sample was updated to pass the current `ThreadManager::new(...)` installation id argument so the stack compiles under Bazel. ## Stack - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server listener - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server client transport - 3. https://github.com/openai/codex/pull/20665 - Make environment providers own default selection - 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME environments TOML provider - **5. This PR:** https://github.com/openai/codex/pull/20667 - Load configured environments from CODEX_HOME Split from original draft: https://github.com/openai/codex/pull/20508 ## Validation - `just fmt` - `git diff --check` - `bazel build --config=remote --strategy=remote --remote_download_toplevel //codex-rs/thread-manager-sample:codex-thread-manager-sample` - `bazel test --config=remote --strategy=remote --remote_download_toplevel //codex-rs/exec-server:exec-server-unit-tests` - `bazel test --config=remote --strategy=remote --remote_download_toplevel --test_sharding_strategy=disabled --test_arg=default_thread_environment_selections_use_manager_default_id //codex-rs/core:core-unit-tests` - `bazel test --config=remote --strategy=remote --remote_download_toplevel --test_sharding_strategy=disabled --test_arg=start_thread_uses_all_default_environments_from_codex_home //codex-rs/core:core-unit-tests` ## Documentation This activates `CODEX_HOME/environments.toml`; user-facing documentation should be added before this stack is treated as a documented public workflow. --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-05-08 11:17:56 -07:00 -
[codex-analytics] plumb protocol-native review timing (#21434)
## Why We want terminal tool review analytics, but the reducer should not stamp review timing from its own wall clock. This PR plumbs review timing through the real protocol and app-server seams so downstream analytics can consume the emitter's timestamps directly. Guardian reviews keep their enriched `started_at` / `completed_at` analytics fields by deriving those legacy second-based values from the same protocol-native millisecond lifecycle timestamps, rather than sampling a separate analytics clock. ## What changed - add `started_at_ms` to user approval request payloads - add `started_at_ms` / `completed_at_ms` to guardian review notifications - preserve Guardian review `started_at` / `completed_at` enrichment from the protocol-native timing source - stamp typed `ServerResponse` analytics facts with app-server-observed `completed_at_ms` - thread the new timing fields through core, protocol, app-server, TUI, and analytics fixtures ## Verification - `cargo test -p codex-app-server outgoing_message --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-app-server-protocol guardian --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-analytics analytics_client_tests --manifest-path codex-rs/Cargo.toml` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434). * #18748 * __->__ #21434 * #18747 * #17090 * #17089 * #20514
rhan-oai ·
2026-05-07 20:31:41 -07:00 -
Disable empty Cargo test targets (#21584)
## Summary `cargo test` has entails both running standard Rust tests and doctests. It turns out that the doctest discovery is fairly slow, and it's a cost you pay even for crates that don't include any doctests. This PR disables doctests with `doctest = false` for crates that lack any doctests. For the collection of crates below, this speeds up test execution by >4x. E.g., before this PR: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s] Range (min … max): 0.418 s … 14.529 s 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms] Range (min … max): 418.0 ms … 436.8 ms 10 runs ``` For a single crate, with >2x speedup, before: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms] Range (min … max): 480.9 ms … 512.0 ms 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms] Range (min … max): 206.8 ms … 221.0 ms 13 runs ``` Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-07 15:44:17 -07:00 -
Revert state DB injection and agent graph store (#21481)
## Why Reverts #20689 to restore the previous optional state DB plumbing. The conflict resolution keeps the newer installation ID and session/thread identity changes that landed after #20689, while removing the mandatory state DB and agent graph store dependency from ThreadManager construction. ## What changed - Restored `Option<StateDbHandle>` through app-server, MCP server, prompt debug, and test entry points. - Removed the `codex-core` dependency on `codex-agent-graph-store` and reverted descendant lookup back to the existing state DB path when available. - Kept newer `installation_id` forwarding by passing it beside the optional DB handle. - Kept local thread-name updates working when the optional state DB handle is absent. ## Validation - `git diff --check` - `cargo test -p codex-thread-store` - `cargo test -p codex-state -p codex-rollout -p codex-app-server-protocol` - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p codex-app-server -p codex-app-server-client -p codex-mcp-server -p codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607 2026-01-19)` on `aarch64-apple-darwin`.
pakrym-oai ·
2026-05-06 22:48:29 -07:00 -
pakrym-oai ·
2026-05-07 02:24:20 +00:00 -
Move skills watcher to app-server (#21287)
## Why Skills update notifications are app-server API behavior, but the watcher lived in `codex-core` and surfaced through `EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core focused on thread execution and lets app-server own both cache invalidation and the `skills/changed` notification. ## What changed - Added an app-server-owned skills watcher that watches local skill roots, clears the shared skills cache, and emits `skills/changed` directly. - Registers skill watches from the common app-server thread listener attach path, including direct starts, resumes, and app-server-observed child or forked threads. - Stores the `WatchRegistration` on `ThreadState`, so listener replacement, thread teardown, idle unload, and app-server shutdown deregister by dropping the RAII guard. - Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the old core live-reload test. - Extended the app-server skills change test to verify a cached skills list is refreshed after a filesystem change without forcing reload. ## Validation - `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p codex-rollout -p codex-rollout-trace` - `cargo test -p codex-app-server skills_changed_notification_is_emitted_after_skill_change`
pakrym-oai ·
2026-05-06 15:38:11 -07:00 -
Remove core MCP list tools op (#21281)
## Why The core `Op::ListMcpTools` request path is no longer needed. Keeping it around left a dead request/response surface alongside the app-server MCP inventory APIs that own current server status listing. ## What Changed - Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the core handler that built the MCP snapshot response. - Removed the now-unused `codex-mcp` snapshot wrapper/export and passive event handling arms in rollout and MCP-server consumers. - Updated tests that used the old op as a synchronization hook to wait on existing startup/skills events, and deleted the plugin test that only exercised the removed listing op. ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all pending_input::queued_inter_agent_mail` - `cargo test -p codex-core --test all rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta` - `cargo test -p codex-core --test all rmcp_client::stdio_image_responses` - `just fix -p codex-core -p codex-protocol -p codex-mcp -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
pakrym-oai ·
2026-05-06 11:20:34 -07:00 -
Move message history out of core (#21278)
## Why Message history was implemented inside `codex-core` and surfaced through core protocol ops and `SessionConfiguredEvent` fields even though the current consumer is TUI-local prompt recall. That made core own UI history persistence and exposed `history_log_id` / `history_entry_count` through surfaces that app-server and other clients do not need. This change moves message history persistence out of core and keeps the recall plumbing local to the TUI. ## What changed - Added a new `codex-message-history` crate for appending, looking up, trimming, and reading metadata from `history.jsonl`. - Removed core protocol history ops/events: `AddToHistory`, `GetHistoryEntryRequest`, and `GetHistoryEntryResponse`. - Removed `history_log_id` and `history_entry_count` from `SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly. - Updated the TUI to dispatch local app events for message-history append/lookup and keep its persistent-history metadata in TUI session state. ## Validation - `cargo test -p codex-message-history -p codex-protocol` - `cargo test -p codex-exec event_processor_with_json_output` - `cargo test -p codex-mcp-server outgoing_message` - `cargo test -p codex-tui` - `just fix -p codex-message-history -p codex-protocol -p codex-core -p codex-tui -p codex-exec -p codex-mcp-server`
pakrym-oai ·
2026-05-06 08:35:42 -07:00 -
Move installation ID resolution out of core startup (#21182)
## Summary - resolve or inject the installation ID before core startup and pass it through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain `String` - keep child sessions on the parent installation ID instead of rediscovering it inside core - propagate installation ID startup failures in `mcp-server` instead of panicking ## Why Core was still touching the filesystem on the session startup path to discover `installation_id`. This moves that work to the outer host boundary so core no longer depends on `codex_home` reads during session construction. --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-06 10:48:54 +00:00 -
feat: add
session_id(#20437)## Summary Related to https://openai.slack.com/archives/C095U48JNL9/p1777537279707449 TLDR: We update the meaning of session ids and thread ids: * thread_id stays as now * session_id become a shared id between every thread under a /root thread (i.e. every sub-agent share the same session id) This PR introduces an explicit `SessionId` and threads it through the protocol/client boundary so `session_id` and `thread_id` can diverge when they need to, while preserving compatibility for older serialized `session_configured` events. --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-06 10:48:37 +02:00 -
[codex-analytics] rework thread_source for thread analytics (#20949)
## Summary - make `thread_source` an explicit optional thread-level field on `thread/start`, `thread/fork`, and returned thread payloads - persist `thread_source` in rollout/session metadata so resumed live threads retain the original value - replace the old best-effort `session_source` -> `thread_source` mapping with an explicit caller-supplied analytics classification ## Why Before this change, analytics `thread_source` was populated by a best-effort mapping from `session_source`. `session_source` describes the runtime/client surface, not the actual thread-level origin, so that projection was not accurate enough to distinguish cases such as `user`, `subagent`, `memory_consolidation`, and future thread origins reliably. Making `thread_source` explicit keeps one thread-level analytics field while letting callers provide the real classification directly instead of recovering it indirectly from `session_source`. ## Impact For new analytics events, `thread_source` now reflects the explicit thread-level classification supplied by the caller rather than an inferred value derived from `session_source`. Existing protocol fields remain optional; callers that omit `threadSource` now produce `null` instead of a best-effort inferred value. ## Validation - `just write-app-server-schema` - `cargo test -p codex-analytics -p codex-core -p codex-app-server-protocol --no-run` - `cargo test -p codex-app-server-protocol generated_ts_optional_nullable_fields_only_in_params` - `cargo test -p codex-analytics thread_initialized_event_serializes_expected_shape` - `cargo test -p codex-core resume_stopped_thread_from_rollout_preserves_thread_source`
rhan-oai ·
2026-05-06 02:12:31 +00:00 -
[codex] Remove legacy ListSkills op (#21282)
## Why `skills/list` is already exposed through app-server v2 and covered by the app-server test suite. Keeping the separate core `Op::ListSkills` path leaves a duplicate legacy protocol surface that no longer needs to be maintained. ## What Changed - Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the core protocol. - Deleted the corresponding core session handler and stale core integration tests. - Removed rollout/MCP ignore branches and protocol v1 docs references for the deleted event/op. - Left app-server `skills/list` and its existing coverage intact. ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-core --test all suite::skills` - `cargo check -p codex-mcp-server -p codex-rollout -p codex-rollout-trace` - `just fix -p codex-core`
pakrym-oai ·
2026-05-05 18:58:18 -07:00 -
[codex] Move thread naming to app server (#21260)
## Why Thread names are app-server metadata now, backed by the thread store and sqlite state database. Keeping a core `SetThreadName` op plus a rollout `thread_name_updated` event made rename persistence live in the wrong layer and required historical replay support for an event that new app-server flows should not write. ## What changed - Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the core protocol and deleted the core handler path that appended rename events to rollouts. - Updated app-server `thread/name/set` so both loaded and unloaded threads write through thread-store metadata and app-server emits `thread/name/updated` notifications. - Updated local thread-store name metadata updates to write sqlite title metadata and the legacy thread-name index without appending rollout events. - Removed state extraction and rollout handling for the deleted thread-name event. ## Validation - `cargo test -p codex-app-server thread_name_updated_broadcasts` - `cargo test -p codex-app-server thread_name_set_is_reflected_in_read_list_and_resume` - `cargo test -p codex-thread-store update_thread_metadata_sets_name_on_active_rollout_and_indexes_name` - `cargo test -p codex-state` - `cargo check -p codex-mcp-server -p codex-rollout-trace` - `just fix -p codex-app-server -p codex-thread-store -p codex-state -p codex-mcp-server -p codex-rollout-trace` ## Docs No external documentation update is expected for this internal ownership change.
pakrym-oai ·
2026-05-05 17:16:06 -07:00 -
Inject state DB, agent graph store (#20689)
## Why We want the agent graph store to be passed down the stack as a real dependency, the same way we already treat the thread store. This will let us inject the agent graph store as a real dependency and support implementations other than the local SQLite-backed one. Right now most code instantiates a state DB and an agent graph store just-in-time. Ideally, we would not depend on the state DB directly but only read through the higher-level interfaces. This change makes the dependency boundaries explicit and moves state DB initialization to process bootstrap instead of hiding it inside local store implementations. ## What changed - `ThreadManager` now requires a `StateDbHandle` and an `AgentGraphStore` at construction time instead of treating them as optional internals. - The local store constructors no longer lazily initialize SQLite. Callers now initialize the state DB once per process and use that shared handle to build: - `LocalThreadStore` - `LocalAgentGraphStore` - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the thread-manager sample) now initialize the state DB up front and inject the resulting handle down the stack. - `app-server` now consistently uses its process-scoped state DB handle instead of reopening SQLite or trying to recover it from loaded threads. - Device-key storage now reuses the shared state DB handle instead of maintaining its own lazy opener. - The thread archive / descendant traversal paths now use the injected `AgentGraphStore` instead of reaching through local thread-store-specific state. ## Verification - `cargo check -p codex-core -p codex-thread-store -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample --tests` - `cargo test -p codex-thread-store` - `cargo test -p codex-core thread_manager_accepts_separate_agent_graph_store_and_thread_store -- --nocapture` - `cargo test -p codex-app-server thread_archive_archives_spawned_descendants -- --nocapture`
Rasmus Rygaard ·
2026-05-05 21:45:29 +00:00 -
state: pass state db handles through consumers (#20561)
## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.
Ruslan Nigmatullin ·
2026-05-04 11:46:03 -07:00 -
[codex] Emit image view as core item (#20512)
## Why Image-view results should be represented as a core-produced turn item instead of being reconstructed by app-server. At the same time, existing rollout/history paths still understand the legacy `ViewImageToolCall` event, so this keeps that event as compatibility output generated from the new item lifecycle. ## What changed - Added `TurnItem::ImageView` to `codex-protocol`. - Emitted image-view item start/completion directly from the core `view_image` handler. - Kept `ViewImageToolCall` as a legacy event and generate it from completed `TurnItem::ImageView` items. - Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay path, with `ImageView` item lifecycle events ignored there. - Updated app-server protocol conversion, rollout persistence, and affected exhaustive event matches for the new item plus legacy fan-out shape. ## Verification - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server -p codex-app-server --lib` - `cargo test -p codex-core --test all view_image_tool_attaches_local_image` - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `git diff --check`
pakrym-oai ·
2026-05-01 11:28:30 -07:00 -
Make thread store process-scoped (#19474)
- Build one app-server process ThreadStore from startup config and share it with ThreadManager and CodexMessageProcessor. - Remove per-thread/fork store reconstruction so effective thread config cannot switch the persistence backend. - Add params to ThreadStore create/resume for specifying thread metadata, since otherwise the metadata from store creation would be used (incorrectly).
Tom ·
2026-04-30 21:24:59 -07:00 -
[codex] Remove unused event messages (#20511)
## Why Several legacy `EventMsg` variants were still emitted or mapped even though clients either ignored them or had moved to item/lifecycle events. `Op::Undo` had also degraded to an unavailable shim, so this removes that dead task path instead of preserving a command that cannot do useful work. `McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are intentionally kept because useful consumers still depend on them: MCP startup completion drives readiness behavior, and the begin events let app-server/core consumers surface in-progress web-search and image-generation items before the final payload arrives. ## What Changed - Removed weak legacy event variants and payloads from `codex-protocol`, including legacy agent deltas, background events, and undo lifecycle events. - Kept/restored `EventMsg::McpStartupComplete`, `EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with serializer and emission coverage. - Updated core, rollout, MCP server, app-server thread history, review/delegate filtering, and tests to rely on the useful replacement events that remain. - Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI slash-command comments. - Stopped agent job/background progress and compaction retry notices from emitting `BackgroundEvent` payloads. ## Verification - `cargo check -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all suite::items` - `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - Earlier coverage on this PR also included `codex-mcp`, `codex-tui`, core library tests, MCP/plugin/delegate/review/agent job tests, and MCP startup TUI tests.
pakrym-oai ·
2026-04-30 20:03:26 -07:00 -
Michael Bolin ·
2026-04-29 20:54:59 -07:00 -
Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager construction and ModelProvider
pakrym-oai ·
2026-04-29 17:22:41 -07:00 -
Add ThreadManager sample crate (#20141)
Summary: - Add codex-thread-manager-sample, a one-shot binary that starts a ThreadManager thread, submits a prompt, and prints the final assistant output. - Pass ThreadStore into ThreadManager::new and expose thread_store_from_config for existing callsites. - Build the sample Config directly with only --model and prompt inputs. Verification: - just fmt - cargo check -p codex-thread-manager-sample -p codex-app-server -p codex-mcp-server - git diff --check Tests: Not run per request.
pakrym-oai ·
2026-04-29 11:21:06 -07:00 -
Add environment provider snapshot (#20058)
## Summary - Change `EnvironmentProvider` to return concrete `Environment` instances instead of `EnvironmentConfigurations`. - Make `DefaultEnvironmentProvider` provide the provider-visible `local` environment plus optional `remote` environment from `CODEX_EXEC_SERVER_URL`. - Keep `EnvironmentManager` as the concrete cache while exposing its own explicit local environment for `local_environment()` fallback paths. ## Validation - `just fmt` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-04-28 20:05:18 -07:00 -
permissions: make SessionConfigured profile-only (#19774)
## Why `SessionConfiguredEvent` is the internal event that tells clients what permissions are active for a session. Emitting both `sandbox_policy` and `permission_profile` leaves two possible authorities and forces every consumer to decide which one to honor. At this point in the migration, the profile is expressive enough to represent managed, disabled, and external sandbox enforcement, so the internal event can be profile-only. The wire compatibility concern is older serialized events or rollout data that only contain `sandbox_policy`; those still need to deserialize. ## What Changed - Removes `sandbox_policy` from `SessionConfiguredEvent` and makes `permission_profile` required. - Adds custom deserialization so old payloads with only `sandbox_policy` are upgraded to a cwd-anchored `PermissionProfile`. - Updates core event emission and TUI session handling to sync permissions from the profile directly. - Updates app-server response construction to derive the legacy `sandbox` response field from the active thread snapshot instead of from `SessionConfiguredEvent`. - Updates yolo-mode display logic to treat both `PermissionProfile::Disabled` and managed unrestricted filesystem plus enabled network as full-access, while still preserving the distinction between no sandbox and external sandboxing. ## Verification - `cargo test -p codex-protocol session_configured_event --lib` - `cargo test -p codex-protocol serialize_event --lib` - `cargo test -p codex-exec session_configured --lib` - `cargo test -p codex-app-server thread_response_permission_profile_preserves_enforcement --lib` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox --lib` - `cargo test -p codex-tui session_configured --lib` - `cargo test -p codex-tui yolo_mode_includes_managed_full_access_profiles --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774). * #19900 * #19899 * #19776 * #19775 * __->__ #19774
Michael Bolin ·
2026-04-27 22:06:47 -07:00 -
refactor: make auth loading async (#19762)
## Summary Auth loading used to expose synchronous construction helpers in several places even though some auth sources now need async work. This PR makes the auth-loading surface async and updates the callers to await it. This is intentionally only plumbing. It does not change how AgentIdentity tokens are decoded, how task runtime ids are allocated, or how JWT signatures are verified. ## Stack 1. **This PR:** [refactor: make auth loading async](https://github.com/openai/codex/pull/19762) 2. [refactor: load AgentIdentity runtime eagerly](https://github.com/openai/codex/pull/19763) 3. [feat: verify AgentIdentity JWTs with JWKS](https://github.com/openai/codex/pull/19764) ## Important call sites | Area | Change | | --- | --- | | `codex-login` auth loading | `CodexAuth` and `AuthManager` construction paths now await auth loading. | | app-server startup | Auth manager construction is awaited during initialization. | | CLI/TUI/exec/MCP/chatgpt callers | Existing auth-loading calls now await the same behavior. | | cloud requirements storage loader | The loader becomes async so it can share the same auth construction path. | | auth tests | Tests that load auth now run in async contexts. | ## Testing Tests: targeted Rust auth test compilation, formatter, scoped Clippy fix, and Bazel lock check.
efrazer-oai ·
2026-04-27 11:00:27 -07:00