mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
37 Commits
-
[codex] Use tool search for MCP tools by default (#29486)
## Why MCP tools were only placed behind `tool_search` when a feature flag was enabled or when there were at least 100 tools. That made the model's tool flow depend on both rollout configuration and the number of installed tools. The searched-tool flow is now the intended behavior. Making it unconditional when the model and provider support it gives every supported setup the same behavior and lets us retire the feature flag safely. ## What changed - Defer all effective MCP tools when `tool_search` and namespaced tools are supported. - Keep exposing MCP tools directly when search cannot be used, so older or unsupported model/provider combinations still work. - Mark `tool_search_always_defer_mcp_tools` as removed and ignore old configured values. - Keep plugin filtering, app-only filtering, file handling, and MCP calls working through the searched-tool flow. ## Why many tests changed Many tests used to act as if the model could see MCP tools in its first request and call them immediately. That is no longer the real flow: the model first receives `tool_search`, searches for a tool, receives the matching MCP tool, and then calls it in the next request. The tests therefore needed an extra search step, and checks for tool names, descriptions, and input fields had to move from the first request to the search result. These are not separate product changes; they make the tests follow what the model will actually see after this change. The plugin tests still check which tools are allowed and where they came from, the file tests still check upload fields and behavior, and the MCP round-trip test still checks a successful call from start to finish. ## Tests - `just test -p codex-features` - Focused `codex-core` tests for MCP exposure and tool planning - `just test -p codex-core explicit_plugin_mentions` - `just test -p codex-core stdio_server_round_trip` - Focused `codex-core` tests for tool search, app-only tools, and MCP file uploads
sayan-oai ·
2026-06-22 16:45:23 -07:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
[codex] Dedupe plugin MCPs by app declaration name (#27607)
## Context This is the next step in the plugin auth-routing stack. The earlier PRs make `PluginsManager` auth-aware and move the broad App/MCP surface decision into that layer. This PR narrows the ChatGPT/SIWC behavior so we only hide a plugin MCP server when it conflicts with an App declaration of the same name. In product terms: if a plugin exposes both an App route and MCP route for `foo`, ChatGPT/SIWC sessions should use the App route for `foo`. If the same plugin also exposes a separate MCP server like `foo2`, that MCP server should remain available. ```json // .app.json { "apps": { "foo": { "id": "connector_abc" } } } ``` ```json // .mcp.json { "mcpServers": { "foo": { "url": "https://mcp.foo.com/mcp" }, "foo2": { "url": "https://mcp.foo2.com/mcp" } } } ``` ## Stack - PR1: #27652 seed plugin manager auth at construction. - PR2: #27459 route plugin surfaces by auth mode. - PR3: #27607 dedupe plugin MCP servers by App declaration name. - PR4: #27602 preserve plugin Apps in connector listings. - PR5: #27461 skip install-time plugin MCP OAuth for matching App routes. ## Summary - Preserve App declaration names in loaded plugin metadata. - Keep public effective App outputs as deduped connector IDs for existing callers. - For ChatGPT/SIWC, suppress only plugin MCP servers whose names match declared App names. ## Validation ```bash cargo fmt --all cargo test -p codex-core-plugins plugin_auth_projection cargo test -p codex-core-plugins effective_apps cargo test -p codex-core-plugins read_plugin_for_config_installed_git_source_reads_from_cache_without_cloning cargo test -p codex-core explicit_plugin_mentions_use_apps_for_chatgpt_dual_surface_plugins cargo test -p codex-core explicit_plugin_mentions_keep_non_conflicting_mcp_for_chatgpt_auth cargo test -p codex-app-server --test all plugin_install_filters_disallowed_apps_needing_auth git diff --check ``` --------- Co-authored-by: Xin Lin <xl@openai.com>felixxia-oai ·
2026-06-13 17:53:09 -07:00 -
[codex] Gate plugin MCP servers by auth route (#27459)
## Context Some plugins expose both Apps and MCP servers. This PR moves auth-aware surface projection into `core-plugins::PluginsManager`, so callers get a consistent effective plugin view. Later PRs narrow the conflict rule and update listing/install paths. The high level goal of this PR is to set up the plumbing to conditionally filter App/MCP in the plugin manager layer. We start by removing MCP servers when using SIWC/Codex-backend auth, and removing Apps when using API-key-style auth. This PR is now stacked on #27652, which contains only the constructor plumbing for seeding `PluginsManager` with the current auth mode. ## Stack - PR1: #27652 seed plugin manager auth at construction. - PR2: #27459 route plugin surfaces by auth mode. - PR3: #27607 dedupe plugin MCP servers by App declaration name. - PR4: #27602 preserve plugin Apps in connector listings. - PR5: #27461 skip install-time plugin MCP OAuth for matching App routes. ## Summary - API-key/non-ChatGPT routes hide plugin Apps and keep plugin MCPs. - ChatGPT/SIWC with Apps enabled keeps plugin Apps and suppresses MCPs for dual-surface plugins. - MCP-only plugins stay available for ChatGPT/SIWC sessions. - Cached plugin load outcomes are re-projected when auth mode changes. ## Validation ```bash cargo test -p codex-core-plugins plugin_auth_projection cargo test -p codex-core list_tool_suggest_discoverable_plugins git diff --check ```
felixxia-oai ·
2026-06-12 19:42:11 -07:00 -
fix(plugins) rm plugin descriptions (#23254)
## Summary Removes Plugin descriptions from the dev message, since descriptions of skills and MCPs cover the capabilities offered by the plugin. ## Testing - [x] Updates unit tests
Dylan Hurd ·
2026-06-12 13:31:11 -07:00 -
Pair thread environment settings (#26687)
## Why Thread cwd and environment selections are a single logical setting in core: updating one without the other can silently desynchronize the next-turn execution context. This change makes that relationship explicit in the internal thread settings flow while preserving the existing app-server public API shape. ## What changed - Moved the cwd/environment pair through internal `ThreadSettingsOverrides.environment_settings` instead of a top-level internal `cwd` field. - Kept `thread/settings/update` public params unchanged, with app-server translating top-level `cwd` into the paired internal settings shape. - Moved `Op::UserInput` environment overrides into thread settings so user turns and settings updates use the same core path. - Updated core, app-server, MCP, memories, sample, and test callsites to construct the paired settings shape. ## Verification - `git diff --check` - Local test run starting after PR creation.
pakrym-oai ·
2026-06-08 13:55:15 -07:00 -
log plugin MCP server names (#26002)
## Summary - emit the plugin capability summary's exact MCP server names in `codex_plugin_used` ## Test - `just test -p codex-analytics` - `just test -p codex-core explicit_plugin_mentions_track_plugin_used_analytics` - `just fix -p codex-analytics`
Chris Dong ·
2026-06-03 16:06:52 -07:00 -
flake: Keep plugin test homes alive (#25857)
## Summary Keep the full `TestCodex` harness alive in plugin integration tests instead of returning only the `CodexThread`. ## Why The helper was moving a temporary `codex_home` into `TestCodex`, then immediately dropping the harness and returning only the thread. For plugin MCP tests, the MCP server cwd is inside that temporary home. If the temp directory is removed while MCP startup is still racing, the server launch can fail with `No such file or directory`. Keeping the harness in scope keeps the temp home alive for the test duration and removes the lifetime race behind the recent `explicit_plugin_mentions_inject_plugin_guidance` flake. ## Validation - `just fmt` - `just test -p codex-core explicit_plugin_mentions_inject_plugin_guidance`
jif ·
2026-06-02 16:21:22 +02:00 -
[codex] Wait for MCP readiness in core integration tests (#24964)
Ensures MCP-backed `codex-core` integration tests exercise initialized servers instead of racing server startup. I've been idly investigating a few flakes and the failure modes are much more confusing when a tool call fails because of a failed server start than when the failed server start causes the test to fail directly.
Adam Perry @ OpenAI ·
2026-05-29 10:22:27 -07:00 -
Add experimental turn additional context (#24154)
## Summary Adds experimental `additionalContext` support to `turn/start` and `turn/steer` so clients can provide ephemeral external context, such as browser or automation state, without turning that plumbing into a visible user prompt or triggering user-prompt lifecycle behavior. ## API Shape The parameter shape is: ```ts additionalContext?: Record<string, { value: string kind: "untrusted" | "application" }> | null ``` Example: ```json { "additionalContext": { "browser_info": { "value": "Active tab is CI failures.", "kind": "untrusted" }, "automation_info": { "value": "CI rerun is in progress.", "kind": "application" } } } ``` The keys are opaque and caller-defined. ## Context Injection When provided, accepted entries are inserted into model context as hidden contextual message items, not as visible thread user-message items. `kind: "untrusted"` entries are inserted with role `user`: ```text <external_${key}>${value}</external_${key}> ``` `kind: "application"` entries are inserted with role `developer`: ```text <${key}>${value}</${key}> ``` Values are not escaped. Each value is truncated to 1k approximate tokens before wrapping. For `turn/start`, accepted additional context is inserted before normal user input. For `turn/steer`, additional context is merged only when the steer includes non-empty user input; context-only steers still reject as empty input. ## Dedupe Strategy `AdditionalContextStore` lives on session state and stores the latest complete additional-context map. Each `turn/start` or non-empty `turn/steer` treats its `additionalContext` as the current complete set of values. Entries are injected only when the key is new or the exact entry for that key changed, including `value` or `kind`. After merging, the store is replaced with the provided map, so omitted keys are removed from the retained set and can be injected again later if reintroduced. Omitting `additionalContext`, passing `null`, or passing an empty object resets the store to empty and injects nothing. ## What Changed - Threads experimental v2 `additionalContext` through app-server into core turn start and steer handling. - Adds separate contextual fragment types for untrusted user-role context and application developer-role context. - Uses pending response input items so additional context can be combined with normal user input without treating it as prompt text. - Adds integration coverage for start/steer flow, role routing, dedupe/reset behavior, deletion/re-add behavior, hook-blocked input behavior, empty context-only steer rejection, external-fragment marker matching, and truncation.pakrym-oai ·
2026-05-26 13:02:34 -07:00 -
Move MCP tool naming mode into manager (#21576)
## Why The `non_prefixed_mcp_tool_names` feature should be applied where MCP tools become model-visible, not by remapping names later in core. Keeping the decision in `McpConnectionManager` construction makes `ToolInfo` the single shaped view that spec building, deferred tool search, routing, and unavailable-tool placeholders can consume directly. This also preserves the existing external behavior while the feature is off, and keeps the feature-on behavior for code mode and hooks explicit at the manager boundary. ## What Changed - Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig` into `McpConnectionManager::new`. - Normalize MCP `ToolInfo` names in the manager using either legacy-prefixed namespaces or non-prefixed namespaces; the legacy path adds `mcp__` without restoring the old trailing namespace suffix. - Remove the core-side MCP name remapping path so specs, tool search, session resolution, and unavailable-tool placeholder construction use the manager-provided `ToolName` values directly. - Keep code mode flattening on the `__` namespace separator. - Preserve hook compatibility by giving non-prefixed MCP hook names legacy `mcp__...` matcher aliases. - Add/adjust integration and unit coverage for non-prefixed code-mode behavior, hook matching with the feature on and off, and manager-level legacy prefixing. ## Testing - `cargo test -p codex-mcp --lib` - `cargo test -p codex-core --lib tools::spec::tests -- --nocapture` - `cargo test -p codex-core --lib mcp_tools -- --nocapture` - `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture` - `cargo test -p codex-core --test all mcp_tool -- --nocapture` - `cargo test -p codex-core --test all search_tool -- --nocapture` - `cargo test -p codex-core --test all hooks_mcp -- --nocapture` - `cargo test -p codex-core --test all code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled -- --nocapture` - `cargo test -p codex-tools` - `cargo test -p codex-features`
pakrym-oai ·
2026-05-26 08:21:15 -07:00 -
[1 of 7] Add thread settings to UserInput (#23080)
**Stack position:** [1 of 7] ## Summary The first three PRs in this stack are a cleanup pass before the actual thread settings API work. Today, core has several overlapping "user input" ops: `UserInput`, `UserInputWithTurnContext`, and `UserTurn`. They differ mostly in how much next-turn state they carry, which makes the later queued thread settings update harder to reason about and review. This PR starts that cleanup by adding the shared `ThreadSettingsOverrides` payload and allowing `Op::UserInput` to carry it. Existing variants remain in place here, so this layer is mostly a behavior-preserving API shape change plus mechanical constructor updates. ## End State After PR3 By the end of PR3, `Op::UserInput` is the only "user input" core op. It can carry optional thread settings overrides for callers that need to update stored defaults with a turn, while callers without updates use empty settings. `Op::UserInputWithTurnContext` and `Op::UserTurn` are deleted. ## End State After PR5 By the end of PR5, core will have only two ops for this area: - `Op::UserInput` for user-input-bearing submissions. - `Op::ThreadSettings` for settings-only updates. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) (this PR) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 18:48:35 -07:00 -
Remove core MCP list tools op (#21281)
## Why The core `Op::ListMcpTools` request path is no longer needed. Keeping it around left a dead request/response surface alongside the app-server MCP inventory APIs that own current server status listing. ## What Changed - Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the core handler that built the MCP snapshot response. - Removed the now-unused `codex-mcp` snapshot wrapper/export and passive event handling arms in rollout and MCP-server consumers. - Updated tests that used the old op as a synchronization hook to wait on existing startup/skills events, and deleted the plugin test that only exercised the removed listing op. ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all pending_input::queued_inter_agent_mail` - `cargo test -p codex-core --test all rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta` - `cargo test -p codex-core --test all rmcp_client::stdio_image_responses` - `just fix -p codex-core -p codex-protocol -p codex-mcp -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
pakrym-oai ·
2026-05-06 11:20:34 -07:00 -
Support Codex Apps auth elicitations (#19193)
## Summary - request URL-mode MCP elicitations when Codex Apps tool calls fail with connector auth metadata - route Codex Apps auth URL elicitations into the TUI app-link flow ## Test plan - `just fmt` - `cargo test -p codex-core mcp_tool_call::tests` - `cargo test -p codex-mcp` - `cargo test -p codex-tui bottom_pane::app_link_view::tests` - `just fix -p codex-core` - `just fix -p codex-mcp` - `just fix -p codex-tui` Also attempted broader local runs: - `cargo test -p codex-core` fails in unrelated config/request-permission/proxy-sensitive tests under the current Codex Desktop environment. - `cargo test -p codex-tui` fails in unrelated status snapshots/trust-default tests because the ambient environment renders workspace-write/network permission defaults.
Matthew Zeng ·
2026-05-06 07:18:00 +00:00 -
Stabilize plugin MCP fixture tests (#19452)
## Why Recent `main` CI had repeated flakes in the plugin fixture tests: - `codex-core::all suite::plugins::explicit_plugin_mentions_inject_plugin_guidance` failed in runs [24909500958](https://github.com/openai/codex/actions/runs/24909500958), [24908076251](https://github.com/openai/codex/actions/runs/24908076251), [24906197645](https://github.com/openai/codex/actions/runs/24906197645), and [24898949647](https://github.com/openai/codex/actions/runs/24898949647). - `codex-core::all suite::plugins::plugin_mcp_tools_are_listed` failed in runs [24909500958](https://github.com/openai/codex/actions/runs/24909500958), [24908076251](https://github.com/openai/codex/actions/runs/24908076251), and [24898949647](https://github.com/openai/codex/actions/runs/24898949647). The failures were in the same plugin/MCP fixture family: assertions expected sample plugin guidance or tool inventory, but the test could observe the session before the sample MCP server had finished startup. ## Root Cause `explicit_plugin_mentions_inject_plugin_guidance` submitted the user turn immediately after constructing the session. MCP startup is asynchronous, so on a slower or busier CI runner the prompt could be built before the sample plugin MCP server had reported its tools. That made the test depend on scheduler timing rather than the fixture being ready. `plugin_mcp_tools_are_listed` already needed the same readiness condition, but its wait logic was local to that test. ## What Changed - Added a shared `wait_for_sample_mcp_ready` helper for the plugin fixture tests. - Wait for `McpStartupComplete` before submitting the explicit plugin mention turn. - Reuse the same readiness helper in the MCP tool-listing test. ## Why This Should Be Reliable The tests now wait for the explicit readiness signal from the sample MCP server before asserting guidance or tools derived from that server. This removes the startup race while still exercising the real fixture path, so the assertions should only run after the plugin inventory is deterministic. ## Verification - `cargo test -p codex-core --test all plugins::` - GitHub CI for this PR is passing.
Dylan Hurd ·
2026-04-28 01:14:44 +00:00 -
Stabilize plugin MCP tools test (#19191)
## Summary The plugin MCP tool-listing test could hide MCP startup failures by polling `ListMcpTools` until its own 30s deadline. If the plugin MCP server startup had already failed or timed out, the session-owned MCP manager would keep returning an empty tool list, so CI only reported `discovered tools: []` instead of the startup state that mattered. This makes the test synchronize on `McpStartupComplete` for the sample plugin MCP server before asserting listed tools, and gives the Bazel-launched test server a larger startup window. ## Notes Confidence is about 80%. The source path strongly supports the RCA: a failed MCP startup is represented as an empty tool list through `ListMcpTools`, so the old polling contract could not distinguish "not ready yet" from "startup already failed." I could not retrieve the CI execution-log artifact to confirm the exact hidden startup error, but the observed Ubuntu Bazel failure matches this path: repeated `ListMcpTools` responses with no tools until the test-local timeout fired. I think this is the right solution because it keeps plugin behavior unchanged and fixes only the test contract. Future startup failures should now report the `McpStartupComplete` failure/cancellation instead of timing out on an empty tool snapshot. This test was introduced in https://github.com/openai/codex/pull/12864.
Eric Traut ·
2026-04-23 14:08:40 -07:00 -
Add turn-scoped environment selections (#18416)
## Summary - add experimental turn/start.environments params for per-turn environment id + cwd selections - pass selections through core protocol ops and resolve them with EnvironmentManager before TurnContext creation - treat omitted selections as default behavior, empty selections as no environment, and non-empty selections as first environment/cwd as the turn primary ## Testing - ran `just fmt` - ran `just write-app-server-schema` - not run: unit tests for this stacked PR --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-04-21 17:48:33 -07:00 -
Update models.json (#18586)
- Replace the active models-manager catalog with the deleted core catalog contents. - Replace stale hardcoded test model slugs with current bundled model slugs. - Keep this as a stacked change on top of the cleanup PR.
Ahmed Ibrahim ·
2026-04-20 10:27:01 -07:00 -
register all mcp tools with namespace (#17404)
stacked on #17402. MCP tools returned by `tool_search` (deferred tools) get registered in our `ToolRegistry` with a different format than directly available tools. this leads to two different ways of accessing MCP tools from our tool catalog, only one of which works for each. fix this by registering all MCP tools with the namespace format, since this info is already available. also, direct MCP tools are registered to responsesapi without a namespace, while deferred MCP tools have a namespace. this means we can receive MCP `FunctionCall`s in both formats from namespaces. fix this by always registering MCP tools with namespace, regardless of deferral status. make code mode track `ToolName` provenance of tools so it can map the literal JS function name string to the correct `ToolName` for invocation, rather than supporting both in core. this lets us unify to a single canonical `ToolName` representation for each MCP tool and force everywhere to use that one, without supporting fallbacks.
sayan-oai ·
2026-04-15 21:02:59 +08:00 -
Forward app-server turn clientMetadata to Responses (#16009)
## Summary App-server v2 already receives turn-scoped `clientMetadata`, but the Rust app-server was dropping it before the outbound Responses request. This change keeps the fix lightweight by threading that metadata through the existing turn-metadata path rather than inventing a new transport. ## What we're trying to do and why We want turn-scoped metadata from the app-server protocol layer, especially fields like Hermes/GAAS run IDs, to survive all the way to the actual Responses API request so it is visible in downstream websocket request logging and analytics. The specific bug was: - app-server protocol uses camelCase `clientMetadata` - Responses transport already has an existing turn metadata carrier: `x-codex-turn-metadata` - websocket transport already rewrites that header into `request.request_body.client_metadata["x-codex-turn-metadata"]` - but the Rust app-server never parsed or stored `clientMetadata`, so nothing from the app-server request was making it into that existing path This PR fixes that without adding a new header or a second metadata channel. ## How we did it ### Protocol surface - Add optional `clientMetadata` to v2 `TurnStartParams` and `TurnSteerParams` - Regenerate the JSON schema / TypeScript fixtures - Update app-server docs to describe the field and its behavior ### Runtime plumbing - Add a dedicated core op for app-server user input carrying turn-scoped metadata: `Op::UserInputWithClientMetadata` - Wire `turn/start` and `turn/steer` through that op / signature path instead of dropping the metadata at the message-processor boundary - Store the metadata in `TurnMetadataState` ### Transport behavior - Reuse the existing serialized `x-codex-turn-metadata` payload - Merge the new app-server `clientMetadata` into that JSON additively - Do **not** replace built-in reserved fields already present in the turn metadata payload - Keep websocket behavior unchanged at the outer shape level: it still sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON string now contains the merged fields - Keep HTTP fallback behavior unchanged except that the existing `x-codex-turn-metadata` header now includes the merged fields too ### Request shape before / after Before, a websocket `response.create` looked like: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}" } } ``` Even if the app-server caller supplied `clientMetadata`, it was not represented there. After, the same request shape is preserved, but the serialized payload now includes the new turn-scoped fields: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}" } } ``` ## Validation ### Targeted tests added / updated - protocol round-trip coverage for `clientMetadata` on `turn/start` and `turn/steer` - protocol round-trip coverage for `Op::UserInputWithClientMetadata` - `TurnMetadataState` merge test proving client metadata is added without overwriting reserved built-in fields - websocket request-shape test proving outbound `response.create` contains merged metadata inside `client_metadata["x-codex-turn-metadata"]` - app-server integration tests proving: - `turn/start` forwards `clientMetadata` into the outbound Responses request path - websocket warmup + real turn request both behave correctly - `turn/steer` updates the follow-up request metadata ### Commands run - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `cargo test -p codex-core turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields --lib` - `cargo test -p codex-core --test all responses_websocket_preserves_custom_turn_metadata_fields` - `cargo test -p codex-app-server --test all client_metadata` - `cargo test -p codex-app-server --test all turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2 -- --nocapture` - `just fmt` - `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol -p codex-app-server` - `just fix -p codex-exec -p codex-tui-app-server` - `just argument-comment-lint` ### Full suite note `cargo test` in `codex-rs` still fails in: - `suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request` I verified that same failure on a clean detached `HEAD` worktree with an isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.neil-oai ·
2026-04-09 11:52:37 -07:00 -
core: remove cross-crate re-exports from lib.rs (#16512)
## Why `codex-core` was re-exporting APIs owned by sibling `codex-*` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-*` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.
Michael Bolin ·
2026-04-01 23:06:24 -07:00 -
[codex-analytics] thread events (#15690)
- add event for thread initialization - thread/start, thread/fork, thread/resume - feature flagged behind `FeatureFlag::GeneralAnalytics` - does not yet support threads started by subagents PR stack: - --> [[telemetry] thread events #15690](https://github.com/openai/codex/pull/15690) - [[telemetry] subagent events #15915](https://github.com/openai/codex/pull/15915) - [[telemetry] turn events #15591](https://github.com/openai/codex/pull/15591) - [[telemetry] steer events #15697](https://github.com/openai/codex/pull/15697) - [[telemetry] queued prompt data #15804](https://github.com/openai/codex/pull/15804) Sample extracted logs in Codex-backend ``` INFO | 2026-03-29 16:39:37 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bf7-9f5f-7f82-9877-6d48d1052531 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774827577 | INFO | 2026-03-29 16:45:46 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3b84-5731-79d0-9b3b-9c6efe5f5066 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=resumed subagent_source=None parent_thread_id=None created_at=1774820022 | INFO | 2026-03-29 16:45:49 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bfd-4cd6-7c12-a13e-48cef02e8c4d product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=forked subagent_source=None parent_thread_id=None created_at=1774827949 | INFO | 2026-03-29 17:20:29 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3c1d-0412-7ed2-ad24-c9c0881a36b0 product_surface=codex product_client_id=CODEX_SERVICE_EXEC client_name=codex_exec client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774830027 | ``` Notes - `product_client_id` gets canonicalized in codex-backend - subagent threads are addressed in a following pr
rhan-oai ·
2026-03-31 12:16:44 -07:00 -
Split features into codex-features crate (#15253)
- Split the feature system into a new `codex-features` crate. - Cut `codex-core` and workspace consumers over to the new config and warning APIs. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-03-19 20:12:07 -07:00 -
[apps] Add tool call meta. (#14647)
- [x] Add resource_uri and other things to _meta to shortcut resource lookup and speed things up.
Matthew Zeng ·
2026-03-14 22:24:13 -07:00 -
move plugin/skill instructions into dev msg and reorder (#14609)
Move the general `Apps`, `Skills` and `Plugins` instructions blocks out of `user_instructions` and into the developer message, with new `Apps -> Skills -> Plugins` order for better clarity. Also wrap those sections in stable XML-style instruction tags (like other sections) and update prompt-layout tests/snapshots. This makes the tests less brittle in snapshot output (we can parse the sections), and it consolidates the capability instructions in one place. #### Tests Updated snapshots, added tests. `<AGENTS_MD>` disappearing in snapshots is expected: before this change, the wrapped user-instructions message was kept alive by `Skills` content. Now that `Skills` and `Plugins` are in the developer message, that wrapper only appears when there is real project-doc/user-instructions content. --------- Co-authored-by: Charley Cunningham <ccunningham@openai.com>
sayan-oai ·
2026-03-13 20:51:01 -07:00 -
Normalize MCP tool names to code-mode safe form (#14605)
Code mode doesn't allow `-` in names and it's better if function names and code-mode names are the same.
pakrym-oai ·
2026-03-13 14:50:16 -07:00 -
chore: clarify plugin + app copy in model instructions (#14541)
- clarify app mentions are in user messages - clarify what it means for tools to be provided via `codex_apps` MCP - add plugin descriptions (with basic sanitization) to top-level `## Plugins` section alongside the corresponding plugin names - explain that skills from plugins are prefixed with `plugin_name:` in top-level `##Plugins` section changes to more logically organize `Apps`, `Skills`, and `Plugins` instructions will be in a separate PR, as that shuffles dev + user instructions in ways that change tests broadly. ### Tests confirmed in local rollout, some new tests.
sayan-oai ·
2026-03-13 10:57:41 -07:00 -
Add plugin usage telemetry (#14531)
adding metrics including: * plugin used * plugin installed/uninstalled * plugin enabled/disabled
alexsong-oai ·
2026-03-12 19:22:30 -07:00 -
feat: search_tool migrate to bring you own tool of Responses API (#14274)
## Why to support a new bring your own search tool in Responses API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search) we migrating our bm25 search tool to use official way to execute search on client and communicate additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch
Anton Panasenko ·
2026-03-11 17:51:51 -07:00 -
[apps] Fix apps enablement condition. (#14011)
- [x] Fix apps enablement condition to check both the feature flag and that the user is not an API key user.
Matthew Zeng ·
2026-03-09 22:25:43 -07:00 -
feat: structured plugin parsing (#13711)
#### What Add structured `@plugin` parsing and TUI support for plugin mentions. - Core: switch from plain-text `@display_name` parsing to structured `plugin://...` mentions via `UserInput::Mention` and `[$...](plugin://...)` links in text, same pattern as apps/skills. - TUI: add plugin mention popup, autocomplete, and chips when typing `$`. Load plugin capability summaries and feed them into the composer; plugin mentions appear alongside skills and apps. - Generalize mention parsing to a sigil parameter, still defaults to `$` <img width="797" height="119" alt="image" src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb" /> Builds on #13510. Currently clients have to build their own `id` via `plugin@marketplace` and filter plugins to show by `enabled`, but we will add `id` and `available` as fields returned from `plugin/list` soon. ####Tests Added tests, verified locally.
sayan-oai ·
2026-03-06 11:08:36 -08:00 -
add @plugin mentions (#13510)
## Note-- added plugin mentions via @, but that conflicts with file mentions depends and builds upon #13433. - introduces explicit `@plugin` mentions. this injects the plugin's mcp servers, app names, and skill name format into turn context as a dev message. - we do not yet have UI for these mentions, so we currently parse raw text (as opposed to skills and apps which have UI chips, autocomplete, etc.) this depends on a `plugins/list` app-server endpoint we can feed the UI with, which is upcoming - also annotate mcp and app tool descriptions with the plugin(s) they come from. this gives the model a first class way of understanding what tools come from which plugins, which will help implicit invocation. ### Tests Added and updated tests, unit and integration. Also confirmed locally a raw `@plugin` injects the dev message, and the model knows about its apps, mcps, and skills.
sayan-oai ·
2026-03-06 00:03:39 +00:00 -
feat: track plugins mcps/apps and add plugin info to user_instructions (#13433)
### first half of changes, followed by #13510 Track plugin capabilities as derived summaries on `PluginLoadOutcome` for enabled plugins with at least one skill/app/mcp. Also add `Plugins` section to `user_instructions` injected on session start. These introduce the plugins concept and list enabled plugins, but do NOT currently include paths to enabled plugins or details on what apps/mcps the plugins contain (current plan is to inject this on @-mention). that can be adjusted in a follow up and based on evals. ### tests Added/updated tests, confirmed locally that new `Plugins` section + currently enabled plugins show up in `user_instructions`.
sayan-oai ·
2026-03-04 19:46:13 -08:00 -
config: enforce enterprise feature requirements (#13388)
## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. - Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. - Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.
Michael Bolin ·
2026-03-04 04:40:22 +00:00 -
feat: load plugin apps (#13401)
load plugin-apps from `.app.json`. make apps runtime-mentionable iff `codex_apps` MCP actually exposes tools for that `connector_id`. if the app isn't available, it's filtered out of runtime connector set, so no tools are added and no app-mentions resolve. right now we don't have a clean cli-side error for an app not being installed. can look at this after. ### Tests Added tests, tested locally that using a plugin that bundles an app picks up the app.
sayan-oai ·
2026-03-03 16:29:15 -08:00 -
Refactor plugin config and cache path (#13333)
Update config.toml plugin entries to use <plugin_name>@<marketplace_name> as the key. Plugin now stays in [plugins/cache/marketplace-name/plugin-name/$version/] Clean up the plugin code structure. Add plugin install functionality (not used yet).
xl-openai ·
2026-03-03 15:00:18 -08:00 -
feat: load from plugins (#12864)
Support loading plugins. Plugins can now be enabled via [plugins.<name>] in config.toml. They are loaded as first-class entities through PluginsManager, and their default skills/ and .mcp.json contributions are integrated into the existing skills and MCP flows.
xl-openai ·
2026-03-01 10:50:56 -08:00