mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
e2f074e16c522bfa55d9bcd344a5ea0ba5a4580f
7552 Commits
-
code-mode: move cell state into library actor (#28599)
A code-mode cell is a single JavaScript execution that can produce output, call tools, wait for asynchronous work, resume, or be terminated. This PR extracts the existing per-cell run loop into a dedicated actor that owns the cell’s lifecycle state. It is primarily an ownership change rather than a new lifecycle contract: existing behavior now has one clear implementation boundary. ### Architecture The session service remains responsible for session-wide concerns: allocating cell IDs, storing shared values, creating cells, and routing requests to them. Once a cell is created, its execution state belongs to its actor. Callers interact with the actor through a handle. The actor receives two kinds of input: runtime events and control requests. A single event loop serializes these inputs and applies the lifecycle rules. It tracks the current observer—the caller waiting for an update—along with accumulated output, outstanding callbacks, runtime state, yield deadlines, and termination progress. Observation, termination, completion, and cleanup therefore have one consistent owner. When the runtime has no immediately runnable work and is waiting only on timers or tool results, the actor can return accumulated output and information about outstanding tool calls while keeping the cell available to resume. On completion or termination, it performs the appropriate callback cleanup before publishing the final result and removing the cell from the session. A small host interface connects the actor to session-owned facilities such as tool dispatch, notifications, stored values, and final cell removal, keeping those responsibilities outside the actor itself. ### Why Previously, cell lifecycle state and coordination lived alongside session management. The actor boundary makes each cell a self-contained state machine with a single writer, while the service becomes a registry and adapter around it. This makes lifecycle behavior easier to reason about and test in isolation. 
It also establishes a clean boundary for later changing where cells run or how they communicate without recreating their lifecycle rules.
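The single-writer loop described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `CellActor` class, the message kinds, and the result shape are all hypothetical names chosen for the example.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class CellActor:
    """Hypothetical single-writer owner of one cell's lifecycle state."""
    inbox: queue.Queue = field(default_factory=queue.Queue)
    output: list = field(default_factory=list)
    pending_tools: set = field(default_factory=set)
    finished: bool = False

    def run(self):
        # One loop serializes runtime events and control requests,
        # so only this method ever mutates the cell's state.
        while not self.finished:
            kind, payload = self.inbox.get()
            if kind == "output":
                self.output.append(payload)
            elif kind == "tool_call":
                self.pending_tools.add(payload)
            elif kind == "tool_result":
                self.pending_tools.discard(payload)
            elif kind in ("completed", "terminate"):
                # Clean up outstanding callbacks before publishing the result.
                self.pending_tools.clear()
                self.finished = True
        return {"output": self.output, "pending_tools": sorted(self.pending_tools)}


actor = CellActor()
for message in [("output", "hello"), ("tool_call", "t1"),
                ("tool_result", "t1"), ("completed", None)]:
    actor.inbox.put(message)
result = actor.run()
```

Because every input goes through one queue, observation, termination, and cleanup cannot race with each other in this sketch; the real actor additionally tracks observers, yield deadlines, and termination progress.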
Channing Conger ·
2026-06-16 19:28:55 -07:00 -
[codex] Support object-valued plugin MCP manifests (#28580)
## Summary This fixes plugin manifest parsing for MCP servers declared as an object directly in `plugin.json`. Before this change, Codex modeled `mcpServers` as only a string path, for example: ```json { "name": "counter-sample", "version": "1.1.1", "mcpServers": "./.mcp.json" } ``` Some migrated plugins instead provide the server map directly in the manifest: ```json { "name": "counter-sample", "version": "1.1.1", "description": "Plugin that declares MCP servers in the manifest", "mcpServers": { "counter": { "type": "http", "url": "https://sample.example/counter/mcp" } } } ``` That object form previously failed during install/load with an error like: ```text failed to parse plugin manifest: invalid type: map, expected a string ``` ## What changed - Add a manifest representation for `mcpServers` as either `Path(Resource)` or `Object(map)`. - Parse `plugin.json` `mcpServers` as either a string path or an object. - Route object-valued MCP server maps through the existing plugin MCP config parser instead of adding a second parser. - Apply existing per-plugin MCP server policy to object-valued MCP servers the same way as file-backed MCP servers. - Include object-valued MCP server names in plugin telemetry/capability metadata. - Support object-valued MCP config for executor plugins without requiring a `.mcp.json` filesystem read. - Update the bundled plugin-creator validator and `plugin-json-spec.md` so generated-plugin validation accepts the same object-valued shape. ## Compatibility Existing plugin manifests that use `"mcpServers": "./.mcp.json"` continue to work. Plugins can now also use the object shape shown above. 
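As a rough illustration of the "string path or object" rule, the sketch below distinguishes the two manifest shapes. The function name `parse_mcp_servers` and the `("path", ...)` / `("object", ...)` return tags are assumptions made for this example; the PR itself models the two cases as `Path(Resource)` and `Object(map)`.

```python
import json


def parse_mcp_servers(manifest: dict):
    """Return ("path", str) or ("object", dict) for the `mcpServers` field."""
    value = manifest.get("mcpServers")
    if value is None:
        return None  # the manifest declares no MCP servers
    if isinstance(value, str):
        return ("path", value)
    if isinstance(value, dict):
        return ("object", value)
    raise ValueError(
        f"mcpServers must be a string path or an object, got {type(value).__name__}"
    )


path_form = json.loads('{"name": "counter-sample", "mcpServers": "./.mcp.json"}')
object_form = json.loads(
    '{"name": "counter-sample", "mcpServers": '
    '{"counter": {"type": "http", "url": "https://sample.example/counter/mcp"}}}'
)
```

Both shapes then feed the same downstream config parsing, which is the point of routing the object form through the existing parser rather than adding a second one.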
## Tests Added coverage for the new manifest attribute shape at the install, normal load, telemetry, and executor-provider layers: - `install_accepts_manifest_mcp_server_objects` - `load_plugins_loads_manifest_mcp_server_objects` - `plugin_telemetry_metadata_uses_manifest_mcp_server_objects` - `reads_manifest_object_config_without_executor_file_system_access` Also smoke-tested the plugin-creator validator against both supported forms: - `mcpServers` as a direct object in `plugin.json` - `mcpServers` as `"./.mcp.json"` with a companion `.mcp.json` ## Validation - `just test -p codex-plugin` - `just test -p codex-core-plugins` - `just test -p codex-mcp-extension` - `just bazel-lock-update` - `just bazel-lock-check` - `just fmt` - `git diff --check` - Focused rename/object-form rerun: `just test -p codex-core-plugins manager::tests::load_plugins_loads_manifest_mcp_server_objects manager::tests::plugin_telemetry_metadata_uses_manifest_mcp_server_objects store::tests::install_accepts_manifest_mcp_server_objects` - Focused executor rerun: `just test -p codex-mcp-extension executor_plugin::provider::tests::reads_manifest_object_config_without_executor_file_system_access` - `python3 codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py /private/tmp/codex-validator-object` - `python3 codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py /private/tmp/codex-validator-path`
charlesgong-openai ·
2026-06-16 19:22:57 -07:00 -
thread-store: fix response fixture compilation (#28642)
## Why A `codex-thread-store` test fixture still constructs `ResponseItem::FunctionCallOutput` without its required `metadata` field, preventing the crate's test targets from compiling on `main`. ## What changed - Set the fixture's response-item metadata to `None`. ## Testing - `cargo check -p codex-thread-store --tests`
pakrym-oai ·
2026-06-16 19:16:16 -07:00 -
[codex] core: restore absolute turn context cwd (#28629)
## Why #28152 jumped the gun on moving the rollout format to store URIs, and would likely break compatibility with some features that don't go through the same types as the core logic. ## What Make `TurnContextItem.cwd` an `AbsolutePathBuf` again and remove the test added for `PathUri` serialization in rollouts. Also drop a bunch of error paths that are no longer needed.
Adam Perry @ OpenAI ·
2026-06-16 19:05:26 -07:00 -
[codex] Gate remote plugin catalog by auth (#28625)
## Summary - Treat the remote global plugin catalog as active only when `remote_plugin` is enabled and the current auth uses the Codex backend. - Skip the local OpenAI curated marketplace for remote-enabled ChatGPT users while preserving configured marketplaces. - Keep the local curated marketplace for API-key users, unauthenticated fallback, and ChatGPT users with `remote_plugin` disabled. - Apply the same effective-remote gate to the remote installed-marketplace cache. ## Root cause The tool-suggestion discovery path unconditionally included the local OpenAI curated marketplace. For remote-enabled ChatGPT users, that made remote discovery additive: Codex parsed every local curated `plugin.json` before also loading the remote catalog. ## Validation - `just fmt` - `cargo build -p codex-cli --bin codex` - Targeted auth/feature matrix tests pass, including API-key auth with `remote_plugin` enabled. - Manual CLI validation confirmed: - ChatGPT + remote off includes local curated. - ChatGPT + remote on excludes local curated. - API-key auth keeps local curated when remote is enabled. - `just test -p codex-core-plugins`: 235 passed; one unrelated existing marketplace test failed because it loaded the developer's home marketplace configuration.
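The auth/feature matrix above reduces to a small predicate. The sketch below is only an illustration: the function names and the `"chatgpt"`, `"api_key"`, and `"none"` labels are hypothetical, and it assumes ChatGPT auth is the case that "uses the Codex backend".

```python
def remote_catalog_active(remote_plugin_enabled: bool, auth: str) -> bool:
    # Assumption: "chatgpt" stands for auth that uses the Codex backend.
    return remote_plugin_enabled and auth == "chatgpt"


def include_local_curated(remote_plugin_enabled: bool, auth: str) -> bool:
    # The local curated marketplace is skipped only when the remote
    # catalog is effectively active; configured marketplaces are unaffected.
    return not remote_catalog_active(remote_plugin_enabled, auth)


matrix = {
    (enabled, auth): include_local_curated(enabled, auth)
    for enabled in (False, True)
    for auth in ("chatgpt", "api_key", "none")
}
```

Only the `(True, "chatgpt")` entry excludes the local curated marketplace, which matches the manual CLI validation listed in the PR.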
xl-openai ·
2026-06-16 17:24:48 -07:00 -
Revert "Tell codex about PathUri serde compat. (#28595)" (#28627)
This reverts commit bd2a786326, which didn't capture all the nuance we need for this migration.
Adam Perry @ OpenAI ·
2026-06-16 17:18:20 -07:00 -
Add thread recencyAt for sidebar ordering (#27910)
## Summary Add a server-owned `recencyAt` timestamp and `recency_at` thread-list sort key for product recency ordering while preserving the existing meaning of `updatedAt` as the latest persisted thread mutation. This is the server-side alternative to #27697. Rather than narrowing `updatedAt`, clients can sort the sidebar by `recency_at` and continue treating `updatedAt` as mutation time. Paired Codex Apps PR: [openai/openai#1024599](https://github.com/openai/openai/pull/1024599) ## Contract - `recencyAt` initializes when a thread is created. - A turn start advances `recencyAt` monotonically. - Commentary, agent output, tool results, token/accounting updates, turn completion, archive, unarchive, resume, and generic metadata writes do not advance it. - `updatedAt` retains its existing behavior and continues to advance for persisted thread mutations. - Current servers populate `recencyAt`; the response field is optional in generated TypeScript so clients connected to older servers can fall back to `updatedAt`. - Filesystem-only fallback uses existing updated/mtime ordering when SQLite is unavailable. ## Persistence and compatibility Migration 0038 adds second- and millisecond-precision recency columns, backfills them from the existing updated timestamp, creates list indexes, and includes an insert trigger so older binaries writing to a migrated database seed recency without causing later mutations to advance it. Generic metadata upserts preserve existing recency values. Turn-start updates use a dedicated monotonic touch, and process-local allocation keeps millisecond cursor values unique. State DB list, search, read, filtered-list repair, rollout fallback propagation, and app-server conversions all carry the new field. ## API `Thread` responses include: ```ts recencyAt?: number ``` `thread/list` and `thread/search` accept: ```json { "sortKey": "recency_at" } ``` Generated TypeScript and JSON schemas are included. 
## Validation - `just test -p codex-state` — 146 passed - `just test -p codex-rollout` — 69 passed - `just test -p codex-thread-store` — 81 passed - `just test -p codex-app-server-protocol` — 231 passed - Focused app-server list ordering, response mapping, archive/unarchive, and resume lifecycle tests passed - Scoped `just fix` for state, rollout, thread-store, app-server-protocol, and app-server - `just fmt` - `git diff --check` - Independent correctness, simplicity, elegance, security, and test-quality reviews; actionable ordering, lifecycle, query-projection, and timestamp-uniqueness findings were addressed
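The contract above has two pieces that are easy to sketch: a monotonic touch on turn start, and a client-side sort that falls back to `updatedAt` when an older server omits `recencyAt`. The helper names and the newest-first ordering are assumptions made for this example, not the server's or client's actual code.

```python
def touch_recency(current_ms: int, now_ms: int) -> int:
    """Advance recency monotonically: a stale clock never moves it backwards."""
    return max(current_ms, now_ms)


def sidebar_sort_key(thread: dict) -> int:
    # Older servers omit recencyAt, so fall back to updatedAt.
    recency = thread.get("recencyAt")
    return thread["updatedAt"] if recency is None else recency


threads = [
    {"id": "a", "updatedAt": 300},                    # from an older server
    {"id": "b", "updatedAt": 500, "recencyAt": 100},  # mutated recently, no new turn
    {"id": "c", "updatedAt": 200, "recencyAt": 400},  # most recent turn start
]
# Assumption: the sidebar lists the most recent thread first.
ordered = [t["id"] for t in sorted(threads, key=sidebar_sort_key, reverse=True)]
```

Thread `b` sorts last even though it has the newest `updatedAt`, which is the distinction between mutation time and product recency that the PR introduces.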
Jeremy Rose ·
2026-06-16 17:06:22 -07:00 -
PAC 1 - Add system proxy feature config surface (#26706)
## Summary Introduces the default-off `respect_system_proxy` feature flag used to gate first-class system PAC/proxy support for Codex-owned native clients. With the feature disabled or absent, behavior remains unchanged. This PR establishes the configuration and managed-requirement surface; proxy discovery and request routing are implemented by follow-up PRs. ## Configuration User configuration uses the standard boolean feature form: ```toml [features] respect_system_proxy = true ``` Managed feature requirements use the corresponding boolean key. The effective runtime configuration is exposed as a boolean and defaults to `false`. ## Implementation - Registers `respect_system_proxy` as an under-development, default-off feature. - Resolves user configuration and managed feature requirements into `Config.respect_system_proxy`. - Provides bootstrap resolution for startup paths that must evaluate the feature before full configuration loading completes. - Uses the standard feature CLI and config-editing behavior. - Excludes `features.respect_system_proxy` from project-local configuration. - Updates the generated configuration schema. ## End-user behavior - No networking behavior changes when the feature is absent or disabled. - Enabling the feature makes the boolean available to the native proxy-routing implementation in follow-up PRs. - Repository-local configuration cannot enable the feature. ## Test coverage Covers scalar configuration and CLI override resolution, managed requirement constraints, bootstrap resolution, and project-local filtering.
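A minimal sketch of the resolution rule, assuming configuration layers are plain dictionaries: the user layer may enable the flag, the project-local layer is never consulted for it, and the default is `false`. The function name is hypothetical and managed requirements are left out of this illustration.

```python
def resolve_respect_system_proxy(user_config: dict, project_config: dict) -> bool:
    # project_config is accepted but intentionally never read for this
    # feature: repository-local configuration cannot enable it.
    features = user_config.get("features", {})
    return features.get("respect_system_proxy", False) is True


enabled_by_user = resolve_respect_system_proxy(
    {"features": {"respect_system_proxy": True}}, {}
)
enabled_by_project = resolve_respect_system_proxy(
    {}, {"features": {"respect_system_proxy": True}}
)
```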
canvrno-oai ·
2026-06-16 16:54:37 -07:00 -
[codex] [4/4] Simplify recommended plugin install schema (#28403)
## Summary - Simplify recommendation-context `request_plugin_install` arguments to `plugin_id` and `suggest_reason`. - Derive plugin type and install action from the matched candidate while preserving Codex-owned elicitation metadata. - Keep the legacy list-backed schema unchanged and accept resumed calls that still use `tool_id`. ## Stack - #28399 - #28400 - #27704 - This PR ## Validation - `just test -p codex-tools -p codex-core request_plugin_install` (25 passed) - `just fix -p codex-tools -p codex-core` - `just fmt` - `git diff --check`
Alex Daley ·
2026-06-16 23:44:42 +00:00 -
core: render remote environment cwd natively (#28152)
## Why Model-visible `<environment_context>` should match the environment of the executor, not of the app server. Stacked on #28146. ## What - Keep selected environment cwd values as `PathUri` while building environment context. - Render cwd text using the path convention represented by the URI, with the canonical URI as a fallback. - Preserve compatibility with legacy `TurnContextItem.cwd` values when reconstructing and diffing context. - Extend the Wine-backed remote Windows test to assert that the model sees `powershell` and `C:\windows`.
Adam Perry @ OpenAI ·
2026-06-16 16:17:47 -07:00 -
[codex] [3/4] Activate endpoint plugin recommendations (#27704)
Summary
- Await endpoint recommendation selection while constructing each authenticated turn, removing the first-turn cache race.
- Snapshot and filter endpoint candidates once per turn, then use that same set for the bounded contextual user fragment, tool exposure, and exact install validation.
- Keep recommendation selection ephemeral: do not persist recommendation state in or gate resumed threads on prior context.
- Hide the legacy list tool in endpoint mode and preserve legacy discovery unchanged when the endpoint is disabled or unavailable.
- Keep remote plugin and connector app identities out of model-visible context and attach them only to Codex-owned elicitation metadata.

Stack
- 3/4, based on #28400.
- Endpoint client and cache: #28399.
- Generalized suggestion presentation: #28400.
- Install-schema follow-up: #28403.

Validation
-
-
-
-
- Full : 2,649 passed and 88 environment-dependent tests failed because this sandbox cannot write , nest Seatbelt, or locate auxiliary test binaries.
Alex Daley ·
2026-06-16 23:04:07 +00:00 -
[codex] [2/4] Generalize plugin suggestion presentation (#28400)
Summary - Add list-backed and developer-context presentations for plugin suggestion candidates. - Let tool planning, install validation, and request-tool copy follow the selected presentation. - Keep every production caller on the existing list-backed presentation, preserving the current list tool, request schema, connector behavior, and model-visible copy. - Leave developer-context presentation latent until the final PR in the stack. Stack - 2/4, based on #28399. - Follow-up: #27704 activates endpoint recommendations. Validation - `just test -p codex-core request_plugin_install` - `just test -p codex-core spec_plan` - `just fix -p codex-core` - `just fmt` - `git diff --check`
Alex Daley ·
2026-06-16 22:44:10 +00:00 -
[codex] [1/4] Add recommended plugin endpoint cache (#28399)
Summary - Add authenticated parsing for `/ps/plugins/suggested?scope=GLOBAL`, including remote plugin and connector app identities. - Validate, deduplicate, sort, and cap endpoint candidates before caching them by backend and account identity. - Deduplicate concurrent cache misses and warm recommendations from the existing remote-installed-plugin refresh path used at startup and after account changes. - Keep endpoint results model-invisible in this PR; failures and responses without `enabled: true` resolve to legacy mode. Stack - 1/4. Follow-up: #28400 generalizes plugin suggestion presentation without activating endpoint recommendations. - Final activation: #27704. Validation - `just test -p codex-core-plugins recommended_plugins` - `just fix -p codex-core-plugins` - `just fmt` - `git diff --check`
Alex Daley ·
2026-06-16 22:22:21 +00:00 -
Tell codex about PathUri serde compat. (#28595)
This addresses another wrinkle I keep having to re-prompt codex about when migrating to cross-OS paths.
Adam Perry @ OpenAI ·
2026-06-16 15:01:22 -07:00 -
app-server: preserve target-native environment cwd (#28146)
## Why app-server may run on a different OS from the selected exec-server environment. Parsing that environment’s cwd with the Codex host’s path rules prevents thread startup. ## What Carry environment cwd values as `LegacyAppPathString` at the app-server boundary and `PathUri` internally. Existing tool-call schemas and relative-path behavior stay host-native; remaining local-only consumers convert explicitly and leave follow-up TODOs. The Wine integration test verifies app-server can start a thread and complete an ordinary turn with a Windows environment cwd from Linux. ## Validation - `bazel test //codex-rs/core/tests/remote_env_windows:smoke-test --test_output=errors` - focused app-server environment-selection and protocol schema tests - scoped Clippy for `codex-core` and `codex-app-server-protocol`
Adam Perry @ OpenAI ·
2026-06-16 21:42:28 +00:00 -
Record invariants for path migration. (#28589)
## Why Help Codex understand how to execute the migration to support cross-OS paths. ## What Expand the path-types skill with our goals and constraints.
Adam Perry @ OpenAI ·
2026-06-16 21:05:32 +00:00 -
Clarify model-generated and legacy app path types (#28577)
## Why `ApiPathString` kind of implies that it can be used anywhere we pull a path out of JSON, but it's not really appropriate for tool arguments when the model might generate relative paths. Prefer `String` for model-generated paths; we can handle the conversion per feature for now and define a shared abstraction later if it makes sense. ## What Rename `ApiPathString` to `AppLegacyPathString` to clarify its role. Expand the `path-types` skill to tell the model to leave tool args as bare strings.
Adam Perry @ OpenAI ·
2026-06-16 20:47:43 +00:00 -
[codex] test exec relative additional permissions (#28587)
## Why Review caught some would-be regressions in changes to unified_exec that weren't surfaced in CI. ## What Add coverage for requesting permissions through unified exec when there are additional permissions. Previously this flow was only tested against shell_command.
Adam Perry @ OpenAI ·
2026-06-16 20:45:57 +00:00 -
code-mode: extend test coverage to lock in cell lifecycle (#28468)
This PR establishes the intended behavior as an executable contract before a refactor of the cell runtime begins. It also fixes cases where a second observer or termination request could replace an existing response channel and leave the original caller unresolved.

### Behavior codified

- A cell can yield output and subsequently resume to completion.
- A caller can run a cell until it has no immediately runnable work, receive its accumulated output and outstanding tool-call IDs, and then resume the same cell when the awaited work is available.
- Each cell admits one active observer:
  - a second observer receives an explicit busy error
  - the existing observer remains registered and is not displaced
- A natural result (conclusion of the JS module) that has already reached the cell controller wins over a later termination request.
- Otherwise, termination preempts execution and resolves both:
  - the active observer, if present
  - the caller requesting termination
- Repeated termination requests are rejected while termination is already in progress.
- Terminal responses are sent only after outstanding callback work has been handled:
  - natural completion drains notifications and cancels outstanding tool calls
  - termination cancels and drains both notification and tool callbacks
- Cell removal and the `cell_closed` notification happen after callback cleanup.
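The observer and termination rules above can be sketched as a tiny state holder. This is an illustration under assumed names (`Cell`, `CellBusy`, string caller IDs), not the test suite or the runtime from the PR, and it omits the callback draining that precedes terminal responses.

```python
class CellBusy(Exception):
    """Raised when a request conflicts with one already in flight."""


class Cell:
    def __init__(self):
        self.observer = None
        self.terminating = False

    def observe(self, caller):
        # A second observer gets an explicit error; the first stays registered.
        if self.observer is not None:
            raise CellBusy("cell already has an active observer")
        self.observer = caller

    def terminate(self, caller):
        if self.terminating:
            raise CellBusy("termination already in progress")
        self.terminating = True
        # Termination resolves the active observer (if any) and the requester.
        resolved = [c for c in (self.observer, caller) if c is not None]
        self.observer = None
        return resolved


cell = Cell()
cell.observe("first")
try:
    cell.observe("second")
    second_outcome = "registered"
except CellBusy:
    second_outcome = "busy"
observer_after_conflict = cell.observer
resolved = cell.terminate("terminator")
```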
Channing Conger ·
2026-06-16 13:34:16 -07:00 -
[codex] re-enable absolute workdir integration test (#28581)
## Why In #28146 I missed the invariant that an absolute `exec_command` workdir must override the environment cwd. The existing integration test would have caught that regression, but it was ignored as flaky. ## What Re-enable `unified_exec_respects_workdir_override`. ## Validation `just test -p codex-core unified_exec_respects_workdir_override`
Adam Perry @ OpenAI ·
2026-06-16 20:19:41 +00:00 -
[codex-app-server-test-client] Plugin Install/Uninstall Analytics Smoke Test (#27100)
## This PR The original [combined remote plugin analytics PR #26281](https://github.com/openai/codex/pull/26281) mixed reusable analytics test infrastructure, two manual smoke workflows, a metadata refactor, and the final identity behavior. This PR adds the account-mutating validation workflow separately so its cleanup and recovery guarantees can be reviewed without the final analytics behavior change. - Add a manually invoked remote plugin install/uninstall smoke workflow. - Require explicit account-mutation confirmation and an initially uninstalled plugin. - Validate the current `codex_plugin_installed` contract, where `plugin_id` is the backend ID. - Restore and verify the original uninstalled state, with a dedicated recovery command. This baseline intentionally does not require `codex_plugin_uninstalled`, because production does not emit that event yet. The final PR will update this smoke to require local `plugin_id`, `remote_plugin_id`, and uninstall emission. Review this PR as the net diff against #27099. ## Testing - `just test -p codex-app-server-test-client` (3 focused capture/validation tests passed) - The live workflow was previously exercised on the green combined reference branch, and the original uninstalled account state was restored. - CI is green across the required platform matrix. ## Split Overview ```text main ├── #27093 Debug analytics capture │ └── #27099 Non-mutating plugin smoke │ └── #27100 Remote install/uninstall smoke ← you are here └── #27102 Plugin telemetry metadata refactor After #27093, #27099, #27100, and #27102 merge: └── Final PR: add remote_plugin_id to plugin analytics ``` Review order and dependencies: 1. [#27093 Add debug-only analytics event capture](https://github.com/openai/codex/pull/27093) (based on `main`) 2. [#27099 Add a plugin analytics smoke workflow](https://github.com/openai/codex/pull/27099) (stacked on #27093) 3. 
[#27100 Add a remote plugin analytics mutation smoke workflow](https://github.com/openai/codex/pull/27100) **(this PR, stacked on #27099)** 4. [#27102 Centralize plugin telemetry metadata construction](https://github.com/openai/codex/pull/27102) (independent, based on `main`) 5. Final remote-ID behavior PR (created after PRs 1-4 merge) The original [#26281](https://github.com/openai/codex/pull/26281) remains open as the green aggregate reference until the final PR is published.
jameswt-oai ·
2026-06-16 12:28:45 -07:00 -
[codex] Route MCP file uploads through environment filesystem (#27923)
## Why Codex Apps tools can mark arguments with `openai/fileParams`, but the execution path resolved and opened those files directly on the host. That bypassed the selected turn environment and prevented annotated file arguments from working with remote environments. ## What changed - resolve annotated file arguments against the primary turn environment - read file metadata and contents through that environment's sandboxed `ExecutorFileSystem` - reject files over the 512 MiB limit from metadata before reading or transferring them - retain the buffered upload-size check as defense in depth - make the OpenAI upload API accept a filename and buffered contents instead of owning local filesystem access - describe the model-visible argument as a path in the primary environment This builds on #27927, which added `size` to internal filesystem metadata. ## Testing - `just test -p codex-api upload_openai_file_returns_canonical_uri` - `just test -p codex-mcp tool_with_model_visible_input_schema_masks_file_params` - `just test -p codex-core mcp_openai_file` - `just test -p codex-core codex_apps_file_params_upload_environment_files_before_mcp_tool_call`
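The two size checks described above (metadata first, buffered contents second) might look like the sketch below. `InMemoryFs` is a hypothetical stand-in for the environment filesystem and `read_for_upload` is an invented helper name; only the 512 MiB figure comes from the PR.

```python
MAX_UPLOAD_BYTES = 512 * 1024 * 1024  # 512 MiB limit named in the PR


class InMemoryFs:
    """Hypothetical stand-in for an environment filesystem."""

    def __init__(self, files, sizes=None):
        self.files = files
        self.sizes = sizes or {}  # optional overrides for reported metadata
        self.reads = 0

    def size(self, path):
        return self.sizes.get(path, len(self.files[path]))

    def read(self, path):
        self.reads += 1
        return self.files[path]


def read_for_upload(fs, path):
    # Reject from metadata before any bytes are read or transferred.
    if fs.size(path) > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path} exceeds the upload limit")
    contents = fs.read(path)
    # Defense in depth: re-check the buffered contents.
    if len(contents) > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path} exceeds the upload limit")
    return contents


fs = InMemoryFs(
    {"a.txt": b"hi", "big.bin": b""},
    sizes={"big.bin": MAX_UPLOAD_BYTES + 1},
)
small = read_for_upload(fs, "a.txt")
```

The metadata check matters because it avoids pulling an oversized file across a remote environment boundary just to reject it afterwards.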
pakrym-oai ·
2026-06-16 11:27:46 -07:00 -
ci: run code-mode unit tests on all bazel targets (#28562)
## Why V8 should be stable under Bazel, so the `codex-code-mode` unit tests should run across the Bazel platform matrix. If these tests prove unstable, we should fix the tests rather than exclude them from CI. ## What changed - Remove the explicit `//codex-rs/code-mode:code-mode-unit-tests` exclusion from the macOS and Linux Bazel test jobs. - Remove the same exclusion from the native Windows post-merge job. - Keep the existing Windows gnullvm shard coverage. ## Bazel test coverage The target contains 26 unit tests. A fresh uncached local Bazel execution ran all 26 with 0 failures, 0 ignored tests, and 0 filtered tests. PR Bazel CI selected the target on every enabled platform and reported a cached pass: | Platform | Passing CI job | | --- | --- | | macOS aarch64 | [Bazel test passed](https://github.com/openai/codex/actions/runs/27636617545/job/81725447804) | | macOS x86_64 | [Bazel test passed in 2.2s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448008) | | Linux GNU | [Bazel test passed in 0.4s](https://github.com/openai/codex/actions/runs/27636617545/job/81725447898) | | Linux musl | [Bazel test passed in 0.4s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448117) | | Windows gnullvm | [Bazel test passed in shard 4/4 in 1.6s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448166) |
Channing Conger ·
2026-06-16 11:26:33 -07:00 -
feat(tui): add rate-limit reset redemption to /usage (#28154)
## Why Codex users can earn personal rate-limit reset credits, but the CLI does not currently provide a way to view or redeem them. The `/usage` command restored in #27925 is intended to be the entry point for usage-related actions, so reset redemption belongs there rather than in a separate dashed slash command. Depends on #28143 for the app-server and backend-client reset-credit APIs. ## What changed - Turn bare `/usage` into a menu with entries for token activity and earned rate-limit resets while preserving `/usage daily`, `/usage weekly`, and `/usage cumulative`. - Add loading, empty, confirmation, success, retry, and error states with a caller-generated UUID idempotency key reused across retries of the same logical reset. - Show an availability hint only for backend-classified rate-limit errors with credits available. - Hide the reset entry for workspace accounts. ## Validation - `just test -p codex-tui chatwidget::tests::usage` — 19 passed. - `just fix -p codex-tui` — passed. - `just fmt` — passed. - `cargo insta pending-snapshots` from `codex-rs/tui` — no pending snapshots. ## Examples <img width="1168" height="304" alt="image" src="https://github.com/user-attachments/assets/caa4c1e3-e996-494d-ae17-50b521f5dce8" /> <img width="908" height="260" alt="image" src="https://github.com/user-attachments/assets/e38a726b-77cc-4bd0-9ea8-9f3ad21c5768" /> ### Reset flow <img width="1509" height="312" alt="image" src="https://github.com/user-attachments/assets/d987013c-78a5-48a2-ad8d-c61ad267a327" /> <img width="585" height="190" alt="image" src="https://github.com/user-attachments/assets/de32be19-79b9-4a3e-8574-6f1c208c98ae" /> <img width="600" height="210" alt="image" src="https://github.com/user-attachments/assets/88a165cf-796d-4fdc-a7bc-ea89917573da" /> <img width="512" height="193" alt="image" src="https://github.com/user-attachments/assets/d2353998-5aa8-442e-a5f8-3a8a5b832753" />
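The idempotency-key behavior can be illustrated with a short retry loop: the key is generated once per logical reset and reused on every retry, so the backend can recognize duplicates. The `redeem_reset` helper, the `send` callback, and the use of `ConnectionError` as the retryable failure are assumptions for this sketch.

```python
import uuid


def redeem_reset(send, max_attempts=3):
    # One key per logical reset, generated by the caller and reused on retry.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(key)
        except ConnectionError as err:
            last_error = err
    raise last_error


seen_keys = []


def flaky_send(key):
    # Hypothetical transport that fails twice before succeeding.
    seen_keys.append(key)
    if len(seen_keys) < 3:
        raise ConnectionError("transient failure")
    return "redeemed"


outcome = redeem_reset(flaky_send)
```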
jay ·
2026-06-16 17:59:40 +00:00 -
Add incremental thread history changes
Add ThreadHistoryBuilder APIs for collecting incremental thread item and turn changes while applying rollout items. Batch handling coalesces repeated changes so callers can get the latest incremental thread item changes for a set of rollout items without rebuilding full history.
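Coalescing repeated changes can be pictured as keeping only the latest snapshot per item. The sketch below is a simplified illustration with hypothetical `(item_id, snapshot)` tuples; it does not reflect the actual `ThreadHistoryBuilder` types.

```python
def coalesce_changes(changes):
    """Keep the most recent snapshot for each item ID."""
    latest = {}
    for item_id, snapshot in changes:
        # A later change overwrites the earlier one; the dict keeps the
        # position at which each item first appeared.
        latest[item_id] = snapshot
    return latest


batch = [("item-1", "draft"), ("item-2", "started"), ("item-1", "final")]
coalesced = coalesce_changes(batch)
```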
Tom ·
2026-06-16 10:56:29 -07:00 -
[codex] Warn clearly when code mode output is truncated (#28467)
## Summary - make `formatted_truncate_text` prepend `Warning: truncated output (original token count: N)` above the existing `Total output lines` header - update direct formatter, unified-exec, user-shell, and code-mode expectations - add core unit coverage that runs in Bazel without requiring the skipped V8-backed code-mode integration suite ## Validation - `cargo test -p codex-utils-output-truncation -- --nocapture` (17 passed) - `cargo test -p codex-core --lib truncated_text_output_starts_with_warning -- --nocapture` - `cargo test -p codex-core --test all clamps_model_requested_max_output_tokens_to_policy -- --nocapture` (2 passed) - `cargo test -p codex-core --test all unified_exec_formats_large_output_summary -- --nocapture` - `cargo test -p codex-core --test all user_shell_command_output_is_truncated_in_history -- --nocapture` - Bazel CI exercises the shared formatter and downstream integration expectations
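A rough picture of the new output shape: the warning line comes first, followed by the existing header and the retained text. The warning wording is taken from the summary above; the function signature and the exact `Total output lines: N` header layout are assumptions for this sketch.

```python
def format_truncated_text(full_text: str, kept_text: str, original_token_count: int) -> str:
    total_lines = len(full_text.splitlines())
    return (
        f"Warning: truncated output (original token count: {original_token_count})\n"
        # Assumed header layout; the PR only says the warning precedes it.
        f"Total output lines: {total_lines}\n\n"
        f"{kept_text}"
    )


formatted = format_truncated_text("a\nb\nc\nd", "a\n...\nd", 1200)
```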
Ahmed Ibrahim ·
2026-06-16 10:37:06 -07:00 -
fix(tui): highlight C++ module files (#28554)
## Why Codex syntax-highlights diffs for conventional C++ extensions such as `.cpp` and `.cxx`, but C++ module interface files using `.cppm`, `.ixx`, or `.cxxm` fall back to plain diff coloring. The bundled syntax set already includes C++, but it does not resolve those module extensions by itself. Closes #28223. ## What changed - map `.cppm`, `.ixx`, and `.cxxm` to the existing `cpp` syntax in `render/highlight.rs` - extend alias-resolution coverage for all three module extensions - verify `.cpp`, `.cppm`, `.ixx`, and `.cxxm` diffs produce syntax-highlighted RGB spans while unknown extensions retain the plain fallback - snapshot the syntax-colored token segmentation for the supported C++ module extensions ## How to Test 1. Ask Codex to create or modify a C++ module interface file using `.cppm`, `.ixx`, or `.cxxm`. 2. Confirm C++ tokens in the rendered diff receive syntax colors instead of only the red/green diff treatment. 3. Modify an equivalent `.cpp` file and confirm its existing highlighting remains unchanged. 4. Modify a file with an unknown extension and confirm it still uses the plain diff fallback. Targeted tests: - `just test -p codex-tui -E 'test(find_syntax_resolves_languages_and_aliases) | test(cpp_module_extensions_use_cpp_highlighting) | test(unknown_extension_falls_back_without_syntax_highlighting)'`
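The alias mapping amounts to a lookup from extension to an existing syntax name. The sketch below is illustrative only: `syntax_for` and the two extension sets are hypothetical, and the real change lives in `render/highlight.rs`.

```python
import os

CONVENTIONAL_CPP_EXTENSIONS = {"cpp", "cxx"}
CPP_MODULE_EXTENSIONS = {"cppm", "ixx", "cxxm"}


def syntax_for(path: str):
    """Return the syntax name for a path, or None for the plain fallback."""
    extension = os.path.splitext(path)[1].lstrip(".")
    if extension in CONVENTIONAL_CPP_EXTENSIONS or extension in CPP_MODULE_EXTENSIONS:
        return "cpp"
    return None


resolved = {name: syntax_for(name) for name in
            ["math.cpp", "math.cppm", "math.ixx", "math.cxxm", "notes.xyz"]}
```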
Felipe Coury ·
2026-06-16 17:33:13 +00:00 -
[codex-app-server-test-client & codex-app-server] Plugin Usage Analytics Smoke Test (#27099)
## This PR The original [combined remote plugin analytics PR #26281](https://github.com/openai/codex/pull/26281) mixed reusable analytics test infrastructure, two manual smoke workflows, a metadata refactor, and the final identity behavior. This PR establishes a non-mutating end-to-end plugin smoke workflow before any analytics identity semantics change. - Add `plugin-analytics-smoke` to the existing app-server test client. - Exercise plugin disable, enable, and use through production app-server RPC paths. - Isolate config writes in a temporary file and use a loopback Responses API server. - Capture analytics without sending them to the production analytics backend. - Validate the current local `plugin_id`, names, capability metadata, thread, turn, and model fields. This is intentionally a baseline smoke workflow. It does not assert `remote_plugin_id`; the final PR will update it when that field exists. Review this PR as the net diff against #27093. ## Testing - The test-client target compiles successfully. - The combined reference branch exercised the manual smoke against the live remote plugin service. - CI is green across the required platform matrix. ## Split Overview ```text main ├── #27093 Debug analytics capture │ └── #27099 Non-mutating plugin smoke ← you are here │ └── #27100 Remote install/uninstall smoke └── #27102 Plugin telemetry metadata refactor After #27093, #27099, #27100, and #27102 merge: └── Final PR: add remote_plugin_id to plugin analytics ``` Review order and dependencies: 1. [#27093 Add debug-only analytics event capture](https://github.com/openai/codex/pull/27093) (based on `main`) 2. [#27099 Add a plugin analytics smoke workflow](https://github.com/openai/codex/pull/27099) **(this PR, stacked on #27093)** 3. [#27100 Add a remote plugin analytics mutation smoke workflow](https://github.com/openai/codex/pull/27100) (stacked on this PR) 4. [#27102 Centralize plugin telemetry metadata construction](https://github.com/openai/codex/pull/27102) (independent, based on `main`) 5. Final remote-ID behavior PR (created after PRs 1-4 merge) The original [#26281](https://github.com/openai/codex/pull/26281) remains open as the green aggregate reference until the final PR is published.
jameswt-oai ·
2026-06-16 10:11:41 -07:00 -
chore: side prompt (#28553)
Fix side bug with prompt
jif ·
2026-06-16 19:05:03 +02:00 -
[codex] exec-server: stream files in chunks (#28354)
## Why `fs/readFile` buffers the entire file in one response, which makes large remote reads expensive and prevents callers from applying backpressure. We need an opt-in streaming path with bounded block sizes while preserving the existing single-call API for small and sandboxed reads. ## What changed - Add `ExecServerClient::stream`, returning a named `FileReadStream` that implements `futures::Stream` and yields immutable 1 MiB byte blocks. - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs. `fs/readBlock` accepts an explicit offset and length. - Keep unsandboxed files open between block reads, cap open handles per connection, and clean them up on EOF, error, stream drop, explicit close, or connection shutdown. - Reject platform-sandboxed streaming opens instead of turning the one-shot sandbox helper into a persistent server. Existing `fs/readFile` behavior is unchanged. ## Testing - `just test -p codex-exec-server` - Integration coverage for 1 MiB chunking, exact block-boundary EOF, sandbox rejection, and continued reads from the opened file after path replacement. - Handle-manager coverage for non-sequential offsets, variable block lengths, the 128-handle limit, and capacity release after close.
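The offset-and-length block reads described for #28354 can be pictured with plain standard-library I/O. This is a minimal sketch, not the exec-server implementation: `read_block` and `block_lengths` are hypothetical helpers, and the test uses a tiny block size rather than 1 MiB.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper mirroring the shape of a block read:
/// return up to `len` bytes starting at `offset`; an empty block means EOF.
fn read_block<R: Read + Seek>(src: &mut R, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    src.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    let mut filled = 0;
    // `read` may return short counts, so loop until the block is full or EOF.
    while filled < len {
        let n = src.read(&mut buf[filled..])?;
        if n == 0 {
            break;
        }
        filled += n;
    }
    buf.truncate(filled);
    Ok(buf)
}

/// Read sequential blocks until an empty one arrives, recording each length.
fn block_lengths<R: Read + Seek>(src: &mut R, block_size: usize) -> io::Result<Vec<usize>> {
    let mut offset = 0u64;
    let mut lengths = Vec::new();
    loop {
        let block = read_block(src, offset, block_size)?;
        if block.is_empty() {
            break;
        }
        offset += block.len() as u64;
        lengths.push(block.len());
    }
    Ok(lengths)
}
```

Because the caller requests one block at a time, it decides when to ask for the next one, which is where backpressure comes from in this sketch.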
pakrym-oai ·
2026-06-16 09:50:55 -07:00 -
fix(tui): restore TUI after suspend (#28342)
## Why On Linux, suspending Codex with `Ctrl+Z` and returning with `fg` can leave the composer misaligned or inject terminal response bytes such as focus reports into the prompt. Shell job-control output moves the cursor while Codex is suspended, and terminal input polling can race with the responses used to restore the inline viewport. Fixes #26564. ## What changed - preserve and restore keyboard reporting without disturbing the parent terminal stack - pause terminal event polling while Codex is suspended and flush buffered input before resuming it - force crossterm's cached raw-mode state back in sync after the shell completes its `fg` handoff - probe the actual post-`fg` cursor position with the tolerant terminal-response parser, then realign the inline viewport before redrawing ## How to Test 1. On Linux, start the development TUI with `just c`. 2. Type text into the composer without submitting it. 3. Press `Ctrl+Z`, run any harmless shell command, then run `fg`. 4. Confirm the composer redraws below the shell output, the draft text is preserved, and no raw escape sequences appear. 5. Repeat the suspend/resume cycle and confirm normal typing still works. Targeted tests: - `cargo test -p codex-tui --lib parses_cursor_position_as_zero_based -j 1` - `cargo test -p codex-tui --lib tui::event_stream::tests -j 1`
Felipe Coury ·
2026-06-16 09:09:24 -07:00 -
path-uri: clarify invalid host path errors (#28473)
## Why Ensure a consistent string format when exposing path conversion errors to the model. ## What - Render `PathUriParseError::InvalidFileUriPath` as `'$PATH' is invalid on '$OS'`.
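As an illustration of the message format above, a `Display` implementation along these lines would produce it. The struct and its fields are hypothetical stand-ins; the real `PathUriParseError::InvalidFileUriPath` variant may carry different data.

```rust
use std::fmt;

/// Hypothetical stand-in for the error variant named in the PR.
#[derive(Debug)]
struct InvalidFileUriPath {
    path: String,
    os: &'static str,
}

impl fmt::Display for InvalidFileUriPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Renders as: '$PATH' is invalid on '$OS'
        write!(f, "'{}' is invalid on '{}'", self.path, self.os)
    }
}
```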
Adam Perry @ OpenAI ·
2026-06-16 09:03:44 -07:00 -
perf(config): defer remote sandbox hostname lookup (#28542)
## Why [#18763](https://github.com/openai/codex/pull/18763) added canonical hostname resolution for `remote_sandbox_config`. Requirements composition currently performs that synchronous DNS lookup on every fresh process, even when none of the loaded requirements layers contains `[[remote_sandbox_config]]`. On hosts with slow local DNS resolution, this can add several seconds to Codex startup. ## What - defer hostname resolution until a parsed requirements layer actually contains `remote_sandbox_config` - cache the resolver result once per requirements composition, preserving the existing single-lookup behavior across multiple layers - keep the existing FQDN resolution and per-layer requirements precedence unchanged - cover both the ordinary no-lookup path and the multi-layer single-lookup path ## How to Test On a host where local canonical-name resolution is slow: 1. Start Codex without `[[remote_sandbox_config]]` in any managed requirements layer and confirm startup no longer waits for hostname resolution. 2. Add a matching `[[remote_sandbox_config]]` entry and confirm its `allowed_sandbox_modes` still overrides the layer's top-level value. 3. Add remote sandbox entries to multiple requirements layers and confirm precedence remains unchanged while the hostname is resolved only once. Targeted tests: - `just test -p codex-config hostname_resolver` - `just test -p codex-config` (181 passed)
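The deferral-plus-cache idea can be sketched with `std::cell::OnceCell`. The names below (`LazyHostname`, `lookups_for`) and the hostname are illustrative assumptions, not the types used in `codex-config`.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical wrapper: runs `lookup` on first use only, then caches the result.
struct LazyHostname<F: Fn() -> String> {
    lookup: F,
    cached: OnceCell<String>,
}

impl<F: Fn() -> String> LazyHostname<F> {
    fn new(lookup: F) -> Self {
        Self { lookup, cached: OnceCell::new() }
    }

    fn get(&self) -> &str {
        self.cached.get_or_init(|| (self.lookup)())
    }
}

/// Count how many lookups one composition performs. Each entry in `layers`
/// says whether that layer contains `[[remote_sandbox_config]]`.
fn lookups_for(layers: &[bool]) -> usize {
    let calls = Cell::new(0usize);
    let host = LazyHostname::new(|| {
        calls.set(calls.get() + 1);
        "host.example.internal".to_string() // placeholder for the slow DNS lookup
    });
    for &has_remote_config in layers {
        if has_remote_config {
            let _ = host.get(); // only these layers need the hostname
        }
    }
    calls.get()
}
```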
Felipe Coury ·
2026-06-16 11:17:41 -04:00 -
core: surface terminal subagent errors to parent agents (#28375)
## Why When a subagent exhausts its retries, it emits an `Error`, but the generic task lifecycle then emits `TurnComplete(None)`. That completion used to overwrite the subagent's `Errored` status with `Completed(None)`, so the parent received an empty completion notification. This made a failed child look indistinguishable from a child that completed without an answer. In unattended or long-running multi-agent work, the root could silently continue without knowing that delegated work failed or how to restart it. ## Behavior Before, a terminal stream failure was reduced to an empty completion: ```text <subagent_notification> {"agent_path":"/root/worker","status":{"completed":null}} </subagent_notification> ``` Now the parent receives the actual terminal error, bounded to 1,000 tokens, together with an actionable recovery hint: ```text <subagent_notification> { "agent_path": "/root/worker", "status": { "errored": "stream disconnected before completion: stream closed before response.completed" }, "next_action": "This agent's turn failed. If you still need this agent, use `followup_task` to give it another task." } </subagent_notification> ``` The notification remains queue-only: it does not wake the root or replay the failed request. The root sees it at the next sampling boundary and can use `followup_task` to start a new turn for that agent. ## What changed - Added terminal-error precedence to the [agent status reducer](https://github.com/openai/codex/blob/e95fcfe2bb6a02f1a75650afa20048859f556511/codex-rs/core/src/agent/status.rs#L23-L34), so a closing `TurnComplete` cannot erase an immediately preceding `Errored` status. - Made MultiAgentV2 completion forwarding use the retained session status instead of re-deriving `Completed(None)` from the final event. 
- Extended the [subagent notification fragment](https://github.com/openai/codex/blob/e95fcfe2bb6a02f1a75650afa20048859f556511/codex-rs/core/src/context/subagent_notification.rs#L6-L60) with a `next_action` for terminal errors and a hard cap on model-visible error text. - Kept successful completions and interrupted turns unchanged. ## Verification - Added a status-reducer test proving that `Errored` survives the trailing `TurnComplete`. - Added an integration test that exhausts a subagent's stream retries and verifies the exact `agent_message` delivered to the parent, including the error and `followup_task` guidance. - Re-ran the existing successful-completion and interrupted-turn notification tests.
jif ·
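For #28375, the terminal-error precedence rule can be illustrated with a small reducer. This is a simplified sketch: the `AgentStatus` and `Event` enums here are hypothetical reductions of the real types, which carry more states and data.

```rust
#[derive(Debug, Clone, PartialEq)]
enum AgentStatus {
    Running,
    Completed(Option<String>),
    Errored(String),
}

#[derive(Debug, Clone)]
enum Event {
    Error(String),
    TurnComplete(Option<String>),
}

/// Fold one event into the current status.
fn reduce(current: AgentStatus, event: Event) -> AgentStatus {
    match (current, event) {
        // A closing TurnComplete must not erase a preceding terminal error.
        (AgentStatus::Errored(msg), Event::TurnComplete(_)) => AgentStatus::Errored(msg),
        (_, Event::Error(msg)) => AgentStatus::Errored(msg),
        (_, Event::TurnComplete(answer)) => AgentStatus::Completed(answer),
    }
}
```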
2026-06-16 14:34:54 +02:00 -
[codex] Clarify plugin load and runtime capability stages (#28472)
## Summary Plugin loading and auth projection both previously produced `PluginLoadOutcome`. That made an unfiltered load result look like runtime-ready capabilities and generated capability summaries before auth routing had run. This change keeps loaded plugin records in the cache, applies the current auth policy in `PluginsManager`, and only then builds `PluginLoadOutcome` and its summaries. Auth changes still reuse the cached disk load and re-resolve apps and MCP servers without reloading plugins. The updated tests cover cached auth changes and verify that capability summaries match the effective app/MCP surface. ## Testing - `just test -p codex-core-plugins` - `just test -p codex-plugin` - `just fix -p codex-core-plugins`
xl-openai ·
2026-06-16 12:57:21 +01:00 -
[tests] Keep Apps out of generic core test harness (#28508)
## Summary - disable the stable Apps feature in the generic `test_codex()` integration-test harness - keep Apps-specific tests explicit: their builders re-enable Apps and point it at a local mock server ## Why Generic tests that use dummy ChatGPT auth were also enabling the host-owned `codex_apps` MCP server. That made unrelated tests contact `chatgpt.com` and wait for MCP startup, causing the Bazel timeouts observed on #28368. The generic harness should be hermetic and should not start an external service that the test did not request. This is test-only; production Apps behavior is unchanged. The broader optional-MCP startup behavior is being handled separately in #28407. ## Testing - `just test -p codex-core -E 'test(pre_sampling_compact_runs_when_comp_hash_changes) | test(model_switch_to_smaller_model_updates_token_context_window) | test(codex_apps_file_params_upload_local_paths_before_mcp_tool_call)'` - `just fix -p codex-core` - `just fmt`
jif ·
2026-06-16 13:07:43 +02:00 -
feat: render typed envelopes for multi-agent v2 messages (#28368)
## Why Multi-agent v2 messages need a consistent, model-visible envelope that identifies what kind of interaction occurred, who sent it, and which agent it targets. Previously, encrypted deliveries exposed only `encrypted_content`, while child completion used the legacy `<subagent_notification>` shape. That meant the client could not consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the same format. This change adds the routing envelope as plaintext while keeping task and message payloads encrypted. No new Responses API field is required: an encrypted delivery is represented as an `input_text` header immediately followed by its existing `encrypted_content` item. Every envelope now follows this shape: ```text Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER> Task name: <recipient agent path> Sender: <author agent path> Payload: <message payload> ``` ## Message types ### `NEW_TASK` `NEW_TASK` is used when the recipient should begin a new turn, including an initial `spawn_agent` task and a later `followup_task`. For a root agent spawning `/root/worker`, the request contains a plaintext envelope followed by the encrypted task: ```json { "type": "agent_message", "author": "/root", "recipient": "/root/worker", "content": [ { "type": "input_text", "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n" }, { "type": "encrypted_content", "encrypted_content": "<encrypted task payload>" } ] } ``` Conceptually, the model receives: ```text Message Type: NEW_TASK Task name: /root/worker Sender: /root Payload: Review the authentication changes and report any regressions. ``` ### `MESSAGE` `MESSAGE` is used for a queued `send_message` delivery. It communicates with an existing agent without starting a new turn. 
For `/root/worker` reporting progress to the root agent, the request contains: ```json { "type": "agent_message", "author": "/root/worker", "recipient": "/root", "content": [ { "type": "input_text", "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n" }, { "type": "encrypted_content", "encrypted_content": "<encrypted message payload>" } ] } ``` Conceptually, the model receives: ```text Message Type: MESSAGE Task name: /root Sender: /root/worker Payload: The protocol tests pass; I am checking the resume path now. ``` ### `FINAL_ANSWER` `FINAL_ANSWER` is emitted when a child agent reaches a terminal state and reports its result to its parent. Completion payloads are already available locally, so the complete envelope is represented as plaintext rather than as a plaintext header plus encrypted content. For `/root/worker` completing work for the root agent, the request contains: ```json { "type": "agent_message", "author": "/root/worker", "recipient": "/root", "content": [ { "type": "input_text", "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found." } ] } ``` The model-visible form is: ```text Message Type: FINAL_ANSWER Task name: /root Sender: /root/worker Payload: No regressions found. ``` Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a terminal-status description in the payload. ## What changed - Render `NEW_TASK` or `MESSAGE` in `InterAgentCommunication::to_model_input_item`, based on whether the encrypted delivery starts a turn. - Replace the multi-agent v2 `<subagent_notification>` completion payload with a model-visible `FINAL_ANSWER` envelope. - Document `Task name`, `Sender`, and `Payload` consistently in the multi-agent developer instructions. - Prevent local-only history projections from treating an encrypted message's plaintext header as the complete assistant message. 
- Preserve rollout-trace interaction edges when an agent message contains both plaintext and encrypted content. Legacy multi-agent behavior remains unchanged. ## Verification - `just test -p codex-protocol` - `just test -p codex-rollout-trace` - `just test -p codex-web-search-extension` - `just test -p codex-core encrypted_multi_agent_v2_spawn_sends_agent_message_to_child` - `just test -p codex-core plaintext_multi_agent_v2_completion_sends_agent_message` - `just test -p codex-core multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn` - `just test -p codex-core multi_agent_v2_completion_queues_message_for_direct_parent`
jif ·
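For #28368, the plaintext header shown in the JSON examples could be produced by a function like the one below. `MessageType` and `render_header` are hypothetical names for illustration; the PR itself renders the envelope in `InterAgentCommunication::to_model_input_item`.

```rust
#[derive(Debug, Clone, Copy)]
enum MessageType {
    NewTask,
    Message,
    FinalAnswer,
}

impl MessageType {
    fn as_str(self) -> &'static str {
        match self {
            MessageType::NewTask => "NEW_TASK",
            MessageType::Message => "MESSAGE",
            MessageType::FinalAnswer => "FINAL_ANSWER",
        }
    }
}

/// Build the plaintext routing header. For encrypted deliveries the payload
/// follows as a separate `encrypted_content` item; for `FINAL_ANSWER` the
/// plaintext payload is appended directly after the header.
fn render_header(kind: MessageType, recipient: &str, sender: &str) -> String {
    format!(
        "Message Type: {}\nTask name: {}\nSender: {}\nPayload:\n",
        kind.as_str(),
        recipient,
        sender
    )
}
```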
2026-06-16 11:46:59 +02:00 -
[codex] Compress cold active rollouts (#28338)
## Why The local rollout compression worker currently scans only `archived_sessions`, so cold unarchived thread history remains expanded indefinitely. ## What changed - Scan `sessions` after `archived_sessions` within the existing worker runtime budget. - Update rollout compression coverage to require both cold active and archived rollouts to be compressed while fresh active rollouts remain plain. The worker remains behind the disabled-by-default `local_thread_store_compression` feature, and the existing seven-day cold-file threshold is unchanged. ## Validation - `just test -p codex-rollout` (69 passed) - `just fmt` - `git diff --check`
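A rough sketch of the scan order and the seven-day cold threshold follows. It assumes coldness is judged from a file's last-modified time, which the description does not state; `SCAN_ORDER` and `is_cold` are hypothetical names.

```rust
use std::time::{Duration, SystemTime};

/// Directories visited in order, within one worker budget.
const SCAN_ORDER: [&str; 2] = ["archived_sessions", "sessions"];

/// Seven-day threshold from the PR description.
const COLD_AFTER: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Assumption: a rollout is "cold" when it was last modified at least
/// seven days ago. Timestamps in the future are treated as fresh.
fn is_cold(modified: SystemTime, now: SystemTime) -> bool {
    now.duration_since(modified)
        .map(|age| age >= COLD_AFTER)
        .unwrap_or(false)
}
```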
jif ·
2026-06-16 10:52:21 +02:00 -
[codex] expose Bedrock credential source in account/read (#27751)
## Why `account/read` currently reports only `type: "amazonBedrock"`, so clients cannot distinguish a Codex-managed Bedrock API key from credentials supplied by AWS. The app UI needs that distinction to render the appropriate account state without duplicating provider-auth logic. Credential-source selection belongs to the Bedrock model provider because it already owns the precedence between managed Bedrock auth and the external AWS credential path. This builds on #27443 and #27689. ## What changed - Added `AmazonBedrockCredentialSource` with `codexManaged` and `awsManaged` values. - Included the selected credential source in `ProviderAccount::AmazonBedrock` and the app-server `Account` response. - Made `AmazonBedrockModelProvider::account_state()` classify the source from its managed-auth state. - Regenerated the app-server JSON and TypeScript schemas. - Updated app-server account documentation and downstream TUI matches. `codexManaged` means the provider found a managed Bedrock API key. `awsManaged` identifies the provider's external AWS credential path; it does not assert that the AWS credential chain has been validated. ## Testing - Added model-provider coverage for Codex-managed precedence and AWS-managed fallback. - Added app-server protocol serialization coverage for both wire values. - Added app-server integration coverage for both `account/read` responses. - `just test -p codex-protocol -p codex-model-provider -p codex-app-server-protocol` (497 tests passed). After rebasing onto #27711, the `codex-app-server` test target compiled past the image-generation `PathUri` migration. Local linking was then interrupted by disk exhaustion (`No space left on device`).
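The classification rule can be sketched as follows. The enum name and wire values come from the description above; the `classify` function and its `Option<&str>` input are simplifying assumptions about how the provider's managed-auth state is represented.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AmazonBedrockCredentialSource {
    CodexManaged,
    AwsManaged,
}

impl AmazonBedrockCredentialSource {
    /// Wire values as listed in the PR description.
    fn wire_value(self) -> &'static str {
        match self {
            AmazonBedrockCredentialSource::CodexManaged => "codexManaged",
            AmazonBedrockCredentialSource::AwsManaged => "awsManaged",
        }
    }
}

/// Hypothetical classifier: a managed API key takes precedence; otherwise the
/// external AWS credential path is reported, without validating that chain.
fn classify(managed_api_key: Option<&str>) -> AmazonBedrockCredentialSource {
    match managed_api_key {
        Some(_) => AmazonBedrockCredentialSource::CodexManaged,
        None => AmazonBedrockCredentialSource::AwsManaged,
    }
}
```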
Celia Chen ·
2026-06-16 07:14:53 +00:00 -
[codex] Record external agent import results (#28396)
## Summary - restore `externalAgentConfig/import/progress` notifications while keeping `externalAgentConfig/import/completed` as the must-deliver event - persist completed external-agent config imports in state DB by `importId`, including concrete success/failure details for config, AGENTS.md, skills, plugins, MCP servers, subagents, hooks, commands, and sessions - add `externalAgentConfig/import/readHistories` so clients can recover persisted import results after missing the live completion notification - include `errorType` on import failures in protocol responses/notifications and persisted DB JSON so future code can classify failures without another wire/storage shape change ## Validation - `git diff --check` - `just test -p codex-state external_agent_config_imports` - `just test -p codex-app-server-protocol` - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-sqlite-read-details just test -p codex-app-server external_agent_config_import_sends_completion_notification_for_sync_only_import` Also ran earlier broader checks before publishing: - `just test -p codex-state` - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-external-agent-test-sqlite just test -p codex-app-server external_agent_config` - `just test -p codex-external-agent-migration`
charlesgong-openai ·
2026-06-15 23:17:24 -07:00 -
[codex] Use local environment for user shell commands (#28163)
## Why User shell commands still read the legacy turn cwd and session shell even though execution context is now owned by selected turn environments. App-server also defines `thread/shellCommand` as a local-host escape hatch, so it must use an available local environment even when a remote environment is primary. ## What changed - Add `ResolvedTurnEnvironments::local()` to find the selected local environment. - Resolve the user shell command cwd and shell from that local `TurnEnvironment`. - Emit the standard `shell is unavailable in this session` error when no selected local environment or resolved local shell is available. - Add an integration test covering `/shell` without a local environment. ## Test plan - `just test -p codex-core user_shell_command_without_local_environment_emits_error`
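The lookup described above might look roughly like this. The `TurnEnvironment` fields are invented for the sketch; only the error string and the idea of picking the selected local environment come from the description.

```rust
/// Hypothetical, reduced view of a resolved turn environment.
#[derive(Debug, Clone, PartialEq)]
struct TurnEnvironment {
    id: String,
    is_local: bool,
    shell: Option<String>,
}

/// Find the selected local environment, if any.
fn local(envs: &[TurnEnvironment]) -> Option<&TurnEnvironment> {
    envs.iter().find(|env| env.is_local)
}

/// Resolve the shell for a user shell command, or report the standard error
/// when there is no local environment or it has no resolved shell.
fn user_shell(envs: &[TurnEnvironment]) -> Result<&str, &'static str> {
    local(envs)
        .and_then(|env| env.shell.as_deref())
        .ok_or("shell is unavailable in this session")
}
```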
pakrym-oai ·
2026-06-16 04:55:20 +00:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
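To make the before/after concrete, here is an illustrative pair of test helpers. The function names are made up for this sketch; the crate-root allowance mentioned above would be the attribute `#![allow(clippy::expect_used)]`.

```rust
/// Before: the verbose form that the lint configuration used to encourage.
fn parse_port_verbose(raw: &str) -> u16 {
    raw.parse::<u16>()
        .unwrap_or_else(|err| panic!("port should parse: {err}"))
}

/// After: the same behavior with `expect`, permitted once the integration-test
/// crate root allows `clippy::expect_used`.
fn parse_port(raw: &str) -> u16 {
    raw.parse::<u16>().expect("port should parse")
}
```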
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
[codex] Add interruptible sleep tool (#28429)
## Why Models sometimes need to pause briefly while waiting for external work, but using a shell command for that delay ties the wait to a process and does not naturally resume when new turn input arrives. ## What changed - add a built-in `sleep` tool behind the under-development `sleep_tool` feature - accept a bounded `duration_ms` argument, matching the millisecond convention used by unified exec - end the sleep early when either steered user input or mailbox input arrives - include elapsed wall-clock time in completed and interrupted outputs - emit a dedicated core `SleepItem` through `item/started` and `item/completed` - expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain it in reconstructed thread history - regenerate the configuration schema for the new feature flag - regenerate app-server JSON and TypeScript schema fixtures ## Test plan - `just test -p codex-core sleep_tool_follows_feature_gate` - `just test -p codex-core any_new_input_interrupts_sleep` - `just test -p codex-app-server-protocol` - `just test -p codex-app-server sleep_emits_started_and_completed_items`
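The interrupt-on-input behavior can be approximated with a standard-library channel. This is a synchronous sketch under assumptions: the real tool is presumably async, and `interruptible_sleep`, `SleepOutcome`, and the `max_ms` bound are hypothetical names.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum SleepOutcome {
    Completed,
    Interrupted,
}

/// Sleep for at most `duration_ms` (clamped to `max_ms`), waking early if any
/// input arrives on `input`. Returns the outcome and elapsed wall-clock time.
fn interruptible_sleep(duration_ms: u64, max_ms: u64, input: &Receiver<()>) -> (SleepOutcome, Duration) {
    let started = Instant::now();
    let wait = Duration::from_millis(duration_ms.min(max_ms));
    let outcome = match input.recv_timeout(wait) {
        Ok(()) => SleepOutcome::Interrupted,
        Err(RecvTimeoutError::Timeout) => SleepOutcome::Completed,
        Err(RecvTimeoutError::Disconnected) => {
            // No sender can interrupt any more; finish the remaining wait.
            std::thread::sleep(wait.saturating_sub(started.elapsed()));
            SleepOutcome::Completed
        }
    };
    (outcome, started.elapsed())
}
```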
pakrym-oai ·
2026-06-15 21:39:21 -07:00 -
[codex] Bind shell snapshots to retained thread environments (#28421)
## Why Shell snapshots are currently session-scoped even though shell and cwd are properties of a selected turn environment. That makes snapshot refresh depend on separate session-cwd plumbing, prevents retained environments from retaining their snapshot work, and can make snapshot construction use a different shell than command execution. This follows #27955 by making the retained thread-environment service own environment snapshot lifecycles. Session configuration remains the requested selection state, while `ThreadEnvironments` remains the source of successfully resolved environments. ## What changed - Configure the shell-snapshot builder before initial environment resolution. - Start each local environment snapshot task when its `TurnEnvironment` is built and retain that shared task while environment ID and cwd still match. - Inherit retained environment snapshots into spawned child threads. - Carry the selected `TurnEnvironment` through shell runtimes so snapshot construction and command execution use the same environment-specific shell and cwd. - Load project instructions and warm plugins/skills after initial environment resolution. - Continue decoding invalid UTF-8 instruction files lossily without emitting a startup warning. - Keep requested selections in `SessionConfiguration`; failed or duplicate resolutions only affect the resolved environment snapshot. ## Validation - `cargo check -p codex-core --tests` - `just test -p codex-home instructions` (6 passed) - Focused environment, instruction, shell-snapshot, and user-shell tests (84 passed) - Focused shell-snapshot, user-shell, and unified-exec tests (126 passed; two event-timing tests passed on retry)
pakrym-oai ·
2026-06-15 20:10:53 -07:00 -
Use ApiPathString in app-server filesystem permission paths (#28367)
## Why Clients running an app-server on one OS and an exec-server on another OS need to be able to pass sandbox config to app-server that refers to resources on the executor's foreign OS. ## What `AbsolutePathBuf` can't represent these paths and we don't want users to be exposed to `PathUri` yet, so this moves the public app-server API to be expressed in terms of `ApiPathString`. Stacked on #28165. - change app-server v2 filesystem permission paths, including legacy read/write roots, to `ApiPathString` - localize API paths through `PathUri` when converting into the current native core permission types - make path-bearing permission conversions fallible and surface localization failures instead of silently treating malformed grants as ordinary denials - propagate conversion failures through app-server and TUI approval handling - regenerate the app-server JSON and TypeScript schemas - leave migration TODOs on native-path conversions so they can be removed once core permission paths use `PathUri`
Adam Perry @ OpenAI ·
2026-06-15 19:25:54 -07:00 -
[codex] Make plugin details capability aware (#27958)
## Summary Makes plugin details/read flows capability-aware so auth-filtered plugin surfaces report the same usable app/MCP/skill shape as the marketplace and install flows. ## Validation Not run; this change was rebased onto the current plugin auth stack and pushed as a draft PR. **Manual test** 1. Set up a local marketplace with a plugin that has both app and MCP declarations ``` // .app.json { "apps": { "linear": { "id": "some_id" } } } ``` ``` // .mcp.json { "mcpServers": { "linear": { "type": "http", "url": "https://mcp.linear.app/mcp", "oauth_resource": "https://mcp.linear.app/mcp" }, "linear2": { "type": "http", "url": "https://mcp.linear2.app/mcp", "oauth_resource": "https://mcp.linear2.app/mcp" } } } ``` 2a. **Log in with an API key** and observe that the plugin details page shows no apps. (Note: we don't show "app not available due to API key login", because there is no way to distinguish "no apps" from "an app exists without a substitute MCP" without significantly more code changes. I've separated this into a follow-up if we want that behaviour.) <img width="1170" height="279" alt="Screenshot 2026-06-15 at 23 45 40" src="https://github.com/user-attachments/assets/d36cb160-fbec-461e-9643-9c761dbae7bb" /> <img width="975" height="640" alt="Screenshot 2026-06-15 at 18 40 30" src="https://github.com/user-attachments/assets/90ec0bc8-7506-4b90-bbd3-070720de799e" /> 2b. **Log in with chat** and observe the intended conflict resolution logic <img width="1165" height="224" alt="Screenshot 2026-06-15 at 17 17 30" src="https://github.com/user-attachments/assets/80adfbf2-7dac-4f08-8b76-8eeeab6c95e7" /> <img width="968" height="567" alt="Screenshot 2026-06-15 at 18 38 59" src="https://github.com/user-attachments/assets/9ea92c5e-535b-4aa4-8ad0-ee513b57bc3c" />
felixxia-oai ·
2026-06-16 01:25:22 +00:00 -
[codex] Load API curated marketplace by auth (#28383)
## Summary - choose the local OpenAI curated marketplace manifest based on auth: Codex backend auth gets the existing marketplace, direct provider auth gets `api_marketplace.json` - include Bedrock API key auth in the direct-provider API marketplace path - safely skip the API marketplace when `api_marketplace.json` is absent ## Validation - `just fmt` - `git diff --check origin/main...HEAD` - CI should run the full validation ## Manual Testing ### - New API marketplace not available for API key sign-in 1. Confirm that nothing from the API marketplace is displayed <img width="1161" height="289" alt="Screenshot 2026-06-15 at 21 37 43" src="https://github.com/user-attachments/assets/a5f16642-8a20-4ac1-a0de-1274a4c7b5b2" /> ### - New API marketplace for API key sign-in 1. Set up `api_marketplace.json` ``` { "name": "openai-curated", "interface": { "displayName": "Codex official" }, "plugins": [ { "name": "linear", "source": { "source": "local", "path": "./plugins/linear" }, "policy": { "installation": "AVAILABLE", "authentication": "ON_INSTALL" }, "category": "Productivity" } ] } ``` 2. Log in with an API key and observe that only the plugin defined in `api_marketplace.json` is available from "Codex Official" (outside of local testing marketplaces) <img width="1167" height="446" alt="Screenshot 2026-06-15 at 21 16 53" src="https://github.com/user-attachments/assets/7cf61477-d826-4ef6-bc05-0a23ac1c0259" /> Also checked functionality in the Codex app ### - SiWC users Still use the default `marketplace.json` and see all plugins <img width="1171" height="502" alt="Screenshot 2026-06-15 at 21 40 25" src="https://github.com/user-attachments/assets/d212ea9b-0aa5-470b-8ea4-450efe65bb2b" /> Also checked functionality in the Codex app ## Notes - `just test -p codex-core-plugins` was started locally before splitting branches, but I stopped relying on local tests per follow-up and left final validation to PR CI.
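The manifest selection in #28383 can be sketched as a small pure function. `AuthKind` and `curated_manifest` are hypothetical names, and the existence check is passed in as a closure so the sketch does not touch the filesystem. Skipping an absent default manifest the same way is an assumption of this sketch; the description only covers the absent `api_marketplace.json` case.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy)]
enum AuthKind {
    /// Codex backend auth: keeps the existing marketplace.
    CodexBackend,
    /// Direct provider auth, including Bedrock API key auth.
    DirectProvider,
}

/// Pick the curated manifest for the given auth kind, or `None` if it is absent.
fn curated_manifest(root: &Path, auth: AuthKind, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    let name = match auth {
        AuthKind::CodexBackend => "marketplace.json",
        AuthKind::DirectProvider => "api_marketplace.json",
    };
    let path = root.join(name);
    // An absent manifest is skipped rather than treated as an error.
    exists(&path).then_some(path)
}
```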
felixxia-oai ·
2026-06-16 01:16:11 +00:00 -
exec-server: default remote transport to Noise (#26245)
## Why The transport in [openai/codex#26242](https://github.com/openai/codex/pull/26242) needs to be used by every remote orchestrator-to-executor connection before JSON-RPC traffic starts. ## Changes - Generates one executor Noise identity when remote exec-server starts and registers its public key. - Creates a harness identity for each physical remote environment connection. - Fetches a fresh registry bundle before connecting and validates the authenticated harness key before completing the executor handshake. - Multiplexes encrypted logical streams over the existing executor WebSocket. - Adds bounded stream, handshake-failure, and reassembly state. - Adds safe lifecycle diagnostics without logging keys, authorizations, plaintext, or ciphertext. - Covers reconnects, replay rejection, validation failure, framing limits, and encrypted JSON-RPC tool traffic. ## Stack 1. [openai/codex#26242](https://github.com/openai/codex/pull/26242): Noise channel and relay transport 2. **[openai/codex#26245](https://github.com/openai/codex/pull/26245)**: remote registration and runtime activation ## Verification - `just test -p codex-exec-server` - `just fix -p codex-exec-server` - `just bazel-lock-check` - `cargo shear` --------- Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-06-15 17:39:00 -07:00 -
Run core integration tests against a Wine-backed Windows executor (#28401)
## Why We want to exercise a Linux app-server against a Windows exec-server without repeating every test case. The remote Docker test setup is a partial precedent for this approach. ## What Run the shared `codex-core` integration suite against Windows exec-server behavior from Linux. This makes cross-OS path and shell regressions visible while keeping unsupported cases owned by individual tests. - Add `local`, `docker`, and `wine-exec` test environment selection with legacy Docker compatibility. - Extend `codex_rust_crate` to generate a sharded Wine-exec variant using a cross-built Windows server and pinned Bazel Wine/PowerShell runtimes. - Teach remote-aware helpers about Windows paths and track temporary incompatibilities with source-local `skip_if_wine_exec!` calls and follow-up reasons.
Adam Perry @ OpenAI ·
2026-06-16 00:38:41 +00:00 -
Preserve hook trust bypass in codex exec threads (#26434)
Addresses #26383 and #26452 ## Summary `codex exec --dangerously-bypass-hook-trust` printed the bypass warning, but valid untrusted hooks still did not run. Exec applied the flag to its initial config, then lost it when app-server reloaded config for the new or resumed thread. ## Fix Forward `bypass_hook_trust: true` through the existing thread request config override for both start and resume. The override is omitted when the flag is not enabled, preserving normal trust behavior. ## Testing Added: - A test confirming start and resume preserve the override. - An end-to-end exec test confirming a `SessionStart` hook runs and creates a marker file.
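The "forward only when enabled" rule can be illustrated with a tiny helper. The map-based override shape and the function name are assumptions for this sketch; the actual thread request config override type is not shown in the description.

```rust
use std::collections::BTreeMap;

/// Build config overrides for a thread start or resume request.
/// The key is present only when the bypass flag is enabled, so normal
/// trust behavior is untouched otherwise.
fn thread_config_overrides(bypass_hook_trust: bool) -> BTreeMap<String, bool> {
    let mut overrides = BTreeMap::new();
    if bypass_hook_trust {
        overrides.insert("bypass_hook_trust".to_string(), true);
    }
    overrides
}
```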
Abhinav ·
2026-06-15 17:36:21 -07:00