mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
19 Commits
-
Use ApiPathString in app-server filesystem permission paths (#28367)
## Why Clients running an app-server on one OS and an exec-server on another OS need to be able to pass sandbox config to app-server that refers to resources on the executor's foreign OS. ## What `AbsolutePathBuf` can't represent these paths and we don't want users to be exposed to `PathUri` yet, so this moves the public app-server API to be expressed in terms of `ApiPathString`. Stacked on #28165. - change app-server v2 filesystem permission paths, including legacy read/write roots, to `ApiPathString` - localize API paths through `PathUri` when converting into the current native core permission types - make path-bearing permission conversions fallible and surface localization failures instead of silently treating malformed grants as ordinary denials - propagate conversion failures through app-server and TUI approval handling - regenerate the app-server JSON and TypeScript schemas - leave migration TODOs on native-path conversions so they can be removed once core permission paths use `PathUri`
Adam Perry @ OpenAI ·
2026-06-15 19:25:54 -07:00 -
Disable empty Cargo test targets (#21584)
## Summary `cargo test` has entails both running standard Rust tests and doctests. It turns out that the doctest discovery is fairly slow, and it's a cost you pay even for crates that don't include any doctests. This PR disables doctests with `doctest = false` for crates that lack any doctests. For the collection of crates below, this speeds up test execution by >4x. E.g., before this PR: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s] Range (min … max): 0.418 s … 14.529 s 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms] Range (min … max): 418.0 ms … 436.8 ms 10 runs ``` For a single crate, with >2x speedup, before: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms] Range (min … max): 480.9 ms … 512.0 ms 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms] Range (min … max): 206.8 ms … 221.0 ms 13 runs ``` Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-07 15:44:17 -07:00 -
Refactor config loading to use filesystem abstraction (#18209)
Initial pass propagating FileSystem through config loading.
pakrym-oai ·
2026-04-17 00:51:21 +00:00 -
fix(guardian): fix ordering of guardian events (#16462)
Guardian events were emitted a bit out of order for CommandExecution items. This would make it hard for the frontend to render a guardian auto-review, which has this payload: ``` pub struct ItemGuardianApprovalReviewStartedNotification { pub thread_id: String, pub turn_id: String, pub target_item_id: String, pub review: GuardianApprovalReview, // FYI this is no longer a json blob pub action: Option<JsonValue>, } ``` There is a `target_item_id` the auto-approval review is referring to, but the actual item had not been emitted yet. Before this PR: - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed`, and if approved... - `item/started` - `item/completed` After this PR: - `item/started` - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed` - `item/completed` This lines up much better with existing patterns (i.e. human review in `Default mode`, where app-server would send a server request to prompt for user approval after `item/started`), and makes it easier for clients to render what guardian is actually reviewing. We do this following a similar pattern as `FileChange` (aka apply patch) items, where we create a FileChange item and emit `item/started` if we see the apply patch approval request, before the actual apply patch call runs.Owen Lin ·
2026-04-06 19:14:27 +00:00 -
Move git utilities into a dedicated crate (#15564)
- create `codex-git-utils` and move the shared git helpers into it with file moves preserved for diff readability - move the `GitInfo` helpers out of `core` so stacked rollout work can depend on the shared crate without carrying its own git info module --------- Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-03-24 13:26:23 -07:00 -
feat(app-server): support mcp elicitations in v2 api (#13425)
This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. - Client responds to that request with `{ action: "accept" | "decline" | "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.Owen Lin ·
2026-03-05 07:20:20 -08:00 -
app-server service tier plumbing (plus some cleanup) (#13334)
followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None | "fast", so "standard" is not needed.
pash-openai ·
2026-03-03 02:35:09 -08:00 -
app-server: improve thread resume rejoin flow (#11776)
thread/resume response includes latest turn with all items, in band so no events are stale or lost Testing - e2e tested using app-server-test-client using flow described in "Testing Thread Rejoin Behavior" in codex-rs/app-server-test-client/README.md - e2e tested in codex desktop by reconnecting to a running turn
Max Johnson ·
2026-02-20 05:29:05 +00:00 -
feat(app-server): experimental flag to persist extended history (#11227)
This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_full_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For **command executions** in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. And we also persist `EventMsg::Error` which we can now map back to the Turn's status and populates the Turn's error metadata. #### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.
Owen Lin ·
2026-02-12 19:34:22 +00:00 -
feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`: ``` - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, }; + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, }; ``` so: - `annotations?: ToolAnnotations` ➡️ `JsonValue` - `inputSchema: ToolInputSchema` ➡️ `JsonValue` - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue` and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349). * #10357 * __->__ #10349 * #10356
Michael Bolin ·
2026-02-02 17:41:55 -08:00 -
feat: experimental flags (#10231)
## Problem being solved - We need a single, reliable way to mark app-server API surface as experimental so that: 1. the runtime can reject experimental usage unless the client opts in 2. generated TS/JSON schemas can exclude experimental methods/fields for stable clients. Right now that’s easy to drift or miss when done ad-hoc. ## How to declare experimental methods and fields - **Experimental method**: add `#[experimental("method/name")]` to the `ClientRequest` variant in `client_request_definitions!`. - **Experimental field**: on the params struct, derive `ExperimentalApi` and annotate the field with `#[experimental("method/name.field")]` + set `inspect_params: true` for the method variant so `ClientRequest::experimental_reason()` inspects params for experimental fields. ## How the macro solves it - The new derive macro lives in `codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via `#[derive(ExperimentalApi)]` plus `#[experimental("reason")]` attributes. - **Structs**: - Generates `ExperimentalApi::experimental_reason(&self)` that checks only annotated fields. - The “presence” check is type-aware: - `Option<T>`: `is_some_and(...)` recursively checks inner. - `Vec`/`HashMap`/`BTreeMap`: must be non-empty. - `bool`: must be `true`. - Other types: considered present (returns `true`). - Registers each experimental field in an `inventory` with `(type_name, serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for that type. Field names are converted from `snake_case` to `camelCase` for schema/TS filtering. - **Enums**: - Generates an exhaustive `match` returning `Some(reason)` for annotated variants and `None` otherwise (no wildcard arm). - **Wiring**: - Runtime gating uses `ExperimentalApi::experimental_reason()` in `codex-rs/app-server/src/message_processor.rs` to reject requests unless `InitializeParams.capabilities.experimental_api == true`. - Schema/TS export filters use the inventory list and `EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to strip experimental methods/fields when `experimental_api` is false.jif-oai ·
2026-02-02 11:06:50 +00:00 -
feat: vendor app-server protocol schema fixtures (#10371)
Similar to what @sayan-oai did in openai/codex#8956 for `config.schema.json`, this PR updates the repo so that it includes the output of `codex app-server generate-json-schema` and `codex app-server generate-ts` and adds a test to verify it is in sync with the current code. Motivation: - This makes any schema changes introduced by a PR transparent during code review. - In particular, this should help us catch PRs that would introduce a non-backwards-compatible change to the app schema (eventually, this should also be enforced by tooling). - Once https://github.com/openai/codex/pull/10231 is in to formalize the notion of "experimental" fields, we can work on ensuring the non-experimental bits are backwards-compatible. `codex-rs/app-server-protocol/tests/schema_fixtures.rs` was added as the test and `just write-app-server-schema` can be use to generate the vendored schema files. Incidentally, when I run: ``` rg _ codex-rs/app-server-protocol/schema/typescript/v2 ``` I see a number of `snake_case` names that should be `camelCase`.
Michael Bolin ·
2026-02-01 23:38:43 -08:00 -
fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
Changes the `writable_roots` field of the `WorkspaceWrite` variant of the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`. This is helpful because now callers can be sure the value is an absolute path rather than a relative one. (Though when using an absolute path in a Seatbelt config policy, we still have to _canonicalize_ it first.) Because `writable_roots` can be read from a config file, it is important that we are able to resolve relative paths properly using the parent folder of the config file as the base path.
Michael Bolin ·
2025-12-12 15:25:22 -08:00 -
chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR - document cargo-deny allowlist with workspace-dep notes and advisory ignores - align workspace crates to inherit version/edition/license for consistent checks
Josh McKinney ·
2025-11-24 12:22:18 -08:00 -
[app-server & core] introduce new codex error code and v2 app-server error events (#6938)
This PR does two things: 1. populate a new `codex_error_code` protocol in error events sent from core to client; 2. old v1 core events `codex/event/stream_error` and `codex/event/error` will now both become `error`. We also show codex error code for turncompleted -> error status. new events in app server test: ``` < { < "method": "codex/event/stream_error", < "params": { < "conversationId": "019aa34c-0c14-70e0-9706-98520a760d67", < "id": "0", < "msg": { < "codex_error_code": { < "response_stream_disconnected": { < "http_status_code": 401 < } < }, < "message": "Reconnecting... 2/5", < "type": "stream_error" < } < } < } { < "method": "error", < "params": { < "error": { < "codexErrorCode": { < "responseStreamDisconnected": { < "httpStatusCode": 401 < } < }, < "message": "Reconnecting... 2/5" < } < } < } < { < "method": "turn/completed", < "params": { < "turn": { < "error": { < "codexErrorCode": { < "responseTooManyFailedAttempts": { < "httpStatusCode": 401 < } < }, < "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a1b495a1a97ed3e-SJC" < }, < "id": "0", < "items": [], < "status": "failed" < } < } < } ```Celia Chen ·
2025-11-20 23:06:55 +00:00 -
[app-server] feat: add v2 command execution approval flow (#6758)
This PR adds the API V2 version of the command‑execution approval flow for the shell tool. This PR wires the new RPC (`item/commandExecution/requestApproval`, V2 only) and related events (`item/started`, `item/completed`, and `item/commandExecution/delta`, which are emitted in both V1 and V2) through the app-server protocol. The new approval RPC is only sent when the user initiates a turn with the new `turn/start` API so we don't break backwards compatibility with VSCE. The approach I took was to make as few changes to the Codex core as possible, leveraging existing `EventMsg` core events, and translating those in app-server. I did have to add additional fields to `EventMsg::ExecCommandEndEvent` to capture the command's input so that app-server can statelessly transform these events to a `ThreadItem::CommandExecution` item for the `item/completed` event. Once we stabilize the API and it's complete enough for our partners, we can work on migrating the core to be aware of command execution items as a first-class concept. **Note**: We'll need followup work to make sure these APIs work for the unified exec tool, but will wait til that's stable and landed before doing a pass on app-server. Example payloads below: ``` { "method": "item/started", "params": { "item": { "aggregatedOutput": null, "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'", "cwd": "/Users/owen/repos/codex/codex-rs", "durationMs": null, "exitCode": null, "id": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "status": "inProgress", "type": "commandExecution" } } } ``` ``` { "id": 0, "method": "item/commandExecution/requestApproval", "params": { "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "reason": "Need to create file in /tmp which is outside workspace sandbox", "risk": null, "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a", "turnId": "1" } } ``` ``` { "id": 0, "result": { "acceptSettings": { "forSession": false }, "decision": "accept" } } ``` ``` { "params": { "item": { "aggregatedOutput": null, "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'", "cwd": "/Users/owen/repos/codex/codex-rs", "durationMs": 224, "exitCode": 0, "id": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "status": "completed", "type": "commandExecution" } } } ```Owen Lin ·
2025-11-18 00:23:54 +00:00 -
[app-server] feat: export.rs supports a v2 namespace, initial v2 notifications (#6212)
**Typescript and JSON schema exports** While working on Thread/Turn/Items type definitions, I realize we will run into name conflicts between v1 and v2 APIs (e.g. `RateLimitWindow` which won't be reusable since v1 uses `RateLimitWindow` from `protocol/` which uses snake_case, but we want to expose camelCase everywhere, so we'll define a V2 version of that struct that serializes as camelCase). To set us up for a clean and isolated v2 API, generate types into a `v2/` namespace for both typescript and JSON schema. - TypeScript: v2 types emit under `out_dir/v2/*.ts`, and root index.ts now re-exports them via `export * as v2 from "./v2"`;. - JSON Schemas: v2 definitions bundle under `#/definitions/v2/*` rather than the root. The location for the original types (v1 and types pulled from `protocol/` and other core crates) haven't changed and are still at the root. This is for backwards compatibility: no breaking changes to existing usages of v1 APIs and types. **Notifications** While working on export.rs, I: - refactored server/client notifications with macros (like we already do for methods) so they also get exported (I noticed they weren't being exported at all). - removed the hardcoded list of types to export as JSON schema by leveraging the existing macros instead - and took a stab at API V2 notifications. These aren't wired up yet, and I expect to iterate on these this week.
Owen Lin ·
2025-11-05 01:02:39 +00:00 -
Generate JSON schema for app-server protocol (#5063)
Add annotations and an export script that let us generate app-server protocol types as typescript and JSONSchema. The script itself is a bit hacky because we need to manually label some of the types. Unfortunately it seems that enum variants don't get good names by default and end up with something like `EventMsg1`, `EventMsg2`, etc. I'm not an expert in this by any means, but since this is only run manually and we already need to enumerate the types required to describe the protocol, it didn't seem that much worse. An ideal solution here would be to have some kind of root that we could generate schemas for in one go, but I'm not sure if that's compatible with how we generate the protocol today.
Rasmus Rygaard ·
2025-10-20 11:45:11 -07:00 -
fix: remove mcp-types from app server protocol (#4537)
We continue the separation between `codex app-server` and `codex mcp-server`. In particular, we introduce a new crate, `codex-app-server-protocol`, and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it `codex-rs/app-server-protocol/src/protocol.rs`. Because `ConversationId` was defined in `mcp_protocol.rs`, we move it into its own file, `codex-rs/protocol/src/conversation_id.rs`, and because it is referenced in a ton of places, we have to touch a lot of files as part of this PR. We also decide to get away from proper JSON-RPC 2.0 semantics, so we also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which is basically the same `JSONRPCMessage` type defined in `mcp-types` except with all of the `"jsonrpc": "2.0"` removed. Getting rid of `"jsonrpc": "2.0"` makes our serialization logic considerably simpler, as we can lean heavier on serde to serialize directly into the wire format that we use now.
Michael Bolin ·
2025-10-01 02:16:26 +00:00