Commit Graph

30 Commits

  • [hooks] use a user message > developer message for prompt continuation (#14867)
    ## Summary
    
    Persist Stop-hook continuation prompts as `user` messages instead of
    hidden `developer` messages + some requested integration tests
    
    This is a followup to @pakrym 's comment in
    https://github.com/openai/codex/pull/14532 to make sure stop-block
    continuation prompts match training for turn loops
    
    - Stop continuation now writes `<hook_prompt hook_run_id="...">stop
    hook's user prompt<hook_prompt>`
    - Introduces quick-xml dependency, though we already indirectly depended
    on it anyway via syntect
    - This PR only has about 500 lines of actual logic changes, the rest is
    tests/schema
    
    ## Testing
    
    Example run (with a sessionstart hook and 3 stop hooks) - this shows
    context added by session start, then two stop hooks sending their own
    additional prompts in a new turn. The model responds with a single
    message addressing both. Then when that turn ends, the hooks detect that
    they just ran using `stop_hook_active` and decide not to infinite loop
    
    test files for this (unzip, move codex -> .codex):
    [codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)
    
    ```
    › cats
    
    
    • Running SessionStart hook: lighting the observatory
    
    SessionStart hook (completed)
      warning: Hi, I'm a session start hook for wizard-tower (startup).
      hook context: A wimboltine stonpet is an exotic cuisine from hyperspace
    
    • Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
      cat facts, cat breeds, cat names, or build something cat-themed in this repo.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: cook the stonpet
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: eat the cooked stonpet
    
    • Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
      rested until the hyperspace juices settle.
    
      Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
      smoky, bright, and totally out of this dimension.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    ```
  • Return image URL from view_image tool (#15072)
    Cleanup image semantics in code mode.
    
    `view_image` now returns `{image_url:string, details?: string}` 
    
    `image()` now allows both string parameter and `{image_url:string,
    details?: string}`
  • client side modelinfo overrides (#12101)
    TL;DR
    Add top-level `model_catalog_json` config support so users can supply a
    local model catalog override from a JSON file path (including adding new
    models) without backend changes.
    
    ### Problem
    Codex previously had no clean client-side way to replace/overlay model
    catalog data for local testing of model metadata and new model entries.
    
    ### Fix
    - Add top-level `model_catalog_json` config field (JSON file path).
    - Apply catalog entries when resolving `ModelInfo`:
      1. Base resolved model metadata (remote/fallback)
      2. Catalog overlay from `model_catalog_json`
    3. Existing global top-level overrides (`model_context_window`,
    `model_supports_reasoning_summaries`, etc.)
    
    ### Note
    Will revisit per-field overrides in a follow-up
    
    ### Tests
    Added tests
  • feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
    We started working with MCP in Codex before
    https://crates.io/crates/rmcp was mature, so we had our own crate for
    MCP types that was generated from the MCP schema:
    
    
    https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md
    
    Now that `rmcp` is more mature, it makes more sense to use their MCP
    types in Rust, as they handle details (like the `_meta` field) that our
    custom version ignored. Though one advantage that our custom types had
    is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
    whereas the types in `rmcp` do not. As such, part of the work of this PR
    is leveraging the adapters between `rmcp` types and the serializable
    types that are API for us (app server and MCP) introduced in #10356.
    
    Note this PR results in a number of changes to
    `codex-rs/app-server-protocol/schema`, which merit special attention
    during review. We must ensure that these changes are still
    backwards-compatible, which is possible because we have:
    
    ```diff
    - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
    + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
    ```
    
    so `ContentBlock` has been replaced with the more general `JsonValue`.
    Note that `ContentBlock` was defined as:
    
    ```typescript
    export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
    ```
    
    so the deletion of those individual variants should not be a cause of
    great concern.
    
    Similarly, we have the following change in
    `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
    
    ```
    - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
    + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
    ```
    
    so:
    
    - `annotations?: ToolAnnotations` ➡️ `JsonValue`
    - `inputSchema: ToolInputSchema` ➡️ `JsonValue`
    - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
    
    and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
    * #10357
    * __->__ #10349
    * #10356
  • feat(core) RequestRule (#9489)
    ## Summary
    Instead of trying to derive the prefix_rule for a command mechanically,
    let's let the model decide for us.
    
    ## Testing
    - [x] tested locally
  • fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
    Changes the `writable_roots` field of the `WorkspaceWrite` variant of
    the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
    This is helpful because now callers can be sure the value is an absolute
    path rather than a relative one. (Though when using an absolute path in
    a Seatbelt config policy, we still have to _canonicalize_ it first.)
    
    Because `writable_roots` can be read from a config file, it is important
    that we are able to resolve relative paths properly using the parent
    folder of the config file as the base path.
  • Fix: gracefully error out for unsupported images (#7478)
    Fix for #7459 
    ## What
    Since codex errors out for unsupported images, stop attempting to
    base64/attach them and instead emit a clear placeholder when the file
    isn’t a supported image MIME.
    
    ## Why
    Local uploads for unsupported formats (e.g., SVG/GIF/etc.) were
    dead-ending after decode failures because of the 400 retry loop. Users
    now get an explicit “cannot attach … unsupported image format …”
    response.
    
    ## How
    Replace the fallback read/encode path with MIME detection that bails out
    for non-image or unsupported image types, returning a consistent
    placeholder. Unreadable and invalid images still produce their existing
    error placeholders.
  • chore: add cargo-deny configuration (#7119)
    - add GitHub workflow running cargo-deny on push/PR
    - document cargo-deny allowlist with workspace-dep notes and advisory
    ignores
    - align workspace crates to inherit version/edition/license for
    consistent checks
  • fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
    This adds the following fields to `ThreadStartResponse` and
    `ThreadResumeResponse`:
    
    ```rust
        pub model: String,
        pub model_provider: String,
        pub cwd: PathBuf,
        pub approval_policy: AskForApproval,
        pub sandbox: SandboxPolicy,
        pub reasoning_effort: Option<ReasoningEffort>,
    ```
    
    This is important because these fields are optional in
    `ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
    able to determine what values were ultimately used to start/resume the
    conversation. (Though note that any of these could be changed later
    between turns in the conversation.)
    
    Though to get this information reliably, it must be read from the
    internal `SessionConfiguredEvent` that is created in response to the
    start of a conversation. Because `SessionConfiguredEvent` (as defined in
    `codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
    number of them had to be added as part of this PR.
    
    Because `SessionConfiguredEvent` is referenced in many tests, test
    instances of `SessionConfiguredEvent` had to be updated, as well, which
    is why this PR touches so many files.
  • chore: merge git crates (#5909)
    Merge `git-apply` and `git-tooling` into `utils/`
  • feat: image resizing (#5446)
    Add image resizing on the client side to reduce load on the API
  • Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
    Adds a new ItemStarted event and delivers UserMessage as the first item
    type (more to come).
    
    
    Renames `InputItem` to `UserInput` considering we're using the `Item`
    suffix for actual items.
  • Generate JSON schema for app-server protocol (#5063)
    Add annotations and an export script that let us generate app-server
    protocol types as typescript and JSONSchema.
    
    The script itself is a bit hacky because we need to manually label some
    of the types. Unfortunately it seems that enum variants don't get good
    names by default and end up with something like `EventMsg1`,
    `EventMsg2`, etc. I'm not an expert in this by any means, but since this
    is only run manually and we already need to enumerate the types required
    to describe the protocol, it didn't seem that much worse. An ideal
    solution here would be to have some kind of root that we could generate
    schemas for in one go, but I'm not sure if that's compatible with how we
    generate the protocol today.
  • fix: remove mcp-types from app server protocol (#4537)
    We continue the separation between `codex app-server` and `codex
    mcp-server`.
    
    In particular, we introduce a new crate, `codex-app-server-protocol`,
    and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it
    `codex-rs/app-server-protocol/src/protocol.rs`.
    
    Because `ConversationId` was defined in `mcp_protocol.rs`, we move it
    into its own file, `codex-rs/protocol/src/conversation_id.rs`, and
    because it is referenced in a ton of places, we have to touch a lot of
    files as part of this PR.
    
    We also decide to get away from proper JSON-RPC 2.0 semantics, so we
    also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which
    is basically the same `JSONRPCMessage` type defined in `mcp-types`
    except with all of the `"jsonrpc": "2.0"` removed.
    
    Getting rid of `"jsonrpc": "2.0"` makes our serialization logic
    considerably simpler, as we can lean heavier on serde to serialize
    directly into the wire format that we use now.
  • fix: use macros to ensure request/response symmetry (#4529)
    Manually curating `protocol-ts/src/lib.rs` was error-prone, as expected.
    I finally asked Codex to write some Rust macros so we can ensure that:
    
    - For every variant of `ClientRequest` and `ServerRequest`, there is an
    associated `params` and `response` type.
    - All response types are included automatically in the output of `codex
    generate-ts`.
  • chore: unify cargo versions (#4044)
    Unify cargo versions at root
  • Switch to uuid_v7 and tighten ConversationId usage (#3819)
    Make sure conversations have a timestamp.
  • fix: include rollout_path in NewConversationResponse (#3352)
    Adding the `rollout_path` to the `NewConversationResponse` makes it so a
    client can perform subsequent operations on a `(ConversationId,
    PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352).
    * #3353
    * __->__ #3352
  • feat: Run cargo shear during CI (#3338)
    Run cargo shear as part of the CI to ensure no unused dependencies
  • Generate more typescript types and return conversation id with ConversationSummary (#3219)
    This PR does multiple things that are necessary for conversation resume
    to work from the extension. I wanted to make sure everything worked so
    these changes wound up in one PR:
    1. Generate more ts types
    2. Resume rollout history files rather than create a new one every time
    it is resumed so you don't see a duplicate conversation in history for
    every resume. Chatted with @aibrahim-oai to verify this
    3. Return conversation_id in conversation summaries
    4. [Cleanup] Use serde and strong types for a lot of the rollout file
    parsing
  • Format large numbers in a more readable way. (#2046)
    - In the bottom line of the TUI, print the number of tokens to 3 sigfigs
      with an SI suffix, e.g. "1.23K".
    - Elsewhere where we print a number, I figure it's worthwhile to print
      the exact number, because e.g. it's a summary of your session. Here we print
      the numbers comma-separated.
  • fix: fix serde_as annotation and verify with test (#3170)
    I didn't do https://github.com/openai/codex/pull/3163 correctly the
    first time: now verified with a test.
  • fix: use a more efficient wire format for ExecCommandOutputDeltaEvent.chunk (#3163)
    When serializing to JSON, the existing solution created an enormous
    array of ints, which is far more bytes on the wire than a base64-encoded
    string would be.
  • Move models.rs to protocol (#2595)
    Moving models.rs to protocol so we can use them in `Codex` operations
  • chore: move mcp-server/src/wire_format.rs to protocol/src/mcp_protocol.rs (#2423)
    The existing `wire_format.rs` should share more types with the
    `codex-protocol` crate (like `AskForApproval` instead of maintaining a
    parallel `CodexToolCallApprovalPolicy` enum), so this PR moves
    `wire_format.rs` into `codex-protocol`, renaming it as
    `mcp-protocol.rs`. We also de-dupe types, where appropriate.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2423).
    * #2424
    * __->__ #2423