Commit Graph

141 Commits

  • [core] add optional status_code to error events (#6865)
    We want to surface error status codes to clients. Add an optional
    `status_code` to error events (thread error, error, stream error) so the
    app server can later expose the status code to the client.
    
    In the event log:
    ```
    < {
    <   "method": "codex/event/stream_error",
    <   "params": {
    <     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
    <     "id": "0",
    <     "msg": {
    <       "message": "Reconnecting... 5/5",
    <       "status_code": 401,
    <       "type": "stream_error"
    <     }
    <   }
    < }
    < {
    <   "method": "codex/event/error",
    <   "params": {
    <     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
    <     "id": "0",
    <     "msg": {
    <       "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a0cb03a485067f7-SJC",
    <       "status_code": 401,
    <       "type": "error"
    <     }
    <   }
    < }
    ```
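    
    A minimal sketch of how a client might branch on the new field. The
    `ErrorEvent` struct, the `ErrorKind` enum, and `classify` are
    hypothetical names for illustration, not the types used in Codex:
    
    ```rust
    /// Hypothetical, simplified mirror of an error event payload.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ErrorEvent {
        pub message: String,
        /// HTTP status of the failed request, when one is known.
        pub status_code: Option<u16>,
    }
    
    /// Hypothetical client-side classification; not part of Codex.
    #[derive(Debug, PartialEq)]
    pub enum ErrorKind {
        Unauthorized,
        RateLimited,
        Server,
        Other,
    }
    
    /// Map the optional status code to a coarse error kind.
    pub fn classify(event: &ErrorEvent) -> ErrorKind {
        match event.status_code {
            Some(401) => ErrorKind::Unauthorized,
            Some(429) => ErrorKind::RateLimited,
            Some(code) if (500..600).contains(&code) => ErrorKind::Server,
            // Missing or unrecognized codes fall through to a generic kind.
            _ => ErrorKind::Other,
        }
    }
    ```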
  • storing credits (#6858)
    Expand the rate-limit cache and TUI: store credit snapshots alongside
    the primary and secondary windows, and render “Credits” when the backend
    reports that they exist (unlimited vs. rounded integer balances).
  • fix: typos in model picker (#6859)
  • fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
    This adds the following fields to `ThreadStartResponse` and
    `ThreadResumeResponse`:
    
    ```rust
        pub model: String,
        pub model_provider: String,
        pub cwd: PathBuf,
        pub approval_policy: AskForApproval,
        pub sandbox: SandboxPolicy,
        pub reasoning_effort: Option<ReasoningEffort>,
    ```
    
    This is important because these fields are optional in
    `ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
    able to determine what values were ultimately used to start/resume the
    conversation. (Though note that any of these could be changed later
    between turns in the conversation.)
    
    Though to get this information reliably, it must be read from the
    internal `SessionConfiguredEvent` that is created in response to the
    start of a conversation. Because `SessionConfiguredEvent` (as defined in
    `codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
    number of them had to be added as part of this PR.
    
    Because `SessionConfiguredEvent` is referenced in many tests, test
    instances of `SessionConfiguredEvent` had to be updated, as well, which
    is why this PR touches so many files.
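    
    To illustrate why the caller cannot infer these values from its own
    request, here is a rough sketch with two of the fields. `StartParams`,
    `StartResponse`, and `resolve` are hypothetical simplifications; in
    Codex the values are read from `SessionConfiguredEvent` rather than
    computed like this:
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical, trimmed-down request: every field is optional.
    #[derive(Default)]
    pub struct StartParams {
        pub model: Option<String>,
        pub cwd: Option<PathBuf>,
    }
    
    /// Hypothetical response carrying the values that were actually used.
    #[derive(Debug, PartialEq)]
    pub struct StartResponse {
        pub model: String,
        pub cwd: PathBuf,
    }
    
    /// Fill in server-side defaults for anything the caller left unset.
    pub fn resolve(params: StartParams, default_model: &str, default_cwd: &str) -> StartResponse {
        StartResponse {
            model: params.model.unwrap_or_else(|| default_model.to_string()),
            cwd: params.cwd.unwrap_or_else(|| PathBuf::from(default_cwd)),
        }
    }
    ```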
  • feat: remote compaction (#6795)
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • [app-server] feat: add v2 command execution approval flow (#6758)
    This PR adds the API V2 version of the command‑execution approval flow
    for the shell tool.
    
    This PR wires the new RPC (`item/commandExecution/requestApproval`, V2
    only) and related events (`item/started`, `item/completed`, and
    `item/commandExecution/delta`, which are emitted in both V1 and V2)
    through the app-server protocol. The new approval RPC is only sent when
    the user initiates a turn with the new `turn/start` API so we don't
    break backwards compatibility with VSCE.
    
    The approach I took was to make as few changes to the Codex core as
    possible, leveraging existing `EventMsg` core events, and translating
    those in app-server. I did have to add additional fields to
    `EventMsg::ExecCommandEndEvent` to capture the command's input so that
    app-server can statelessly transform these events to a
    `ThreadItem::CommandExecution` item for the `item/completed` event.
    
    Once we stabilize the API and it's complete enough for our partners, we
    can work on migrating the core to be aware of command execution items as
    a first-class concept.
    
    **Note**: We'll need follow-up work to make sure these APIs work for the
    unified exec tool, but we will wait until that's stable and landed
    before doing a pass on app-server.
    
    Example payloads below:
    ```
    {
      "method": "item/started",
      "params": {
        "item": {
          "aggregatedOutput": null,
          "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
          "cwd": "/Users/owen/repos/codex/codex-rs",
          "durationMs": null,
          "exitCode": null,
          "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
          "parsedCmd": [
            {
              "cmd": "touch /tmp/should-trigger-approval",
              "type": "unknown"
            }
          ],
          "status": "inProgress",
          "type": "commandExecution"
        }
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "method": "item/commandExecution/requestApproval",
      "params": {
        "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU",
        "parsedCmd": [
          {
            "cmd": "touch /tmp/should-trigger-approval",
            "type": "unknown"
          }
        ],
        "reason": "Need to create file in /tmp which is outside workspace sandbox",
        "risk": null,
        "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a",
        "turnId": "1"
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "result": {
        "acceptSettings": {
          "forSession": false
        },
        "decision": "accept"
      }
    }
    ```
    
    ```
    {
      "params": {
        "item": {
          "aggregatedOutput": null,
          "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
          "cwd": "/Users/owen/repos/codex/codex-rs",
          "durationMs": 224,
          "exitCode": 0,
          "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
          "parsedCmd": [
            {
              "cmd": "touch /tmp/should-trigger-approval",
              "type": "unknown"
            }
          ],
          "status": "completed",
          "type": "commandExecution"
        }
      }
    }
    ```
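    
    The stateless transformation described above could look roughly like
    the sketch below. The struct and field names are hypothetical
    simplifications of `EventMsg::ExecCommandEndEvent` and
    `ThreadItem::CommandExecution`, the `Failed` status is an assumption,
    and joining arguments with spaces is a naive stand-in for proper shell
    quoting:
    
    ```rust
    /// Hypothetical, simplified stand-in for the core end-of-command event.
    pub struct ExecCommandEnd {
        pub call_id: String,
        pub command: Vec<String>,
        pub cwd: String,
        pub exit_code: i32,
        pub duration_ms: u64,
    }
    
    #[derive(Debug, PartialEq)]
    pub enum Status {
        Completed,
        /// Assumption: a non-zero exit code maps to a failed status.
        Failed,
    }
    
    /// Hypothetical, simplified stand-in for the v2 command execution item.
    #[derive(Debug, PartialEq)]
    pub struct CommandExecutionItem {
        pub id: String,
        pub command: String,
        pub cwd: String,
        pub exit_code: Option<i32>,
        pub duration_ms: Option<u64>,
        pub status: Status,
    }
    
    /// Build the completed item from the end event alone, with no lookup of
    /// earlier state. This only works because the event carries the input.
    pub fn to_completed_item(event: &ExecCommandEnd) -> CommandExecutionItem {
        CommandExecutionItem {
            id: event.call_id.clone(),
            command: event.command.join(" "), // naive join, no quoting
            cwd: event.cwd.clone(),
            exit_code: Some(event.exit_code),
            duration_ms: Some(event.duration_ms),
            status: if event.exit_code == 0 { Status::Completed } else { Status::Failed },
        }
    }
    ```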
  • core/tui: non-blocking MCP startup (#6334)
    This makes MCP startup not block TUI startup. Messages sent while MCPs
    are booting will be queued.
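    
    The queueing behavior can be pictured with a small sketch.
    `StartupQueue` and its methods are hypothetical and only illustrate the
    idea of holding messages until startup completes; the real
    implementation may differ:
    
    ```rust
    use std::collections::VecDeque;
    
    /// Hypothetical buffer for messages submitted before MCP servers are up.
    pub struct StartupQueue {
        ready: bool,
        pending: VecDeque<String>,
    }
    
    impl StartupQueue {
        pub fn new() -> Self {
            StartupQueue { ready: false, pending: VecDeque::new() }
        }
    
        /// Returns the messages that should be dispatched right now.
        pub fn submit(&mut self, message: String) -> Vec<String> {
            if self.ready {
                vec![message]
            } else {
                self.pending.push_back(message);
                Vec::new()
            }
        }
    
        /// Marks startup as finished and releases queued messages in order.
        pub fn mark_ready(&mut self) -> Vec<String> {
            self.ready = true;
            self.pending.drain(..).collect()
        }
    }
    ```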
    
    
    https://github.com/user-attachments/assets/96e1d234-5d8f-4932-a935-a675d35c05e0
    
    
    Fixes #6317
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • Handle "Don't Trust" directory selection in onboarding (#4941)
    Fixes #4940
    Fixes #4892
    
    When selecting "No, ask me to approve edits and commands" during
    onboarding, the code wasn't applying the correct approval policy,
    causing Codex to block all write operations instead of requesting
    approval.
    
    This PR fixes the issue by persisting the "DontTrust" decision in
    config.toml as `trust_level = "untrusted"` and handling it in the
    sandbox and approval policy logic, so Codex correctly asks for approval
    before making changes.
    
    ## Before (bug)
    <img width="709" height="500" alt="bef"
    src="https://github.com/user-attachments/assets/5aced26d-d810-4754-879a-89d9e4e0073b"
    />
    
    ## After (fixed)
    <img width="713" height="359" alt="aft"
    src="https://github.com/user-attachments/assets/9887bbcb-a9a5-4e54-8e76-9125a782226b"
    />
    
    ---------
    
    Co-authored-by: Eric Traut <etraut@openai.com>
  • feat: better UI for unified_exec (#6515)
    <img width="376" height="132" alt="Screenshot 2025-11-12 at 17 36 22"
    src="https://github.com/user-attachments/assets/ce693f0d-5ca0-462e-b170-c20811dcc8d5"
    />
  • [app-server] small fixes for JSON schema export and one-of types (#6614)
    A partner is consuming our generated JSON schema bundle for app-server
    and identified a few issues:
    - not all polymorphic / one-of types have a type discriminator
    - `"$ref": "#/definitions/v2/SandboxPolicy"` is missing
    - "Option<>" is an invalid schema name, and also unnecessary
    
    This PR:
    - adds the type discriminator to the various types that are missing it
    except for `SessionSource` and `SubAgentSource` because they are
    serialized to disk (adding this would break backwards compat for
    resume), and they should not be necessary to consume for an integration
    with app-server.
    - removes the special handling in `export.rs` of various types like
    SandboxPolicy, which turned out to be unnecessary and incorrect
    - filters out `Option<>` which was auto-generated for request params
    that don't need a body
    
    For context, we currently pull in wayyy more types than we need through
    the `EventMsg` god object which we are **not** planning to expose in API
    v2 (this is how I suspect `SessionSource` and `SubAgentSource` are being
    pulled in). But until we have all the necessary v2 notifications in
    place that will allow us to remove `EventMsg`, we will keep exporting it
    for now.
  • [App-server] add new v2 events: item/reasoning/delta, item/agentMessage/delta & item/reasoning/summaryPartAdded (#6559)
    Core event to app-server event mapping:
    1. `codex/event/reasoning_content_delta` ->
    `item/reasoning/summaryTextDelta`.
    2. `codex/event/reasoning_raw_content_delta` ->
    `item/reasoning/textDelta`.
    3. `codex/event/agent_message_content_delta` ->
    `item/agentMessage/delta`.
    4. `codex/event/agent_reasoning_section_break` ->
    `item/reasoning/summaryPartAdded`.
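    
    The mapping above, written as a lookup. The function name
    `v2_method_for` is hypothetical; the method strings are taken from the
    list:
    
    ```rust
    /// Translate a core event method name into its v2 notification name.
    /// Returns `None` for events that have no mapping in this list.
    pub fn v2_method_for(core_event: &str) -> Option<&'static str> {
        match core_event {
            "codex/event/reasoning_content_delta" => Some("item/reasoning/summaryTextDelta"),
            "codex/event/reasoning_raw_content_delta" => Some("item/reasoning/textDelta"),
            "codex/event/agent_message_content_delta" => Some("item/agentMessage/delta"),
            "codex/event/agent_reasoning_section_break" => Some("item/reasoning/summaryPartAdded"),
            _ => None,
        }
    }
    ```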
    
    Also added a change in core to pass down content index, summary index
    and item id from events.
    
    Tested with `git checkout owen/app_server_test_client && cargo run
    -p codex-app-server-test-client -- send-message-v2 "hello"` and verified
    that the new events are emitted correctly.
  • Reasoning level update (#6586)
    Automatically update reasoning levels when migrating between models
  • Set verbosity to low for 5.1 (#6568)
    And improve test coverage
  • feat: shell_command tool (#6510)
    This adds support for a new variant of the shell tool behind a flag. To
    test, run `codex` with `--enable shell_command_tool`, which will
    register the tool with Codex under the name `shell_command` that accepts
    the following shape:
    
    ```python
    {
      command: str,
      workdir: str | None,
      timeout_ms: int | None,
      with_escalated_permissions: bool | None,
      justification: str | None,
    }
    ```
    
    This is comparable to the existing tool registered under
    `shell`/`container.exec`. The primary difference is that it accepts
    `command` as a `str` instead of a `str[]`. The `shell_command` tool
    executes by running `execvp(["bash", "-lc", command])`, though the exact
    arguments to `execvp(3)` depend on the user's default shell.
    
    The hypothesis is that this will simplify things for the model. For
    example, on Windows, instead of generating:
    
    ```json
    {"command": ["pwsh.exe", "-NoLogo", "-Command", "ls -Name"]}
    ```
    
    The model could simply generate:
    
    ```json
    {"command": "ls -Name"}
    ```
    
    As part of this change, I extracted some logic out of `user_shell.rs` as
    `Shell::derive_exec_args()` so that it can be reused in
    `codex-rs/core/src/tools/handlers/shell.rs`. Note the original code
    generated exec arg lists like:
    
    ```javascript
    ["bash", "-lc", command]
    ["zsh", "-lc", command]
    ["pwsh.exe", "-NoProfile", "-Command", command]
    ```
    
    Using `-l` for Bash and Zsh, but then specifying `-NoProfile` for
    PowerShell seemed inconsistent to me, so I changed this in the new
    implementation while also adding a `use_login_shell: bool` option to
    make this explicit. If we decide to add a `login: bool` to
    `ShellCommandToolCallParams` like we have for unified exec:
    
    
    https://github.com/openai/codex/blob/807e2c27f0a9f2e85c50e7e6df5533f0d9b853c7/codex-rs/core/src/tools/handlers/unified_exec.rs#L33-L34
    
    Then this should make it straightforward to support.
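    
    A rough sketch of what an explicit `use_login_shell` switch could look
    like. This is not the actual `Shell::derive_exec_args()`; the `Shell`
    enum, the free function, and in particular the choice to drop
    `-NoProfile` for PowerShell in login mode are assumptions made for
    illustration:
    
    ```rust
    /// Hypothetical set of supported shells.
    #[derive(Clone, Copy)]
    pub enum Shell {
        Bash,
        Zsh,
        PowerShell,
    }
    
    /// Build the argv for running `command` in the given shell.
    /// Assumption: "login" means `-l` for Bash/Zsh and loading the profile
    /// (no `-NoProfile`) for PowerShell.
    pub fn derive_exec_args(shell: Shell, command: &str, use_login_shell: bool) -> Vec<String> {
        let args: Vec<&str> = match (shell, use_login_shell) {
            (Shell::Bash, true) => vec!["bash", "-lc", command],
            (Shell::Bash, false) => vec!["bash", "-c", command],
            (Shell::Zsh, true) => vec!["zsh", "-lc", command],
            (Shell::Zsh, false) => vec!["zsh", "-c", command],
            (Shell::PowerShell, true) => vec!["pwsh.exe", "-Command", command],
            (Shell::PowerShell, false) => vec!["pwsh.exe", "-NoProfile", "-Command", command],
        };
        args.into_iter().map(String::from).collect()
    }
    ```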
  • Include reasoning tokens in the context window calculation (#6161)
    This value is used to determine whether mid-turn compaction is required.
    Reasoning items are only excluded between turns (and soon will start to
    be preserved even across turns), so it's incorrect to subtract
    `reasoning_output_tokens` mid-turn.
    
    This will result in higher values reported between turns but we are also
    looking into preserving reasoning items for the entire conversation to
    improve performance and caching.
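    
    The difference between the two calculations, as a small worked sketch.
    The `TokenUsage` struct and both function names are hypothetical, and
    the sketch assumes `output_tokens` already includes the reasoning
    tokens:
    
    ```rust
    /// Hypothetical usage record for the most recent model response.
    pub struct TokenUsage {
        pub input_tokens: u64,
        /// Assumption: this count already includes reasoning tokens.
        pub output_tokens: u64,
        pub reasoning_output_tokens: u64,
    }
    
    /// Previous estimate: reasoning tokens were subtracted.
    pub fn tokens_excluding_reasoning(usage: &TokenUsage) -> u64 {
        (usage.input_tokens + usage.output_tokens).saturating_sub(usage.reasoning_output_tokens)
    }
    
    /// Estimate after this change: reasoning tokens count toward the window.
    pub fn tokens_including_reasoning(usage: &TokenUsage) -> u64 {
        usage.input_tokens + usage.output_tokens
    }
    ```
    
    With 1000 input tokens and 500 output tokens, 200 of them reasoning,
    the old estimate is 1300 and the new one is 1500.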
  • Changes to sandbox command assessment feature based on initial experiment feedback (#6091)
    * Removed sandbox risk categories; feedback indicates that these are not
    that useful and "less is more"
    * Tweaked the assessment prompt to generate terser answers
    * Fixed bug in orchestrator that prevents this feature from being
    exposed in the extension
  • Add warning on compact (#6052)
    This PR introduces the ability for `core` to send `warnings`, just as it
    can send `errors`. It also sends a warning on compaction.
    
    <img width="811" height="187" alt="image"
    src="https://github.com/user-attachments/assets/0947a42d-b720-420d-b7fd-115f8a65a46a"
    />
  • [app-server] remove serde(skip_serializing_if = "Option::is_none") annotations (#5939)
    We had this annotation everywhere in app-server APIs which made it so
    that fields get serialized as `field?: T`, meaning if the field is
    `None` we would omit the field in the payload. Removing this annotation
    changes it so that we return `field: T | null` instead, which makes
    codex app-server's API more aligned with the convention of public OpenAI
    APIs like Responses.
    
    Separately, remove the `#[ts(optional_fields = nullable)]` annotations
    that were recently added which made all the TS types become `field?: T |
    null` which is not great since clients need to handle undefined and
    null.
    
    I think generally it'll be best to have optional types be either:
    - `field: T | null` (preferred, aligned with public OpenAI APIs)
    - `field?: T` where we have to, such as types generated from the MCP
    schema:
    https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-06-18/schema.ts
    (see changes to `mcp-types/`)
    
    I updated @etraut-openai's unit test to check that all generated TS
    types are one or the other, not both (so will error if we have a type
    that has `field?: T | null`). I don't think there's currently a good use
    case for that - but we can always revisit.
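    
    The idea behind that check can be sketched as a plain text scan. This
    is not the actual unit test; `mixed_optional_nullable_fields` is a
    hypothetical helper, and splitting on `,`, `;`, `{` and `}` is a naive
    way to isolate fields that would break on nested generic types:
    
    ```rust
    /// Return every field in generated TS source that is both optional
    /// (`?:`) and nullable (`| null`), which is the shape we want to avoid.
    pub fn mixed_optional_nullable_fields(ts_source: &str) -> Vec<String> {
        ts_source
            .split(|c: char| matches!(c, ',' | ';' | '{' | '}'))
            .map(str::trim)
            .filter(|field| field.contains("?:") && field.contains("| null"))
            .map(String::from)
            .collect()
    }
    ```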
  • Send delegate header (#5942)
    Send delegate type header
  • Add item streaming events (#5546)
    Adds AgentMessageContentDelta, ReasoningContentDelta,
    ReasoningRawContentDelta item streaming events while maintaining
    compatibility for old events.
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • Delegate review to codex instance (#5572)
    In this PR, I am exploring migrating task kind to an invocation of
    Codex. The main reason would be getting rid of multiple
    `ConversationHistory` states and streamlining our context/history
    management.
    
    This approach depends on opening a channel between the sub-codex and
    codex. This channel is responsible for forwarding `interactive`
    (`approvals`) and `non-interactive` events. The `task` is responsible
    for handling those events.
    
    This opens the door for implementing `codex as a tool`, replacing
    `compact` and `review`, and potentially subagents.
    
    One consideration is that this code is very similar to `app-server`,
    especially in the approval part. If in the future we want an interactive
    `sub-codex`, we should consider using `codex-mcp`.
  • Add a wrapper around raw response items (#5923)
    We currently have nested enums when sending raw response items in the
    app-server protocol. This makes downstream schemas confusing because we
    need to embed `type`-discriminated enums within each other.
    
    This PR adds a small wrapper around the response item so we can keep the
    schemas separate
  • Add missing "nullable" macro to protocol structs that contain optional fields (#5901)
    This PR addresses a current hole in the TypeScript code generation for
    the API server protocol. Fields that are marked as `Option<>` in the
    Rust code are serialized such that the field is omitted when the value
    is `None`, so it appears as `undefined` on the TypeScript side, but the
    TS type indicates (incorrectly) that it is always defined but possibly
    `null`. This can lead to subtle errors that the TypeScript compiler
    doesn't catch. The fix is to include the
    `#[ts(optional_fields = nullable)]` macro for all protocol structs that
    contain one or more `Option<>` fields.
    
    This PR also includes a new test that validates that all TS protocol
    code containing "| null" in its type is marked optional ("?") to catch
    cases where `#[ts(optional_fields = nullable)]` is omitted.
  • feat: deprecation warning (#5825)
    <img width="955" height="311" alt="Screenshot 2025-10-28 at 14 26 25"
    src="https://github.com/user-attachments/assets/99729b3d-3bc9-4503-aab3-8dc919220ab4"
    />
  • chore: merge git crates (#5909)
    Merge `git-apply` and `git-tooling` into `utils/`
  • feature: Add "!cmd" user shell execution (#2471)
    This change lets users run local shell commands directly from the TUI by
    prefixing their input with `!` (e.g. `!ls`). Output is truncated to keep
    the exec cell usable, and Ctrl-C cleanly interrupts long-running
    commands (e.g. `!sleep 10000`).
    
    **Summary of changes**
    
    - Route Op::RunUserShellCommand through a dedicated UserShellCommandTask
    (core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs.
    - Reuse the existing tool router: the task constructs a ToolCall for the
    local_shell tool and relies on ShellHandler, so no manual MCP tool
    lookup is required.
    - Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the
    TUI can show command metadata, live output, and exit status.
    
    **End-to-end flow**
    
    **TUI handling**
    
    1. ChatWidget::submit_user_message (TUI) intercepts messages starting
    with !.
    2. Non-empty commands dispatch Op::RunUserShellCommand { command };
    empty commands surface a help hint.
    3. No UserInput items are created, so nothing is enqueued for the model.
    
    **Core submission loop**
    
    4. The submission loop routes the op to handlers::run_user_shell_command
    (core/src/codex.rs).
    5. A fresh TurnContext is created and Session::spawn_user_shell_command
    enqueues UserShellCommandTask.
    
    **Task execution**
    
    6. UserShellCommandTask::run emits TaskStartedEvent, formats the
    command, and prepares a ToolCall targeting local_shell.
    7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.
    
    **Shell tool runtime**
    
    8. ShellHandler::run_exec_like launches the process via the unified exec
    runtime, honoring sandbox and shell policies, and emits
    ExecCommandBegin/End.
    9. Stdout/stderr are captured for the UI, but the task does not turn the
    resulting ToolOutput into a model response.
    
    **Completion**
    
    10. After ExecCommandEnd, the task finishes without an assistant
    message; the session marks it complete and the exec cell displays the
    final output.
    
    **Conversation context**
    
    - The command and its output never enter the conversation history or the
    model prompt; the flow is local-only.
    - Only exec/task events are emitted for UI rendering.
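    
    The TUI-side routing in steps 1 to 3 can be sketched as follows. The
    `Submission` enum and `route_input` are hypothetical names; trimming
    the command is an assumption about how whitespace is handled:
    
    ```rust
    /// Hypothetical outcome of inspecting one line of user input.
    #[derive(Debug, PartialEq)]
    pub enum Submission {
        /// `!cmd`: run locally, nothing is sent to the model.
        RunUserShellCommand { command: String },
        /// A bare `!`: show a help hint instead of running anything.
        ShowHelpHint,
        /// Anything else is a normal message for the model.
        UserMessage(String),
    }
    
    pub fn route_input(text: &str) -> Submission {
        match text.strip_prefix('!') {
            Some(rest) if rest.trim().is_empty() => Submission::ShowHelpHint,
            Some(rest) => Submission::RunUserShellCommand { command: rest.trim().to_string() },
            None => Submission::UserMessage(text.to_string()),
        }
    }
    ```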
    
    **Demo video**
    
    
    https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996
  • verify mime type of images (#5888)
    solves: https://github.com/openai/codex/issues/5675
    
    Block non-image uploads in the view_image workflow. We now confirm the
    file’s MIME type is image/* before building the data URL; otherwise we
    emit an “unsupported MIME type” error to the model. This stops the agent
    from sending application/json blobs that the Responses API rejects with
    400s.
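    
    A minimal sketch of the guard, assuming the MIME type has already been
    detected. `data_url_prefix` is a hypothetical helper and the error text
    is illustrative; base64 encoding of the file contents is left out:
    
    ```rust
    /// Return the data URL prefix for an image, or an error for any other
    /// MIME type so that no data URL is built at all.
    pub fn data_url_prefix(mime: &str) -> Result<String, String> {
        if mime.starts_with("image/") {
            Ok(format!("data:{mime};base64,"))
        } else {
            Err(format!("unsupported MIME type: {mime}"))
        }
    }
    ```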
    
    <img width="409" height="556" alt="Screenshot 2025-10-28 at 1 15 10 PM"
    src="https://github.com/user-attachments/assets/a92199e8-2769-4b1d-8e33-92d9238c90fe"
    />
  • [MCP] Render MCP tool call result images to the model (#5600)
    It's pretty amazing we have gotten here without the ability for the
    model to see image content from MCP tool calls.
    
    This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get
    adequate credit here, but I also want to get this fix in ASAP. I gave
    him a week to update it and haven't gotten a response, so I'm going to
    take it across the finish line.
    
    
    This test highlights how absurd the current situation is. I asked the
    model to read this image using the Chrome MCP:
    <img width="2378" height="674" alt="image"
    src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
    />
    
    After this change, it correctly outputs:
    > Captured the page: image shows a dark terminal-style UI labeled
    `OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
    working directory `/codex/codex-rs`
    (and more)  
    
    Before this change, it said:
    > Took the full-page screenshot you asked for. It shows a long,
    horizontally repeating pattern of stylized people in orange, light-blue,
    and mustard clothing, holding hands in alternating poses against a white
    background. No text or other graphics-just rows of flat illustration
    stretching off to the right.
    
    Without this change, the Figma, Playwright, Chrome, and other visual MCP
    servers are pretty much entirely useless.
    
    I tested this change with the OpenAI Responses API as well as a
    third-party completions API.
  • fix: move account struct to app-server-protocol and use camelCase (#5829)
    Makes sense to move this struct to `app-server-protocol/` since we want
    to serialize as camelCase, but we don't for structs defined in
    `protocol/`
    
    It was:
    ```
    export type Account = { "type": "ApiKey", api_key: string, } | { "type": "chatgpt", email: string | null, plan_type: PlanType, };
    ```
    
    But we want:
    ```
    export type Account = { "type": "apiKey", apiKey: string, } | { "type": "chatgpt", email: string | null, planType: PlanType, };
    ```
  • feat: image resizing (#5446)
    Add image resizing on the client side to reduce load on the API
  • feat: annotate conversations with model_provider for filtering (#5658)
    Because conversations that use the Responses API can have encrypted
    reasoning messages, trying to resume a conversation with a different
    provider could lead to confusing "failed to decrypt" errors. (This is
    reproducible by starting a conversation using ChatGPT login and resuming
    it as a conversation that uses OpenAI models via Azure.)
    
    This changes `ListConversationsParams` to take a `model_providers:
    Option<Vec<String>>` and adds `model_provider` on each
    `ConversationSummary` it returns so these cases can be disambiguated.
    
    Note this ended up making changes to
    `codex-rs/core/src/rollout/tests.rs` because it had a number of cases
    where it expected `Some` for the value of `next_cursor`, but the list of
    rollouts was complete, so according to this docstring:
    
    
    https://github.com/openai/codex/blob/bcd64c7e7231d6316a2377d1525a0fa74f21b783/codex-rs/app-server-protocol/src/protocol.rs#L334-L337
    
    If there are no more items to return, then `next_cursor` should be
    `None`. This PR updates that logic.
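    
    Both behaviors can be sketched with simplified types. The struct below
    keeps only two fields, `filter_by_provider` and `page` are hypothetical
    helpers, treating `None` as "no filtering" is an assumption, and the
    cursor is modeled as a plain index rather than the real cursor type:
    
    ```rust
    /// Hypothetical, trimmed-down conversation summary.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ConversationSummary {
        pub id: String,
        pub model_provider: String,
    }
    
    /// Keep only conversations whose provider is in the allowed list.
    /// Assumption: `None` means the caller did not ask for any filtering.
    pub fn filter_by_provider<'a>(
        items: &'a [ConversationSummary],
        model_providers: Option<&[String]>,
    ) -> Vec<&'a ConversationSummary> {
        items
            .iter()
            .filter(|summary| match model_providers {
                None => true,
                Some(allowed) => allowed.iter().any(|p| p == &summary.model_provider),
            })
            .collect()
    }
    
    /// Return one page and the next cursor, which is `None` once the
    /// listing is complete.
    pub fn page<T: Clone>(items: &[T], start: usize, page_size: usize) -> (Vec<T>, Option<usize>) {
        let start = start.min(items.len());
        let end = start.saturating_add(page_size).min(items.len());
        let next_cursor = if end < items.len() { Some(end) } else { None };
        (items[start..end].to_vec(), next_cursor)
    }
    ```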
    
    
    
    
    
    
  • Added model summary and risk assessment for commands that violate sandbox policy (#5536)
    This PR adds support for a model-based summary and risk assessment for
    commands that violate the sandbox policy and require user approval. This
    aids the user in evaluating whether the command should be approved.
    
    The feature works by taking a failed command and passing it back to the
    model and asking it to summarize the command, give it a risk level (low,
    medium, high) and a risk category (e.g. "data deletion" or "data
    exfiltration"). It uses a new conversation thread so the context in the
    existing thread doesn't influence the answer. If the call to the model
    fails or takes longer than 5 seconds, it falls back to the current
    behavior.
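    
    The timeout-and-fallback behavior could be sketched like this, using a
    worker thread and a channel. `assess_with_timeout` is a hypothetical
    helper and not how Codex issues the model call; note that the worker is
    not cancelled when the timeout fires, its result is simply ignored:
    
    ```rust
    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;
    
    /// Run `assess` on a worker thread and wait at most `timeout` for it.
    /// Returns `None` if the assessment fails or does not finish in time,
    /// in which case the caller falls back to the plain approval prompt.
    pub fn assess_with_timeout<T, F>(assess: F, timeout: Duration) -> Option<T>
    where
        T: Send + 'static,
        F: FnOnce() -> Option<T> + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            // The receiver may already be gone after a timeout; ignore that.
            let _ = tx.send(assess());
        });
        rx.recv_timeout(timeout).ok().flatten()
    }
    ```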
    
    For now, this is an experimental feature and is gated by a config key
    `experimental_sandbox_command_assessment`.
    
    Here is a screen shot of the approval prompt showing the risk assessment
    and summary.
    
    <img width="723" height="282" alt="image"
    src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
    />
  • [app-server] add new account method API stubs (#5527)
    These are the schema definitions for the new JSON-RPC APIs associated
    with accounts. These are not wired up to business logic yet and will
    currently throw an internal error indicating these are unimplemented.
  • Add new thread items and rewire event parsing to use them (#5418)
    1. Adds AgentMessage, Reasoning, WebSearch items.
    2. Switches the ResponseItem parsing to use new items and then also emit
    3. Removes user-item kind and filters out "special" (environment) user
    items when returning to clients.
  • [app-server] read rate limits API (#5302)
    Adds a `GET account/rateLimits/read` API to app-server. This calls the
    codex backend to fetch the user's current rate limits.
    
    This would be helpful in checking rate limits without having to send a
    message.
    
    For calling the codex backend usage API, I generated the types and
    manually copied the relevant ones into `codex-backend-openapi-types`.
    It would be nice to extend our internal openapi generator to support Rust
    so we don't have to run these manual steps.
    
  • Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
    Adds a new ItemStarted event and delivers UserMessage as the first item
    type (more to come).
    
    
    Renames `InputItem` to `UserInput` considering we're using the `Item`
    suffix for actual items.
  • Use int timestamps for rate limit reset_at (#5383)
    The backend will be returning unix timestamps (seconds since epoch)
    instead of RFC 3339 strings. This will make it more ergonomic for
    developers to integrate against - no string parsing.
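    
    With integer timestamps, a client can compute the remaining time with
    plain arithmetic. Both helper names below are hypothetical:
    
    ```rust
    use std::time::{SystemTime, UNIX_EPOCH};
    
    /// Current time as seconds since the Unix epoch.
    pub fn unix_now() -> i64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|elapsed| elapsed.as_secs() as i64)
            .unwrap_or(0)
    }
    
    /// Seconds left until the limit resets, never negative.
    pub fn seconds_until_reset(reset_at: i64, now: i64) -> i64 {
        reset_at.saturating_sub(now).max(0)
    }
    ```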
  • Generate JSON schema for app-server protocol (#5063)
    Add annotations and an export script that let us generate app-server
    protocol types as typescript and JSONSchema.
    
    The script itself is a bit hacky because we need to manually label some
    of the types. Unfortunately it seems that enum variants don't get good
    names by default and end up with something like `EventMsg1`,
    `EventMsg2`, etc. I'm not an expert in this by any means, but since this
    is only run manually and we already need to enumerate the types required
    to describe the protocol, it didn't seem that much worse. An ideal
    solution here would be to have some kind of root that we could generate
    schemas for in one go, but I'm not sure if that's compatible with how we
    generate the protocol today.
  • Auto compact at ~90% (#5292)
    Users currently hit a context-window-exceeded error and usually don't
    know what to do. This starts auto compact at ~90% of the window.
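    
    The threshold check amounts to a simple comparison. In this sketch
    `should_auto_compact` is a hypothetical name, and treating exactly 90%
    as over the threshold is an assumption:
    
    ```rust
    /// True once the context holds at least 90% of the window.
    /// Integer math avoids floating-point rounding at the boundary.
    pub fn should_auto_compact(tokens_in_context: u64, context_window: u64) -> bool {
        tokens_in_context.saturating_mul(10) >= context_window.saturating_mul(9)
    }
    ```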
  • Add forced_chatgpt_workspace_id and forced_login_method configuration options (#5303)
    This PR adds support for configs to specify a forced login method
    (chatgpt or api) as well as a forced chatgpt account id. This lets
    enterprises use [managed
    configs](https://developers.openai.com/codex/security#managed-configuration)
    to force all employees to use their company's workspace instead of their
    own or any other.
    
    When a workspace id is set, a query param is sent to the login flow
    which auto-selects the given workspace or errors if the user isn't a
    member of it.
    
    This PR is large but a large % of it is tests, wiring, and required
    formatting changes.
    
    API login with chatgpt forced
    <img width="1592" height="116" alt="CleanShot 2025-10-19 at 22 40 04"
    src="https://github.com/user-attachments/assets/560c6bb4-a20a-4a37-95af-93df39d057dd"
    />
    
    ChatGPT login with api forced
    <img width="1018" height="100" alt="CleanShot 2025-10-19 at 22 40 29"
    src="https://github.com/user-attachments/assets/d010bbbb-9c8d-4227-9eda-e55bf043b4af"
    />
    
    Onboarding with api forced
    <img width="892" height="460" alt="CleanShot 2025-10-19 at 22 41 02"
    src="https://github.com/user-attachments/assets/cc0ed45c-b257-4d62-a32e-6ca7514b5edd"
    />
    
    Onboarding with ChatGPT forced
    <img width="1154" height="426" alt="CleanShot 2025-10-19 at 22 41 27"
    src="https://github.com/user-attachments/assets/41c41417-dc68-4bb4-b3e7-3b7769f7e6a1"
    />
    
    Logging in with the wrong workspace
    <img width="2222" height="84" alt="CleanShot 2025-10-19 at 22 42 31"
    src="https://github.com/user-attachments/assets/0ff4222c-f626-4dd3-b035-0b7fe998a046"
    />
  • fix: switch rate limit reset handling to timestamps (#5304)
    This change ensures that we store the absolute time, instead of a
    relative offset, at which the primary and secondary rate limits will
    reset. Previously these got recalculated relative to the current time,
    which caused the displayed reset times to change over time, including
    after doing a codex resume.
    
    For previously changed sessions, this will cause the reset times to not
    show due to this being a breaking change:
    <img width="524" height="55" alt="Screenshot 2025-10-17 at 5 14 18 PM"
    src="https://github.com/user-attachments/assets/53ebd43e-da25-4fef-9c47-94a529d40265"
    />
    
    Fixes https://github.com/openai/codex/issues/4761
  • feat: add path field to ParsedCommand::Read variant (#5275)
    `ParsedCommand::Read` has a `name` field that attempts to identify the
    name of the file being read, but the file may not be in the `cwd` in
    which the command is invoked as demonstrated by this existing unit test:
    
    
    https://github.com/openai/codex/blob/0139f6780c850d87bb37bbb3a11e763d5dc3b50d/codex-rs/core/src/parse_command.rs#L250-L260
    
    As you can see, `tui/Cargo.toml` is the relative path to the file being
    read.
    
    This PR introduces a new `path: PathBuf` field to `ParsedCommand::Read`
    that attempts to capture this information. When possible, this is an
    absolute path, though when relative, it should be resolved against the
    `cwd` that will be used to run the command to derive the absolute path.
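    
    For clients, resolving a relative `path` against the command's `cwd`
    could look like the sketch below. `resolve_read_path` is a hypothetical
    helper; it relies on `Path::join`, which returns the second path
    unchanged when that path is already absolute:
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Turn the `path` reported for a read into an absolute path by
    /// resolving it against the directory the command runs in.
    pub fn resolve_read_path(cwd: &Path, path: &Path) -> PathBuf {
        cwd.join(path)
    }
    ```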
    
    This should make it easier for clients to provide UI for a "read file"
    event that corresponds to the command execution.