Commit Graph

237 Commits

  • feat: add support for building with Bazel (#8875)
    This PR configures Codex CLI so it can be built with
    [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
    includes configuration so that remote builds can be done using
    [BuildBuddy](https://www.buildbuddy.io).
    
    If you are familiar with Bazel, things should work as you expect, e.g.,
    run `bazel test //... --keep-going` to run all the tests in the repo,
    but we have also added some new aliases in the `justfile` for
    convenience:
    
    - `just bazel-test` to run tests locally
    - `just bazel-remote-test` to run tests remotely (currently, the remote
    build is for x86_64 Linux regardless of your host platform). Note we are
    currently seeing the following test failures in the remote build, so we
    still need to figure out what is happening here:
    
    ```
    failures:
        suite::compact::manual_compact_twice_preserves_latest_user_messages
        suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
        suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
    ```
    
    - `just build-for-release` to build release binaries for all
    platforms/architectures remotely
    
    To setup remote execution:
    - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
    employees should also request org access at
    https://openai.buildbuddy.io/join/ with their `@openai.com` email
    address.)
    - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
    `~/.bazelrc` (add the line `build
    --remote_header=x-buildbuddy-api-key=YOUR_KEY`)
    - Use `--config=remote` in your `bazel` invocations (or add `common
    --config=remote` to your `~/.bazelrc`, or use the `just` commands)
    
    ## CI
    
    In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
    uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
    (we are working on supporting Windows, but that is not ready yet). Note
    that the failures we are seeing in `just bazel-remote-test` do not occur
    on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
    is green right now.
    
    The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
    that macOS CI jobs build _remotely_ on Linux hosts (using the
    `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
    root `BUILD.bazel`) using cross-compilation to build the macOS
    artifacts. Then these artifacts are downloaded locally to GitHub's macOS
    runner so the tests can be executed natively. This is the relevant
    config that enables this:
    
    ```
    common:macos --config=remote
    common:macos --strategy=remote
    common:macos --strategy=TestRunner=darwin-sandbox,local
    ```
    
    Because of the remote caching benefits we get from BuildBuddy, these new
    CI jobs can be extremely fast! For example, consider these two jobs that
    ran all the tests on Linux x86_64:
    
    - Bazel 1m37s
    https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
    - Cargo 9m20s
    https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
    
    For now, we will continue to run both the Bazel and Cargo jobs for PRs,
    but once we add support for Windows and running Clippy, we should be
    able to cutover to using Bazel exclusively for PRs, which should still
    speed things up considerably. We will probably continue to run the Cargo
    jobs post-merge for commits that land on `main` as a sanity check.
    
    Release builds will also continue to be done by Cargo for now.
    
    Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
    Earlier attempt to add support for Buck2, now abandoned:
    https://github.com/openai/codex/pull/8504
    
    ---------
    
    Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • fix(app-server): set originator header from initialize JSON-RPC request (#8873)
    **Motivation**
    The `originator` header is important for codex-backend’s Responses API
    proxy because it identifies the real end client (codex cli, codex vscode
    extension, codex exec, future IDEs) and is used to categorize requests
    by client for our enterprise compliance API.
    
    Today the `originator` header is set by either:
    - the `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var (our VSCode extension
    does this)
    - calling `set_default_originator()` which sets a global immutable
    singleton (`codex exec` does this)
    
    For `codex app-server`, we want the `initialize` JSON-RPC request to set
    that header because it is a natural place to do so. Example:
    ```json
    {
      "method": "initialize",
      "id": 0,
      "params": {
        "clientInfo": {
          "name": "codex_vscode",
          "title": "Codex VS Code Extension",
          "version": "0.1.0"
        }
      }
    }
    ```
    and when app-server receives that request, it can call
    `set_default_originator()`. This is a much more natural interface than
    asking third party developers to set an env var.
    
    One hiccup is that `originator()` reads the global singleton and locks
    in the value, preventing a later `set_default_originator()` call from
    setting it. This would be fine but is brittle, since any codepath that
    calls `originator()` before app-server can process an `initialize`
    JSON-RPC call would prevent app-server from setting it. This was
    actually the case with OTEL initialization which runs on boot, but I
    also saw this behavior in certain tests.
    
    Instead, what we now do is:
    - [unchanged] If `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var is set,
    `originator()` would return that value and `set_default_originator()`
    with some other value does NOT override it.
    - [new] If no env var is set, `originator()` would return the default
    value which is `codex_cli_rs` UNTIL `set_default_originator()` is called
    once, in which case it is set to the new value and becomes immutable.
    Later calls to `set_default_originator()` returns
    `SetOriginatorError::AlreadyInitialized`.
    
    **Other notes**
    - I updated `codex_core::otel_init::build_provider` to accepts a service
    name override, and app-server sends a hardcoded `codex_app_server`
    service name to distinguish it from `codex_cli_rs` used by default (e.g.
    TUI).
    
    **Next steps**
    - Update VSCE to set the proper value for `clientInfo.name` on
    `initialize` and drop the `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var.
    - Delete support for `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` in codex-rs.
  • Immutable CodexAuth (#8857)
    Historically we started with a CodexAuth that knew how to refresh it's
    own tokens and then added AuthManager that did a different kind of
    refresh (re-reading from disk).
    
    I don't think it makes sense for both `CodexAuth` and `AuthManager` to
    be mutable and contain behaviors.
    
    Move all refresh logic into `AuthManager` and keep `CodexAuth` as a data
    object.
  • feat: introduce find_resource! macro that works with Cargo or Bazel (#8879)
    To support Bazelification in https://github.com/openai/codex/pull/8875,
    this PR introduces a new `find_resource!` macro that we use in place of
    our existing logic in tests that looks for resources relative to the
    compile-time `CARGO_MANIFEST_DIR` env var.
    
    To make this work, we plan to add the following to all `rust_library()`
    and `rust_test()` Bazel rules in the project:
    
    ```
    rustc_env = {
        "BAZEL_PACKAGE": native.package_name(),
    },
    ```
    
    Our new `find_resource!` macro reads this value via
    `option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code
    using `find_resource!`_ is injected into the code expanded from the
    macro. (If `find_resource()` were a function, then
    `option_env!("BAZEL_PACKAGE")` would always be
    `codex-rs/utils/cargo-bin`, which is not what we want.)
    
    Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR`
    environment variable is set at runtime, indicating that the test is
    being run by Bazel. In this case, we have to concatenate the runtime
    `RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the
    path to the resource.
    
    In testing this change, I discovered one funky edge case in
    `codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_
    (but not canonicalize!) the result from `find_resource!` because the
    path contains a `common/..` component that does not exist on disk when
    the test is run under Bazel, so it must be semantically normalized using
    the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate
    before it is passed to `dotslash fetch`.
    
    Because this new behavior may be non-obvious, this PR also updates
    `AGENTS.md` to make humans/Codex aware that this API is preferred.
  • chore: unify conversation with thread name (#8830)
    Done and verified by Codex + refactor feature of RustRover
  • feat(app-server): thread/rollback API (#8454)
    Add `thread/rollback` to app-server to support IDEs undo-ing the last N
    turns of a thread.
    
    For context, an IDE partner will be supporting an "undo" capability
    where the IDE (the app-server client) will be responsible for reverting
    the local changes made during the last turn. To support this well, we
    also need a way to drop the last turn (or more generally, the last N
    turns) from the agent's context. This is what `thread/rollback` does.
    
    **Core idea**: A Thread rollback is represented as a persisted event
    message (EventMsg::ThreadRollback) in the rollout JSONL file, not by
    rewriting history. On resume, both the model's context (core replay) and
    the UI turn list (app-server v2's thread history builder) apply these
    markers so the pruned history is consistent across live conversations
    and `thread/resume`.
    
    Implementation notes:
    - Rollback only affects agent context and appends to the rollout file;
    clients are responsible for reverting files on disk.
    - If a thread rollback is currently in progress, subsequent
    `thread/rollback` calls are rejected.
    - Because we use `CodexConversation::submit` and codex core tracks
    active turns, returning an error on concurrent rollbacks is communicated
    via an `EventMsg::Error` with a new variant
    `CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and
    sends the BAD_REQUEST RPC response.
    
    Tests cover thread rollbacks in both core and app-server, including when
    `num_turns` > existing turns (which clears all turns).
    
    **Note**: this explicitly does **not** behave like `/undo` which we just
    removed from the CLI, which does the opposite of what `thread/rollback`
    does. `/undo` reverts local changes via ghost commits/snapshots and does
    not modify the agent's context / conversation history.
  • Allow global exec flags after resume and fix CI codex build/timeout (#8440)
    **Motivation**
    - Bring `codex exec resume` to parity with top‑level flags so global
    options (git check bypass, json, model, sandbox toggles) work after the
    subcommand, including when outside a git repo.
    
    **Description**
    - Exec CLI: mark `--skip-git-repo-check`, `--json`, `--model`,
    `--full-auto`, and `--dangerously-bypass-approvals-and-sandbox` as
    global so they’re accepted after `resume`.
    - Tests: add `exec_resume_accepts_global_flags_after_subcommand` to
    verify those flags work when passed after `resume`.
    
    **Testing**
    - `just fmt`
    - `cargo test -p codex-exec` (pass; ran with elevated perms to allow
    network/port binds)
    - Manual: exercised `codex exec resume` with global flags after the
    subcommand to confirm behavior.
  • [chore] add additional_details to StreamErrorEvent + wire through (#8307)
    ### What
    
    Builds on #8293.
    
    Add `additional_details`, which contains the upstream error message, to
    relevant structures used to pass along retryable `StreamError`s.
    
    Uses the new TUI status indicator's `details` field (shows under the
    status header) to display the `additional_details` error to the user on
    retryable `Reconnecting...` errors. This adds clarity for users for
    retryable errors.
    
    Will make corresponding change to VSCode extension to show
    `additional_details` as expandable from the `Reconnecting...` cell.
    
    Examples:
    <img width="1012" height="326" alt="image"
    src="https://github.com/user-attachments/assets/f35e7e6a-8f5e-4a2f-a764-358101776996"
    />
    
    <img width="1526" height="358" alt="image"
    src="https://github.com/user-attachments/assets/0029cbc0-f062-4233-8650-cc216c7808f0"
    />
  • feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496)
    This PR introduces a `codex-utils-cargo-bin` utility crate that
    wraps/replaces our use of `assert_cmd::Command` and
    `escargot::CargoBuild`.
    
    As you can infer from the introduction of `buck_project_root()` in this
    PR, I am attempting to make it possible to build Codex under
    [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
    achieve faster incremental local builds (largely due to Buck2's
    [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
    build strategy, as well as benefits from its local build daemon) as well
    as faster CI builds if we invest in remote execution and caching.
    
    See
    https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
    for more details about the performance advantages of Buck2.
    
    Buck2 enforces stronger requirements in terms of build and test
    isolation. It discourages assumptions about absolute paths (which is key
    to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
    variables that Cargo provides are absolute paths (which
    `assert_cmd::Command` reads), this is a problem for Buck2, which is why
    we need this `codex-utils-cargo-bin` utility.
    
    My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
    passed to a `rust_test()` build rule as relative paths.
    `codex-utils-cargo-bin` will resolve these values to absolute paths,
    when necessary.
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
    * #8498
    * __->__ #8496
  • chore: enusre the logic that creates ConfigLayerStack has access to cwd (#8353)
    `load_config_layers_state()` should load config from a
    `.codex/config.toml` in any folder between the `cwd` for a thread and
    the project root. Though in order to do that,
    `load_config_layers_state()` needs to know what the `cwd` is, so this PR
    does the work to thread the `cwd` through for existing callsites.
    
    A notable exception is the `/config` endpoint in app server for which a
    `cwd` is not guaranteed to be associated with the query, so the `cwd`
    param is `Option<AbsolutePathBuf>` to account for this case.
    
    The logic to make use of the `cwd` will be done in a follow-up PR.
  • feat: support allowed_sandbox_modes in requirements.toml (#8298)
    This adds support for `allowed_sandbox_modes` in `requirements.toml` and
    provides legacy support for constraining sandbox modes in
    `managed_config.toml`. This is converted to `Constrained<SandboxPolicy>`
    in `ConfigRequirements` and applied to `Config` such that constraints
    are enforced throughout the harness.
    
    Note that, because `managed_config.toml` is deprecated, we do not add
    support for the new `external-sandbox` variant recently introduced in
    https://github.com/openai/codex/pull/8290. As noted, that variant is not
    supported in `config.toml` today, but can be configured programmatically
    via app server.
  • chore: cleanup Config instantiation codepaths (#8226)
    This PR does various types of cleanup before I can proceed with more
    ambitious changes to config loading.
    
    First, I noticed duplicated code across these two methods:
    
    
    https://github.com/openai/codex/blob/774bd9e432fa2e0f4e059e97648cf92216912e19/codex-rs/core/src/config/mod.rs#L314-L324
    
    
    https://github.com/openai/codex/blob/774bd9e432fa2e0f4e059e97648cf92216912e19/codex-rs/core/src/config/mod.rs#L334-L344
    
    This has now been consolidated in
    `load_config_as_toml_with_cli_overrides()`.
    
    Further, I noticed that `Config::load_with_cli_overrides()` took two
    similar arguments:
    
    
    https://github.com/openai/codex/blob/774bd9e432fa2e0f4e059e97648cf92216912e19/codex-rs/core/src/config/mod.rs#L308-L311
    
    The difference between `cli_overrides` and `overrides` was not
    immediately obvious to me. At first glance, it appears that one should
    be able to be expressed in terms of the other, but it turns out that
    some fields of `ConfigOverrides` (such as `cwd` and
    `codex_linux_sandbox_exe`) are, by design, not configurable via a
    `.toml` file or a command-line `--config` flag.
    
    That said, I discovered that many callers of
    `Config::load_with_cli_overrides()` were passing
    `ConfigOverrides::default()` for `overrides`, so I created two separate
    methods:
    
    - `Config::load_with_cli_overrides(cli_overrides: Vec<(String,
    TomlValue)>)`
    - `Config::load_with_cli_overrides_and_harness_overrides(cli_overrides:
    Vec<(String, TomlValue)>, harness_overrides: ConfigOverrides)`
    
    The latter has a long name, as it is _not_ what should be used in the
    common case, so the extra typing is designed to draw attention to this
    fact. I tried to update the existing callsites to use the shorter name,
    where possible.
    
    Further, in the cases where `ConfigOverrides` is used, usually only a
    limited subset of fields are actually set, so I updated the declarations
    to leverage `..Default::default()` where possible.
  • feat: Constrain values for approval_policy (#7778)
    Constrain `approval_policy` through new `admin_policy` config.
    
    This PR will:
    1. Add a `admin_policy` section to config, with a single field (for now)
    `allowed_approval_policies`. This list constrains the set of
    user-settable `approval_policy`s.
    2. Introduce a new `Constrained<T>` type, which combines a current value
    and a validator function. The validator function ensures disallowed
    values are not set.
    3. Change the type of `approval_policy` on `Config` and
    `SessionConfiguration` from `AskForApproval` to
    `Constrained<AskForApproval>`. The validator function is set by the
    values passed into `allowed_approval_policies`.
    4. `GenericDisplayRow`: add a `disabled_reason: Option<String>`. When
    set, it disables selection of the value and indicates as such in the
    menu. This also makes it unselectable with arrow keys or numbers. This
    is used in the `/approvals` menu.
    
    Follow ups are:
    1. Do the same thing to `sandbox_policy`.
    2. Propagate the allowed set of values through app-server for the
    extension (though already this should prevent app-server from setting
    this values, it's just that we want to disable UI elements that are
    unsettable).
    
    Happy to split this PR up if you prefer, into the logical numbered areas
    above. Especially if there are parts we want to gavel on separately
    (e.g. admin_policy).
    
    Disabled full access:
    <img width="1680" height="380" alt="image"
    src="https://github.com/user-attachments/assets/1fb61c8c-1fcb-4dc4-8355-2293edb52ba0"
    />
    
    Disabled `--yolo` on startup:
    <img width="749" height="76" alt="image"
    src="https://github.com/user-attachments/assets/0a1211a0-6eb1-40d6-a1d7-439c41e94ddb"
    />
    
    CODEX-4087
  • feat: change ConfigLayerName into a disjoint union rather than a simple enum (#8095)
    This attempts to tighten up the types related to "config layers."
    Currently, `ConfigLayerEntry` is defined as follows:
    
    
    https://github.com/openai/codex/blob/bef36f4ae765f471d7cd69372fcf1b92c8f0367a/codex-rs/core/src/config_loader/state.rs#L19-L25
    
    but the `source` field is a bit of a lie, as:
    
    - for `ConfigLayerName::Mdm`, it is
    `"com.openai.codex/config_toml_base64"`
    - for `ConfigLayerName::SessionFlags`, it is `"--config"`
    - for `ConfigLayerName::User`, it is `"config.toml"` (just the file
    name, not the path to the `config.toml` on disk that was read)
    - for `ConfigLayerName::System`, it seems like it is usually
    `/etc/codex/managed_config.toml` in practice, though on Windows, it is
    `%CODEX_HOME%/managed_config.toml`:
    
    
    https://github.com/openai/codex/blob/bef36f4ae765f471d7cd69372fcf1b92c8f0367a/codex-rs/core/src/config_loader/layer_io.rs#L84-L101
    
    All that is to say, in three out of the four `ConfigLayerName`, `source`
    is a `PathBuf` that is not an absolute path (or even a true path).
    
    This PR tries to uplevel things by eliminating `source` from
    `ConfigLayerEntry` and turning `ConfigLayerName` into a disjoint union
    named `ConfigLayerSource` that has the appropriate metadata for each
    variant, favoring the use of `AbsolutePathBuf` where appropriate:
    
    ```rust
    pub enum ConfigLayerSource {
        /// Managed preferences layer delivered by MDM (macOS only).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        Mdm { domain: String, key: String },
        /// Managed config layer from a file (usually `managed_config.toml`).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        System { file: AbsolutePathBuf },
        /// Session-layer overrides supplied via `-c`/`--config`.
        SessionFlags,
        /// User config layer from a file (usually `config.toml`).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        User { file: AbsolutePathBuf },
    }
    ```
  • Add public skills + improve repo skill discovery and error UX (#8098)
    1. Adds SkillScope::Public end-to-end (core + protocol) and loads skills
    from the public cache directory
    2. Improves repo skill discovery by searching upward for the nearest
    .codex/skills within a git repo
    3. Deduplicates skills by name with deterministic ordering to avoid
    duplicates across sources
    4. Fixes garbled “Skill errors” overlay rendering by preventing pending
    history lines from being injected during the modal
    5. Updates the project docs “Skills” intro wording to avoid hardcoded
    paths
  • Reimplement skills loading using SkillsManager + skills/list op. (#7914)
    refactor the way we load and manage skills:
    1. Move skill discovery/caching into SkillsManager and reuse it across
    sessions.
    2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch
    skills for one or more cwds. Also update app-server for VSCE/App;
    3. Trigger skills/list during session startup so UIs preload skills and
    handle errors immediately.
  • fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
    Changes the `writable_roots` field of the `WorkspaceWrite` variant of
    the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
    This is helpful because now callers can be sure the value is an absolute
    path rather than a relative one. (Though when using an absolute path in
    a Seatbelt config policy, we still have to _canonicalize_ it first.)
    
    Because `writable_roots` can be read from a config file, it is important
    that we are able to resolve relative paths properly using the parent
    folder of the config file as the base path.
  • Inject SKILL.md when it's explicitly mentioned. (#7763)
    1. Skills load once in core at session start; the cached outcome is
    reused across core and surfaced to TUI via SessionConfigured.
    2. TUI detects explicit skill selections, and core injects the matching
    SKILL.md content into the turn when a selected skill is present.
  • make model optional in config (#7769)
    - Make Config.model optional and centralize default-selection logic in
    ModelsManager, including a default_model helper (with
    codex-auto-balanced when available) so sessions now carry an explicit
    chosen model separate from the base config.
    - Resolve `model` once in `core` and `tui` from config. Then store the
    state of it on other structs.
    - Move refreshing models to be before resolving the default model
  • Removed experimental "command risk assessment" feature (#7799)
    This experimental feature received lukewarm reception during internal
    testing. Removing from the code base.
  • Migrate model preset (#7542)
    - Introduce `openai_models` in `/core`
    - Move `PRESETS` under it
    - Move `ModelPreset`, `ModelUpgrade`, `ReasoningEffortPreset`,
    `ReasoningEffortPreset`, and `ReasoningEffortPreset` to `protocol`
    - Introduce `Op::ListModels` and `EventMsg::AvailableModels`
    
    Next steps:
    - migrate `app-server` and `tui` to use the introduced Operation
  • seatbelt: allow openpty() (#7507)
    This allows `openpty(3)` to run in the default sandbox. Also permit
    reading `kern.argmax`, which is the maximum number of arguments to
    exec().
  • chore: add cargo-deny configuration (#7119)
    - add GitHub workflow running cargo-deny on push/PR
    - document cargo-deny allowlist with workspace-dep notes and advisory
    ignores
    - align workspace crates to inherit version/edition/license for
    consistent checks
  • support MCP elicitations (#6947)
    No support for request schema yet, but we'll at least show the message
    and allow accept/decline.
    
    <img width="823" height="551" alt="Screenshot 2025-11-21 at 2 44 05 PM"
    src="https://github.com/user-attachments/assets/6fbb892d-ca12-4765-921e-9ac4b217534d"
    />
  • [app-server] feat: add Declined status for command exec (#7101)
    Add a `Declined` status for when we request an approval from the user
    and the user declines. This allows us to distinguish from commands that
    actually ran, but failed.
    
    This behaves similarly to apply_patch / FileChange, which does the same
    thing.
  • [app-server] update doc with codex error info (#6941)
    Document new codex error info. Also fixed the name from
    `codex_error_code` to `codex_error_info`.
  • [app-server & core] introduce new codex error code and v2 app-server error events (#6938)
    This PR does two things:
    1. populate a new `codex_error_code` protocol in error events sent from
    core to client;
    2. old v1 core events `codex/event/stream_error` and `codex/event/error`
    will now both become `error`. We also show codex error code for
    turncompleted -> error status.
    
    new events in app server test:
    ```
    < {
    <   "method": "codex/event/stream_error",
    <   "params": {
    <     "conversationId": "019aa34c-0c14-70e0-9706-98520a760d67",
    <     "id": "0",
    <     "msg": {
    <       "codex_error_code": {
    <         "response_stream_disconnected": {
    <           "http_status_code": 401
    <         }
    <       },
    <       "message": "Reconnecting... 2/5",
    <       "type": "stream_error"
    <     }
    <   }
    < }
    
     {
    <   "method": "error",
    <   "params": {
    <     "error": {
    <       "codexErrorCode": {
    <         "responseStreamDisconnected": {
    <           "httpStatusCode": 401
    <         }
    <       },
    <       "message": "Reconnecting... 2/5"
    <     }
    <   }
    < }
    
    < {
    <   "method": "turn/completed",
    <   "params": {
    <     "turn": {
    <       "error": {
    <         "codexErrorCode": {
    <           "responseTooManyFailedAttempts": {
    <             "httpStatusCode": 401
    <           }
    <         },
    <         "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a1b495a1a97ed3e-SJC"
    <       },
    <       "id": "0",
    <       "items": [],
    <       "status": "failed"
    <     }
    <   }
    < }
    ```
  • codex-exec: allow resume --last to read prompt #6717 (#6719)
    ### Description
    
    - codex exec --json resume --last "<prompt>" bailed out because clap
    treated the prompt as SESSION_ID. I removed the conflicts_with flag and
    reinterpret that positional as a prompt when
    --last is set, so the flow now keeps working in JSON mode.
    (codex-rs/exec/src/cli.rs:84-104, codex-rs/exec/src/lib.rs:75-130)
    - Added a regression test that exercises resume --last in JSON mode to
    ensure the prompt is accepted and the rollout file is updated.
    (codex-rs/exec/tests/suite/resume.rs:126-178)
    
    ### Testing
    
      - just fmt
      - cargo test -p codex-exec
      - just fix -p codex-exec
      - cargo test -p codex-exec
    
    #6717
    
    Signed-off-by: Dmitri Khokhlov <dkhokhlov@cribl.io>
  • [app-server] feat: v2 apply_patch approval flow (#6760)
    This PR adds the API V2 version of the apply_patch approval flow, which
    centers around `ThreadItem::FileChange`.
    
    This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
    and related events (`item/started`, `item/completed` for
    `ThreadItem::FileChange`, which are emitted in both V1 and V2) through
    the app-server
    protocol. The new approval RPC is only sent when the user initiates a
    turn with the new `turn/start` API so we don't break backwards
    compatibility with VSCE.
    
    Similar to https://github.com/openai/codex/pull/6758, the approach I
    took was to make as few changes to the Codex core as possible,
    leveraging existing `EventMsg` core events, and translating those in
    app-server. I did have to add a few additional fields to
    `EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
    were fairly lightweight.
    
    However, the `EventMsg`s emitted by core are the following:
    ```
    1) Auto-approved (no request for approval)

    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    
    2) Approved by user
    - EventMsg::ApplyPatchApprovalRequest
    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    
    3) Declined by user
    - EventMsg::ApplyPatchApprovalRequest
    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    ```
    
    For a request triggering an approval, this would result in:
    ```
    item/fileChange/requestApproval
    item/started
    item/completed
    ```
    
    which is different from the `ThreadItem::CommandExecution` flow
    introduced in https://github.com/openai/codex/pull/6758, which does the
    below and is preferable:
    ```
    item/started
    item/commandExecution/requestApproval
    item/completed
    ```
    
    To fix this, we leverage `TurnSummaryStore` on codex_message_processor
    to store a little bit of state, allowing us to fire `item/started` and
    `item/fileChange/requestApproval` whenever we receive the underlying
    `EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
    `EventMsg::PatchApplyBegin` later.
    
    This is much less invasive than modifying the order of EventMsg within
    core (I tried).
    
    The resulting payloads:
    ```
    {
      "method": "item/started",
      "params": {
        "item": {
          "changes": [
            {
              "diff": "Hello from Codex!\n",
              "kind": "add",
              "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
            }
          ],
          "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
          "status": "inProgress",
          "type": "fileChange"
        }
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "method": "item/fileChange/requestApproval",
      "params": {
        "grantRoot": null,
        "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
        "reason": null,
        "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
        "turnId": "1"
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "result": {
        "decision": "accept"
      }
    }
    ```
    
    ```
    {
      "method": "item/completed",
      "params": {
        "item": {
          "changes": [
            {
              "diff": "Hello from Codex!\n",
              "kind": "add",
              "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
            }
          ],
          "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
          "status": "completed",
          "type": "fileChange"
        }
      }
    }
    ```
  • Revert "[core] add optional status_code to error events (#6865)" (#6955)
    This reverts commit c2ec477d93.
    
    # External (non-OpenAI) Pull Request Requirements
    
    Before opening this Pull Request, please read the dedicated
    "Contributing" markdown file or your PR may be closed:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    If your PR conforms to our contribution guidelines, replace this text
    with a detailed and high quality description of your changes.
    
    Include a link to a bug report or enhancement request.
  • [core] add optional status_code to error events (#6865)
    We want to better uncover error status code for clients. Add an optional
    status_code to error events (thread error, error, stream error) so app
    server could uncover the status code from the client side later.
    
    in event log:
    ```
    < {
    <   "method": "codex/event/stream_error",
    <   "params": {
    <     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
    <     "id": "0",
    <     "msg": {
    <       "message": "Reconnecting... 5/5",
    <       "status_code": 401,
    <       "type": "stream_error"
    <     }
    <   }
    < }
    < {
    <   "method": "codex/event/error",
    <   "params": {
    <     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
    <     "id": "0",
    <     "msg": {
    <       "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a0cb03a485067f7-SJC",
    <       "status_code": 401,
    <       "type": "error"
    <     }
    <   }
    < }
    ```
  • fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
    This adds the following fields to `ThreadStartResponse` and
    `ThreadResumeResponse`:
    
    ```rust
        pub model: String,
        pub model_provider: String,
        pub cwd: PathBuf,
        pub approval_policy: AskForApproval,
        pub sandbox: SandboxPolicy,
        pub reasoning_effort: Option<ReasoningEffort>,
    ```
    
    This is important because these fields are optional in
    `ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
    able to determine what values were ultimately used to start/resume the
    conversation. (Though note that any of these could be changed later
    between turns in the conversation.)
    
    Though to get this information reliably, it must be read from the
    internal `SessionConfiguredEvent` that is created in response to the
    start of a conversation. Because `SessionConfiguredEvent` (as defined in
    `codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
    number of them had to be added as part of this PR.
    
    Because `SessionConfiguredEvent` is referenced in many tests, test
    instances of `SessionConfiguredEvent` had to be updated, as well, which
    is why this PR touches so many files.
  • Update defaults to gpt-5.1 (#6652)
    ## Summary
    - update documentation, example configs, and automation defaults to
    reference gpt-5.1 / gpt-5.1-codex
    - bump the CLI and core configuration defaults, model presets, and error
    messaging to the new models while keeping the model-family/tool coverage
    for legacy slugs
    - refresh tests, fixtures, and TUI snapshots so they expect the upgraded
    defaults
    
    ## Testing
    - `cargo test -p codex-core
    config::tests::test_precedence_fixture_with_gpt5_profile`
    
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)
  • [app-server] feat: add v2 command execution approval flow (#6758)
    This PR adds the API V2 version of the command‑execution approval flow
    for the shell tool.
    
    This PR wires the new RPC (`item/commandExecution/requestApproval`, V2
    only) and related events (`item/started`, `item/completed`, and
    `item/commandExecution/delta`, which are emitted in both V1 and V2)
    through the app-server
    protocol. The new approval RPC is only sent when the user initiates a
    turn with the new `turn/start` API so we don't break backwards
    compatibility with VSCE.
    
    The approach I took was to make as few changes to the Codex core as
    possible, leveraging existing `EventMsg` core events, and translating
    those in app-server. I did have to add additional fields to
    `EventMsg::ExecCommandEndEvent` to capture the command's input so that
    app-server can statelessly transform these events to a
    `ThreadItem::CommandExecution` item for the `item/completed` event.
    
    Once we stabilize the API and it's complete enough for our partners, we
    can work on migrating the core to be aware of command execution items as
    a first-class concept.
    
    **Note**: We'll need followup work to make sure these APIs work for the
    unified exec tool, but will wait til that's stable and landed before
    doing a pass on app-server.
    
    Example payloads below:
    ```
    {
      "method": "item/started",
      "params": {
        "item": {
          "aggregatedOutput": null,
          "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
          "cwd": "/Users/owen/repos/codex/codex-rs",
          "durationMs": null,
          "exitCode": null,
          "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
          "parsedCmd": [
            {
              "cmd": "touch /tmp/should-trigger-approval",
              "type": "unknown"
            }
          ],
          "status": "inProgress",
          "type": "commandExecution"
        }
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "method": "item/commandExecution/requestApproval",
      "params": {
        "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU",
        "parsedCmd": [
          {
            "cmd": "touch /tmp/should-trigger-approval",
            "type": "unknown"
          }
        ],
        "reason": "Need to create file in /tmp which is outside workspace sandbox",
        "risk": null,
        "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a",
        "turnId": "1"
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "result": {
        "acceptSettings": {
          "forSession": false
        },
        "decision": "accept"
      }
    }
    ```
    
    ```
    {
      "params": {
        "item": {
          "aggregatedOutput": null,
          "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
          "cwd": "/Users/owen/repos/codex/codex-rs",
          "durationMs": 224,
          "exitCode": 0,
          "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
          "parsedCmd": [
            {
              "cmd": "touch /tmp/should-trigger-approval",
              "type": "unknown"
            }
          ],
          "status": "completed",
          "type": "commandExecution"
        }
      }
    }
    ```
  • LM Studio OSS Support (#2312)
    ## Overview
    
    Adds LM Studio OSS support. Closes #1883
    
    
    ### Changes
    This PR enhances the behavior of `--oss` flag to support LM Studio as a
    provider. Additionally, it introduces a new flag`--local-provider` which
    can take in `lmstudio` or `ollama` as values if the user wants to
    explicitly choose which one to use.
    
    If no provider is specified `codex --oss` will auto-select the provider
    based on whichever is running.
    
    #### Additional enhancements 
    The default can be set using `oss-provider` in config like:
    
    ```
    oss_provider = "lmstudio"
    ```
    
    For non-interactive users, they will need to either provide the provider
    as an arg or have it in their `config.toml`
    
    ### Notes
    For best performance, [set the default context
    length](https://lmstudio.ai/docs/app/advanced/per-model) for gpt-oss to
    the maximum your machine can support
    
    ---------
    
    Co-authored-by: Matt Clayton <matt@lmstudio.ai>
    Co-authored-by: Eric Traut <etraut@openai.com>
  • core/tui: non-blocking MCP startup (#6334)
    This makes MCP startup not block TUI startup. Messages sent while MCPs
    are booting will be queued.
    
    
    https://github.com/user-attachments/assets/96e1d234-5d8f-4932-a935-a675d35c05e0
    
    
    Fixes #6317
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • feat: better UI for unified_exec (#6515)
    <img width="376" height="132" alt="Screenshot 2025-11-12 at 17 36 22"
    src="https://github.com/user-attachments/assets/ce693f0d-5ca0-462e-b170-c20811dcc8d5"
    />
  • feat: Add support for --add-dir to exec and TypeScript SDK (#6565)
    ## Summary
    
    Adds support for specifying additional directories in the TypeScript SDK
    through a new `additionalDirectories` option in `ThreadOptions`.
    
    ## Changes
    
    - Added `additionalDirectories` parameter to `ThreadOptions` interface
    - Updated `CodexExec` to accept and pass through additional directories
    via the `--config` flag for `sandbox_workspace_write.writable_roots`
    - Added comprehensive test coverage for the new functionality
    
    ## Test plan
    
    - Added test case that verifies `additionalDirectories` is correctly
    passed as repeated flags
    - Existing tests continue to pass
    
    ---------
    
    Co-authored-by: Claude <noreply@anthropic.com>