Commit Graph

4176 Commits

  • feat: pres artifact part 5 (#13355)
    Mostly written by Codex
  • feat: presentation artifact p1 (#13341)
    Part 1 of presentation tool artifact
  • app-server service tier plumbing (plus some cleanup) (#13334)
    Follow-up to https://github.com/openai/codex/pull/13212 that exposes fast
    tier controls to the app server.
    (Most of this PR is generated schema JSON; the actual code change is +69 /
    -35, plus +24 in tests.)
    
    - add service tier fields to the app-server protocol surfaces used by
    thread lifecycle, turn start, config, and session configured events
    - thread service tier through the app-server message processor and core
    thread config snapshots
    - allow runtime config overrides to carry service tier for app-server
    callers
    
    cleanup:
    - Remove the "legacy" code supporting "standard": we moved to
    `None | "fast"`, so "standard" is no longer needed.
  • fix: agent when profile (#13235)
    Co-authored-by: Josh McKinney <joshka@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • fix(core): scope file search gitignore to repository context (#13250)
    Closes #3493
    
    ## Problem
    
    When a user's home directory (or any ancestor) contains a broad
    `.gitignore` (e.g. `*` + `!.gitignore`), the `@` file mention picker in
    Codex silently hides valid repository files like `package.json`. The
    picker returns `no matches` for searches that should succeed. This is
    surprising because manually typed paths still work, making the failure
    hard to diagnose.
    
    ## Mental model
    
    Git itself never walks above the repository root to assemble its ignore
    list. Its `.gitignore` resolution is strictly scoped: it reads
    `.gitignore` files from the repo root downward, the per-repo
    `.git/info/exclude`, and the user's global excludes file (via
    `core.excludesFile`). A `.gitignore` sitting in a parent directory above
    the repo root has no effect on `git status`, `git ls-files`, or any
    other git operation. Our file search should replicate this contract
    exactly.
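    
    The scoping rule above can be sketched with plain path arithmetic. This
    is a minimal model under stated assumptions, not git's or the `ignore`
    crate's implementation: `gitignore_dirs` is a hypothetical helper, and it
    leaves out `.git/info/exclude` and the global excludes file.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Directories whose `.gitignore` git would consult for `file`, ordered
    /// from the repository root downward. Hypothetical helper for illustration.
    /// Returns an empty list when `file` is not inside `repo_root`.
    fn gitignore_dirs(repo_root: &Path, file: &Path) -> Vec<PathBuf> {
        let Ok(relative) = file.strip_prefix(repo_root) else {
            return Vec::new();
        };
        // The walk starts at the repo root and never goes above it.
        let mut dirs = vec![repo_root.to_path_buf()];
        let mut current = repo_root.to_path_buf();
        if let Some(parent) = relative.parent() {
            for component in parent.components() {
                current.push(component);
                dirs.push(current.clone());
            }
        }
        dirs
    }
    ```
    
    For a repo at `/home/u/repo`, a file at `src/lib/a.rs` yields the repo
    root, `src`, and `src/lib`. `/home/u` never appears, so a broad
    `~/.gitignore` cannot hide the file.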
    
    The `ignore` crate's `WalkBuilder` has a `require_git` flag that
    controls whether it follows this contract:
    
    - `require_git(false)` (the previous setting): the walker reads
    `.gitignore` files from _all_ ancestor directories, even those above or
    outside the repository root. This is a deliberate divergence from git's
    behavior in the `ignore` crate, intended for non-git use cases. It means
    a `~/.gitignore` with `*` will suppress every file in the walk—something
    git itself would never do.
    
    - `require_git(true)` (this fix): the walker only applies `.gitignore`
    semantics when it detects a `.git` directory, scoping ignore resolution
    to the repository boundary. This matches git's own behavior: parent
    `.gitignore` files above the repo root have no effect.
    
    The fix is a one-line change: `require_git(false)` becomes
    `require_git(true)`.
    
    ## How `require_git(false)` got here
    
    The setting was introduced in af338cc (#2981, "Improve @ file search:
    include specific hidden dirs such as .github, .gitlab"). That PR's goal
    was to make hidden directories like `.github` and `.vscode` discoverable
    by setting `.hidden(false)` on the walker. The `require_git(false)` was
    added alongside it with the comment _"Don't require git to be present to
    apply git-related ignore rules"_—the author likely intended gitignore
    rules to still filter results even when no `.git` directory exists (e.g.
    searching an extracted tarball that has a `.gitignore` but no `.git`).
    
    The unintended consequence: with `require_git(false)`, the `ignore`
    crate walks _above_ the search root to find `.gitignore` files in
    ancestor directories. This is a side effect the original author almost
    certainly didn't anticipate. The PR message says "Preserve `.gitignore`
    semantics," but `require_git(false)` actually _breaks_ git's semantics
    by applying ancestor ignore files that git itself would never read.
    
    In short: the intent was "apply gitignore even without `.git`" but the
    effect was "apply gitignore from every ancestor directory." This fix
    restores git-correct scoping.
    
    ## Non-goals
    
    - This PR does not change behavior when `respect_gitignore` is `false`
    (that path already disables all git-related ignore rules).
    - The first test
    (`parent_gitignore_outside_repo_does_not_hide_repo_files`) intentionally
    omits `git init`. The `ignore` crate's `require_git(true)` causes it to
    skip gitignore processing entirely when no `.git` exists, which is the
    desired behavior for that scenario. A second test
    (`git_repo_still_respects_local_gitignore_when_enabled`) covers the
    complementary case with a real git repo.
    
    ## Tradeoffs
    
    **Behavioral shift**: With `require_git(true)`, directories that contain
    `.gitignore` files but are _not_ inside a git repository will no longer
    have those ignore rules applied during `@` search. This is a correctness
    improvement for the primary use case (searching inside repos), but
    changes behavior for the edge case of searching non-repo directories
    that happen to have `.gitignore` files. In practice, Codex is
    overwhelmingly used inside git repositories, so this tradeoff strongly
    favors the fix.
    
    **Two test strategies**: The first test omits `git init` to verify
    parent ignore leakage is blocked; the second runs `git init` to verify
    the repo's own `.gitignore` is still honored. Together they cover both
    sides of the `require_git(true)` contract.
    
    ## Architecture
    
    The change is in `walker_worker()` within
    `codex-rs/file-search/src/lib.rs`, which configures the
    `ignore::WalkBuilder` used by the file search walker thread. The walker
    feeds discovered file paths into `nucleo` for fuzzy matching. The
    `require_git` flag controls whether the walker consults `.gitignore`
    files at all—it sits upstream of all ignore processing.
    
    ```
    walker_worker
      └─ WalkBuilder::new(root)
           ├─ .hidden(false)         — include dotfiles
           ├─ .follow_links(true)    — follow symlinks
           ├─ .require_git(true)     — ← THE FIX: only apply gitignore in git repos
           └─ (conditional) git_ignore(false), git_global(false), etc.
                └─ applied when respect_gitignore == false
    ```
    
    ## Tests
    
    - `parent_gitignore_outside_repo_does_not_hide_repo_files`: creates a
    temp directory tree with a parent `.gitignore` containing `*`, a child
    "repo" directory with `package.json` and `.vscode/settings.json`, and
    asserts that both files are discoverable via `run()` with
    `respect_gitignore: true`.
    - `git_repo_still_respects_local_gitignore_when_enabled`: the
    complementary test—runs `git init` inside the child directory and
    verifies that the repo's own `.gitignore` exclusions still work (e.g.
    `.vscode/extensions.json` is excluded while `.vscode/settings.json` is
    whitelisted). Confirms that `require_git(true)` does not disable
    gitignore processing inside actual git repositories.
  • add fast mode toggle (#13212)
    - add a local Fast mode setting in codex-core (similar to how model id
    is currently stored on disk locally)
    - send `service_tier=priority` on requests when Fast is enabled
    - add `/fast` in the TUI and persist it locally
    - feature flag
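    
    A minimal sketch of the setting-to-request mapping described above. The
    `ServiceTier` enum and `service_tier_param` function are hypothetical
    names, not the actual codex-core types. Only the `priority` value comes
    from this description.
    
    ```rust
    /// Hypothetical local setting: `None` means the default tier.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum ServiceTier {
        Fast,
    }
    
    /// Value to send as `service_tier` on a request, if any.
    fn service_tier_param(tier: Option<ServiceTier>) -> Option<&'static str> {
        match tier {
            // Fast mode is sent as the `priority` service tier.
            Some(ServiceTier::Fast) => Some("priority"),
            // With Fast disabled, the field is omitted from the request.
            None => None,
        }
    }
    ```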
  • tui: preserve kill buffer across submit and slash-command clears (#12006)
    ## Problem
    
    Before this change, composer paths that cleared the textarea after submit
    or slash-command dispatch also cleared the textarea kill buffer. That
    meant a user could `Ctrl+K` part of a draft, trigger a composer action
    that cleared the visible draft, and then lose the ability to `Ctrl+Y` the
    killed text back.
    
    This was especially awkward for workflows where the user wants to
    temporarily remove text, run a composer action such as changing reasoning
    level or dispatching a slash command, and then restore the killed text
    into the now-empty draft.
    
    ## Mental model
    
    This change separates visible draft state from editing-history state.
    
    The visible draft includes the current textarea contents and text elements
    that should be cleared when the composer submits or dispatches a command.
    The kill buffer is different: it represents the most recent killed text
    and should survive those composer-driven clears so the user can still yank
    it back afterward.
    
    After this change, submit and slash-command dispatch still clear the
    visible textarea contents, but they no longer erase the most recent kill.
    
    ## Non-goals
    
    This does not implement a multi-entry kill ring or change the semantics of
    `Ctrl+K` and `Ctrl+Y` beyond preserving the existing yank target across
    these clears.
    
    It also does not change how submit, slash-command parsing, prompt
    expansion, or attachment handling work, except that those flows no longer
    discard the textarea kill buffer as a side effect of clearing the draft.
    
    ## Tradeoffs
    
    The main tradeoff is that clearing the visible textarea is no longer
    equivalent to fully resetting all editing state. That is intentional here,
    because submit and slash-command dispatch are composer actions, not
    requests to forget the user's most recent kill.
    
    The benefit is better editing continuity. The cost is that callers must
    understand that full-buffer replacement resets visible draft state but not
    the kill buffer.
    
    ## Architecture
    
    The behavioral change is in `TextArea`: full-buffer replacement now
    rebuilds text and elements without clearing `kill_buffer`.
    
    `ChatComposer` already clears the textarea after successful submit and
    slash-command dispatch by calling into those textarea replacement paths.
    With this change, those existing composer flows inherit the new behavior
    automatically: the visible draft is cleared, but the last killed text
    remains available for `Ctrl+Y`.
    
    The tests cover both layers:
    
    - `TextArea` verifies that the kill buffer survives full-buffer
    replacement.
    - `ChatComposer` verifies that it survives submit.
    - `ChatComposer` also verifies that it survives slash-command dispatch.
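    
    The contract described above can be sketched as follows. This is a
    simplified, hypothetical stand-in for the real `TextArea`: it assumes a
    single-entry kill buffer and byte-offset cursors, and ignores text
    elements entirely. The method names `kill_to_end` and `yank` are
    assumptions chosen for illustration.
    
    ```rust
    /// Simplified stand-in for the TUI textarea (hypothetical fields).
    #[derive(Default)]
    struct TextArea {
        text: String,
        /// Byte offset into `text`; must lie on a char boundary.
        cursor: usize,
        /// Most recently killed text (single entry, not a kill ring).
        kill_buffer: String,
    }
    
    impl TextArea {
        /// Full-buffer replacement: resets visible draft state only.
        fn set_text(&mut self, text: &str) {
            self.text = text.to_string();
            self.cursor = self.text.len();
            // Deliberately leaves `kill_buffer` untouched.
        }
    
        /// `Ctrl+K`: kill from the cursor to the end of the text.
        fn kill_to_end(&mut self) {
            let killed = self.text.split_off(self.cursor);
            if !killed.is_empty() {
                self.kill_buffer = killed;
            }
        }
    
        /// `Ctrl+Y`: insert the most recent kill at the cursor.
        fn yank(&mut self) {
            self.text.insert_str(self.cursor, &self.kill_buffer);
            self.cursor += self.kill_buffer.len();
        }
    }
    ```
    
    In this sketch, killing part of a draft, clearing it with `set_text("")`,
    and then yanking restores the killed text into the empty draft.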
    
    ## Observability
    
    There is no dedicated logging for kill-buffer preservation. The most
    direct way to reason about the behavior is to inspect textarea-wide
    replacement paths and confirm whether they treat the kill buffer as
    visible-buffer state or as editing-history state.
    
    If this regresses in the future, the likely failure mode is simple and
    user-visible: `Ctrl+Y` stops restoring text after submit or slash-command
    clears even though ordinary kill/yank still works within a single
    uninterrupted draft.
    
    ## Tests
    
    Added focused regression coverage for the new contract:
    
    - `kill_buffer_persists_across_set_text`
    - `kill_buffer_persists_after_submit`
    - `kill_buffer_persists_after_slash_command_dispatch`
    
    Local verification:
    - `just fmt`
    - `cargo test -p codex-tui`
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • chore: remove SkillMetadata.permissions and derive skill sandboxing from permission_profile (#13061)
    ## Summary
    
    This change removes the compiled permissions field from skill metadata
    and keeps permission_profile as the single source of truth.
    
    Skill loading no longer compiles skill permissions eagerly. Instead, the
    zsh-fork skill escalation path compiles `skill.permission_profile` when
    it needs to determine the sandbox to apply for a skill script.
    
    ## Behavior change
    
    For skills that declare:
    ```
    permissions: {}
    ```
    we now treat that the same as having no skill permissions override,
    instead of creating and using a default readonly sandbox. This change
    makes the behavior more intuitive:
    
    - only non-empty skill permission profiles affect sandboxing
    - omitting `permissions` and writing `permissions: {}` now mean the same
    thing
    - skill metadata keeps a single permissions representation instead of
    storing derived state too
    
    Overall, this makes skill sandbox behavior easier to understand and more
    predictable.
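    
    A minimal sketch of the "empty means no override" rule. The
    `PermissionProfile` fields and the `sandbox_override` function are
    hypothetical and only illustrate the decision; the real profile type and
    compile step live in the codebase and may differ.
    
    ```rust
    /// Hypothetical, simplified permission profile.
    #[derive(Debug, Default, Clone, PartialEq)]
    struct PermissionProfile {
        read_paths: Vec<String>,
        write_paths: Vec<String>,
        network: bool,
    }
    
    impl PermissionProfile {
        /// True when the profile grants nothing, as with `permissions: {}`.
        fn is_empty(&self) -> bool {
            self.read_paths.is_empty() && self.write_paths.is_empty() && !self.network
        }
    }
    
    /// Profile that should drive skill sandboxing, if any. An omitted
    /// profile and an empty one both yield `None` (no override).
    fn sandbox_override(profile: Option<&PermissionProfile>) -> Option<&PermissionProfile> {
        profile.filter(|p| !p.is_empty())
    }
    ```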
  • Adjusting plan prompt for clarity and verbosity (#13284)
    `plan.md` prompt changes to tighten plan clarity and verbosity.
  • app-server: Silence thread status changes caused by thread being created (#13079)
    Currently we emit `thread/status/changed` with `Idle` status right
    before sending the `thread/started` event (which also carries the `Idle`
    status).
    There is no point in that: the client has no way to know the prior state
    of the thread, because the thread didn't exist yet. Silence these
    notifications.
  • fix(app-server): emit turn/started only when turn actually starts (#13261)
    This is a follow-up for https://github.com/openai/codex/pull/13047
    
    ## Why
    We had a race where `turn/started` could be observed before the thread
    had actually transitioned to `Active`. This was because we eagerly
    emitted `turn/started` in the request handler for `turn/start` (and
    `review/start`).
    
    That was showing up as flaky `thread/resume` tests, but the real issue
    was broader: a client could see `turn/started` and still get back an
    idle thread immediately afterward.
    
    The first idea was to eagerly call
    `thread_watch_manager.note_turn_started(...)` from the `turn/start`
    request path. That turns out to be unsafe, because
    `submit(Op::UserInput)` only queues work. If a turn starts and completes
    quickly, request-path bookkeeping can race with the real lifecycle
    events and leave stale running state behind.
    
    **The real fix** is to move `turn/started` to emit only after the turn
    _actually_ starts, so we do that by waiting for the
    `EventMsg::TurnStarted` notification emitted by codex core. We do this
    for both `turn/start` and `review/start`.
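    
    The ordering can be sketched as below. The types are hypothetical
    simplifications (the real code uses `EventMsg` and the app-server's
    outgoing message types); the point is only that the request handler
    acknowledges the request, while the notification is produced by the
    lifecycle event from core.
    
    ```rust
    /// Messages sent to the client (simplified, hypothetical).
    #[derive(Debug, PartialEq)]
    enum Outgoing {
        Response { request_id: u64 },
        TurnStartedNotification { turn_id: String },
    }
    
    /// Lifecycle events from core (simplified stand-in for `EventMsg`).
    enum CoreEvent {
        TurnStarted { turn_id: String },
        Other,
    }
    
    /// Request path for `turn/start`: the work is only queued here, so
    /// this path acknowledges the request and emits nothing else.
    fn handle_turn_start_request(request_id: u64, out: &mut Vec<Outgoing>) {
        out.push(Outgoing::Response { request_id });
    }
    
    /// Event path: `turn/started` is emitted only once core reports that
    /// the turn has actually started.
    fn handle_core_event(event: CoreEvent, out: &mut Vec<Outgoing>) {
        match event {
            CoreEvent::TurnStarted { turn_id } => {
                out.push(Outgoing::TurnStartedNotification { turn_id });
            }
            CoreEvent::Other => {}
        }
    }
    ```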
    
    I also verified this change is safe for our first-party codex apps:
    they do not assume that `turn/started` is emitted before the RPC
    response to `turn/start` (which is correct anyway).
    
    I also removed `single_client_mode` since it isn't really necessary now.
    
    ## Testing
    - `cargo test -p codex-app-server thread_resume -- --nocapture`
    - `cargo test -p codex-app-server
    'suite::v2::turn_start::turn_start_emits_notifications_and_accepts_model_override'
    -- --exact --nocapture`
    - `cargo test -p codex-app-server`
  • Update realtime websocket API (#13265)
    - migrate the realtime websocket transport to the new session and
    handoff flow
    - make the realtime model configurable in config.toml and use API-key
    auth for the websocket
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat(app-server): add tracing to all app-server APIs (#13285)
    ### Overview
    This PR adds the first piece of tracing for app-server JSON-RPC
    requests.
    
    There are two main changes:
    - JSON-RPC requests can now take an optional W3C trace context at the
    top level via a `trace` field (`traceparent` / `tracestate`).
    - app-server now creates a dedicated request span for every inbound
    JSON-RPC request in `MessageProcessor`, and uses the request-level trace
    context as the parent when present.
    
    For compatibility with existing flows, app-server still falls back to
    the TRACEPARENT env var when there is no request-level traceparent.
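    
    A minimal sketch of the parent-selection rule, assuming W3C Trace Context
    version `00` headers. `TraceParent`, `parse_traceparent`, and
    `resolve_parent` are hypothetical names, not the app-server's actual
    code, and `tracestate` handling is omitted.
    
    ```rust
    /// Parsed W3C `traceparent` value (version 00 only in this sketch).
    #[derive(Debug, PartialEq)]
    struct TraceParent {
        trace_id: String,
        parent_id: String,
        flags: u8,
    }
    
    fn is_lower_hex(s: &str, len: usize) -> bool {
        s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
    }
    
    /// Parses `00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`.
    fn parse_traceparent(header: &str) -> Option<TraceParent> {
        let mut parts = header.trim().split('-');
        let version = parts.next()?;
        let trace_id = parts.next()?;
        let parent_id = parts.next()?;
        let flags = parts.next()?;
        if parts.next().is_some() || version != "00" {
            return None;
        }
        if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
            return None;
        }
        // All-zero trace and parent ids are invalid per the specification.
        if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
            return None;
        }
        Some(TraceParent {
            trace_id: trace_id.to_string(),
            parent_id: parent_id.to_string(),
            flags: u8::from_str_radix(flags, 16).ok()?,
        })
    }
    
    /// Prefers the request-level value; uses the environment value only
    /// when the request carries none.
    fn resolve_parent(request_level: Option<&str>, env_value: Option<&str>) -> Option<TraceParent> {
        request_level.or(env_value).and_then(parse_traceparent)
    }
    ```
    
    A caller would pass the request's `trace.traceparent` as the first
    argument and `std::env::var("TRACEPARENT").ok().as_deref()` as the second.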
    
    This PR is intentionally scoped to the app-server boundary. In a
    followup, we'll actually propagate trace context through the async
    handoff into core execution spans like run_turn, which will make
    app-server traces much more useful.
    
    ### Spans
    A few details on the app-server span shape:
    - each inbound request gets its own server span
    - span/resource names are based on the JSON-RPC method (`initialize`,
    `thread/start`, `turn/start`, etc.)
    - spans record transport (stdio vs websocket), request id, connection
    id, and client name/version when available
    - `initialize` stores client metadata in session state so later requests
    on the same connection can reuse it
  • app-server: Update thread/name/set to support not-loaded threads (#13282)
    Currently `thread/name/set` only works for loaded threads.
    Expand the scope to also support persisted but not-yet-loaded ones for a
    more predictable API surface.
    This will make it possible to rename threads discovered via
    `thread/list` and similar operations.
  • test(app-server): increase flow test timeout to reduce flake (#11814)
    ## Summary
    - increase `DEFAULT_READ_TIMEOUT` in `codex_message_processor_flow` from
    20s to 45s
    - keep test behavior the same while avoiding platform timing flakes
    
    ## Why
    Windows ARM64 CI showed these tests taking about 24s before
    `task_complete`, which could fail early and produce wiremock
    request-count mismatches.
    
    ## Testing
    - just fmt
    - cargo test -p codex-app-server codex_message_processor_flow --
    --nocapture
  • fix(core) shell_snapshot multiline exports (#12642)
    ## Summary
    Codex discovered this one - shell_snapshot tests were breaking on my
    machine because I had a multiline env var. We should handle these!
    
    ## Testing
    - [x] existing tests pass
    - [x] Updated unit tests
  • feat: enable ma through /agent (#13246)
    <img width="639" height="139" alt="Screenshot 2026-03-02 at 16 06 41"
    src="https://github.com/user-attachments/assets/c006fcec-c1e7-41ce-bb84-c121d5ffb501"
    />
    
    Then
    <img width="372" height="37" alt="Screenshot 2026-03-02 at 16 06 49"
    src="https://github.com/user-attachments/assets/aa4ad703-e7e7-4620-9032-f5cd4f48ff79"
    />
  • tui: restore draft footer hints (#13202)
    ## Summary
    - restore `Tab to queue` when a draft is present and the agent is
    running
    - keep draft-idle footers passive by showing the normal footer or status
    line instead of `? for shortcuts`
    - align footer snapshot coverage with the updated draft footer behavior
    
    ## Codex author
    `codex resume 019c7f1c-43aa-73e0-97c7-40f457396bb0`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Fix project trust config parsing so CLI overrides work (#13090)
    Fixes #13076
    
    This PR fixes a bug that causes command-line config overrides for MCP
    subtables to not be merged correctly.
    
    Summary
    - make project trust loading go through the dedicated struct so CLI
    overrides can update trusted project-local MCP transports
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • fix: use https://git.savannah.gnu.org/git/bash instead of https://github.com/bolinfest/bash (#13057)
    Historically, we cloned the Bash repo from
    https://github.com/bminor/bash, but for whatever reason, it was removed
    at some point.
    
    I had a local clone of it, so I pushed it to
    https://github.com/bolinfest/bash so that we could continue running our
    CI job. I did this in https://github.com/openai/codex/pull/9563, and as
    you can see, I did not tamper with the commit hash we used as the basis
    of this build.
    
    Using a personal fork is not great, so this PR changes the CI job to use
    what appears to be considered the source of truth for Bash, which is
    https://git.savannah.gnu.org/git/bash.git.
    
    Though in testing this out, it appears this Git server does not support
    the combination of `git clone --depth 1
    https://git.savannah.gnu.org/git/bash` and `git fetch --depth 1 origin
    a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b`, as it fails with the
    following error:
    
    ```
    error: Server does not allow request for unadvertised object a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
    ```
    
    so unfortunately this means that we have to do a full clone instead of a
    shallow clone in our CI jobs, which will be a bit slower.
    
    Also updated `codex-rs/shell-escalation/README.md` to reflect this
    change.
  • chore: /multiagent alias for /agent (#13249)
    Add a `/multi-agents` alias for `/agent` and update the wording
  • core: reuse parent shell snapshot for thread-spawn subagents (#13052)
    ## Summary
    - reuse the parent shell snapshot when spawning/forking/resuming
    `SessionSource::SubAgent(SubAgentSource::ThreadSpawn { .. })` sessions
    - plumb inherited snapshot through `AgentControl -> ThreadManager ->
    Codex::spawn -> SessionConfiguration`
    - skip shell snapshot refresh on cwd updates for thread-spawn subagents
    so inherited snapshots are not replaced
    
    ## Why
    - avoids per-subagent shell snapshot creation and cleanup work
    - keeps thread-spawn subagents on the parent snapshot path, matching the
    intended parent/child snapshot model
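    
    The inheritance rule can be sketched with a shared `Arc`. The types and
    functions here are hypothetical simplifications: the real plumbing runs
    through `AgentControl`, `ThreadManager`, `Codex::spawn`, and
    `SessionConfiguration`, and the real `SessionSource` has more variants.
    
    ```rust
    use std::sync::Arc;
    
    /// Hypothetical stand-in for a captured shell snapshot.
    #[derive(Debug)]
    struct ShellSnapshot {
        path: String,
    }
    
    /// Simplified session source (hypothetical variants).
    enum SessionSource {
        Cli,
        SubAgentThreadSpawn,
    }
    
    /// Thread-spawn subagents reuse the parent's snapshot when one is
    /// provided; every other session creates its own.
    fn snapshot_for_session(
        source: &SessionSource,
        inherited: Option<Arc<ShellSnapshot>>,
        create: impl FnOnce() -> ShellSnapshot,
    ) -> Arc<ShellSnapshot> {
        match (source, inherited) {
            (SessionSource::SubAgentThreadSpawn, Some(parent)) => parent,
            _ => Arc::new(create()),
        }
    }
    
    /// Cwd updates must not replace an inherited snapshot.
    fn should_refresh_on_cwd_change(source: &SessionSource) -> bool {
        !matches!(source, SessionSource::SubAgentThreadSpawn)
    }
    ```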
    
    ## Validation
    - `just fmt` (in `codex-rs`)
    - `cargo test -p codex-core --no-run`
    - `cargo test -p codex-core spawn_agent -- --nocapture`
    - `cargo test -p codex-core --test all
    suite::agent_jobs::spawn_agents_on_csv_runs_and_exports`
    
    ## Notes
    - full `cargo test -p codex-core --test all` was left running separately
    for broader verification
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: polluted memories (#13008)
    Add a feature flag to disable memory creation for "polluted"
  • Improve subagent contrast in TUI (#13197)
    ## Summary
    - raise contrast for subagent transcript labels and fallback states
    - remove low-contrast dim styling from role tags and error details
    - make the closed-agent picker dot readable in dark theme
    
    ## Validation
    - just fmt
    - just fix -p codex-tui
    - cargo test -p codex-tui
    
    Co-authored-by: Codex <noreply@openai.com>
  • Fix issue deduplication workflow for Codex issues (#13215)
    Fixes #13203
    
    Summary
    - split the duplicate-finding workflow into two jobs so we gather all
    issues first
    - add an open-issue fallback job that runs only when the full scan finds
    nothing
    - centralize final selection so `comment-on-issue` always sees the best
    dedupe output
  • Record realtime close marker on replacement (#13058)
    ## Summary
    - record a realtime close developer message when a new realtime session
    replaces an active one
    - assert the replacement marker through the mocked responses request
    path
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
    Co-authored-by: Charles Cunningham <ccunningham@openai.com>
  • [codex] include plan type in account updates (#13181)
    This change fixes a Codex app account-state sync bug where clients could
    know the user was signed in but still miss the ChatGPT subscription
    tier, which could lead to incorrect upgrade messaging for paid users.
    
    The root cause was that `account/updated` only carried `authMode` while
    plan information was available separately via `account/read` and
    rate-limit snapshots, so this update adds `planType` to
    `account/updated` and populates it consistently across login and refresh
    paths.
  • fix: MacOSAutomationPermission::BundleIDs should allow communicating … (#12989)
    …with launchservicesd
    
    Add mach lookup for `launchservicesd` when extending the sandbox for
    `MacOSAutomationPermission::BundleIDs`. This is necessary so that the
    target application can be launched for automation.
    
    This omission was due to a spec error in a document, which has been
    fixed.
  • feat: load from plugins (#12864)
    Support loading plugins.
    
    Plugins can now be enabled via `[plugins.<name>]` in `config.toml`. They
    are loaded as first-class entities through `PluginsManager`, and their
    default `skills/` and `.mcp.json` contributions are integrated into the
    existing skills and MCP flows.
  • core: resolve host_executable() rules during preflight (#13065)
    ## Why
    
    [#12964](https://github.com/openai/codex/pull/12964) added
    `host_executable()` support to `codex-execpolicy`, and
    [#13046](https://github.com/openai/codex/pull/13046) adopted it in the
    zsh-fork interception path.
    
    The remaining gap was the preflight execpolicy check in
    `core/src/exec_policy.rs`. That path derives approval requirements
    before execution for `shell`, `shell_command`, and `unified_exec`, but
    it was still using the default exact-token matcher.
    
    As a result, a command that already included an absolute executable
    path, such as `/usr/bin/git status`, could still miss a basename rule
    like `prefix_rule(pattern = ["git"], ...)` during preflight even when
    the policy also defined a matching `host_executable(name = "git", ...)`
    entry.
    
    This PR brings the same opt-in `host_executable()` resolution to the
    preflight approval path when an absolute program path is already present
    in the parsed command.
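    
    The matching gap can be illustrated with a small model. This is not the
    `codex-execpolicy` API: `program_matches`, `prefix_rule_matches`, and the
    `HashMap` of allowed paths are hypothetical, and the sketch assumes
    Unix-style absolute paths.
    
    ```rust
    use std::collections::HashMap;
    use std::path::Path;
    
    /// Does `program` satisfy a rule written for the basename `rule_name`?
    fn program_matches(
        program: &str,
        rule_name: &str,
        host_executables: &HashMap<String, Vec<String>>,
        resolve_host_executables: bool,
    ) -> bool {
        // Default matcher: exact token comparison only.
        if program == rule_name {
            return true;
        }
        if !resolve_host_executables {
            return false;
        }
        // Opt-in: an absolute path matches when its basename equals the
        // rule name and the path is listed for that name.
        let path = Path::new(program);
        let basename_matches = path.is_absolute()
            && path.file_name().and_then(|name| name.to_str()) == Some(rule_name);
        basename_matches
            && host_executables
                .get(rule_name)
                .is_some_and(|paths| paths.iter().any(|p| p.as_str() == program))
    }
    
    /// Prefix rule: first token is the program, the rest must be a prefix
    /// of the command's arguments.
    fn prefix_rule_matches(
        command: &[&str],
        pattern: &[&str],
        host_executables: &HashMap<String, Vec<String>>,
        resolve_host_executables: bool,
    ) -> bool {
        match (command.split_first(), pattern.split_first()) {
            (Some((program, args)), Some((name, rest))) => {
                program_matches(*program, *name, host_executables, resolve_host_executables)
                    && args.starts_with(rest)
            }
            _ => false,
        }
    }
    ```
    
    With `git` mapped to `/usr/bin/git`, the command `/usr/bin/git status`
    matches the pattern `["git"]` only when resolution is enabled, and
    `/opt/evil/git status` does not match either way.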
    
    ## What Changed
    
    - updated
    `ExecPolicyManager::create_exec_approval_requirement_for_command()` in
    `core/src/exec_policy.rs` to use `check_multiple_with_options(...)` with
    `MatchOptions { resolve_host_executables: true }`
    - kept the existing shell parsing flow for approval derivation, but now
    allow basename rules to match absolute executable paths during preflight
    when `host_executable()` permits it
    - updated requested-prefix amendment evaluation to use the same
    host-executable-aware matching mode, so suggested `prefix_rule()`
    amendments are checked consistently for absolute-path commands
    - added preflight coverage for:
      - absolute-path commands that should match basename rules through
        `host_executable()`
      - absolute-path commands whose paths are not in the allowed
        `host_executable()` mapping
      - requested prefix-rule amendments for absolute-path commands
    
    ## Verification
    
    - `just fix -p codex-core`
    - `cargo test -p codex-core --lib exec_policy::tests::`
  • Speed up subagent startup (#12935)
    ## Summary
    - skip online model refresh for subagent sessions
    - avoid rollout flushes during subagent startup
    - keep /models refresh for non-subagent sessions
    
    ## Testing
    - cargo test -p codex-core --test all
    suite::models_etag_responses::refresh_models_on_models_etag_mismatch_and_avoid_duplicate_models_fetch
    - cargo test -p codex-core --test all
    suite::remote_models::remote_models_long_model_slug_is_sent_with_high_reasoning
    - cargo test -p codex-core --test all
    suite::model_switching::model_switch_to_smaller_model_updates_token_context_window
    - cargo test -p codex-core --test all
    suite::compact::pre_sampling_compact_runs_on_switch_to_smaller_context_model
    - cargo test -p codex-core --test all
    suite::compact::pre_sampling_compact_runs_after_resume_and_switch_to_smaller_model
    - cargo test -p codex-core --test all
    suite::personality::remote_model_friendly_personality_instructions_with_feature
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>