Commit Graph

3031 Commits

  • change collaboration mode to struct (#9793)
    Shouldn't cause behavioral change
  • bundle sandbox helper binaries in main zip, for winget. (#9707)
    Winget uses the main codex.exe value as its target.
    The elevated sandbox requires these two binaries to live next to
    codex.exe
  • Remove stale TODO comment from defs.bzl (#9787)
    ### Motivation
    - Remove an outdated comment in `defs.bzl` referencing
    `cargo_build_script` that is no longer relevant.
    
    ### Description
    - Delete the stale `# TODO(zbarsky): cargo_build_script support?` line
    so the logic flows directly from `binaries` to `lib_srcs` in `defs.bzl`.
    
    ### Testing
    - Ran `git diff --check` which produced no errors.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_6973d9ac757c8331be475a8fb0f90a88)
  • Fix resume picker when user event appears after head (#9512)
    Fixes #9501
    
    Contributing guide:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    ## Summary
    The resume picker requires a session_meta line and at least one
    user_message event within the initial head scan. Some rollout files
    contain multiple session_meta entries before the first user_message, so
    the user event can fall outside the default head window and the session
    is omitted from the picker even though it is resumable by ID.
    
    This PR keeps the head summary bounded but extends scanning for a
    user_message once a session_meta has been observed. The summary still
    caps stored head entries, but we allow a small, bounded extra scan to
    find the first user event so valid sessions are not filtered out.
    
    ## Changes
    - Continue scanning past the head limit (bounded) when session_meta is
    present but no user_message has been seen yet.
    - Mark session_meta as seen even if the head summary buffer is already
    full.
    - Add a regression test with multiple session_meta lines before the
    first user_message.
    
    ## Why This Is Safe
    - The head summary remains bounded to avoid unbounded memory usage.
    - The extra scan is capped (USER_EVENT_SCAN_LIMIT) and only triggers
    after a session_meta is seen.
    - Behavior is unchanged for typical files where the user_message appears
    early.
    
    ## Testing
    - cargo test -p codex-core --lib
    test_list_threads_scans_past_head_for_user_event
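    
    The bounded extra scan described above can be sketched roughly as
    follows. This is a minimal illustration, not the code from the PR:
    `HeadSummary`, `scan_head`, the substring-based line matching, and both
    limit values are assumptions; only the `USER_EVENT_SCAN_LIMIT` name
    comes from the description.
    
    ```rust
    // Hypothetical limits; the commit message does not give the real values.
    const HEAD_LIMIT: usize = 10;
    const USER_EVENT_SCAN_LIMIT: usize = 200;
    
    #[derive(Debug, Default, PartialEq)]
    struct HeadSummary {
        head: Vec<String>,
        saw_session_meta: bool,
        saw_user_event: bool,
    }
    
    /// Scan rollout lines, storing at most HEAD_LIMIT of them, but reading a
    /// bounded number of extra lines to find the first user_message.
    fn scan_head<'a>(lines: impl Iterator<Item = &'a str>) -> HeadSummary {
        let mut summary = HeadSummary::default();
        for (index, line) in lines.enumerate() {
            if index >= HEAD_LIMIT {
                // Past the head window: continue only while a session_meta has
                // been seen, no user_message has, and the extra budget remains.
                let may_extend = summary.saw_session_meta
                    && !summary.saw_user_event
                    && index < HEAD_LIMIT + USER_EVENT_SCAN_LIMIT;
                if !may_extend {
                    break;
                }
            }
            // Flags are updated even when the head buffer is already full.
            if line.contains("\"session_meta\"") {
                summary.saw_session_meta = true;
            }
            if line.contains("\"user_message\"") {
                summary.saw_user_event = true;
            }
            if summary.head.len() < HEAD_LIMIT {
                summary.head.push(line.to_string());
            }
        }
        summary
    }
    ```
    
    With this shape, a file holding twelve session_meta lines followed by a
    user_message is still recognized, while the stored head never exceeds
    `HEAD_LIMIT` entries.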
  • Select default model from filtered presets (#9782)
    Pick the first available preset after auth filtering for default
    selection.
  • Print warning if we skip config loading (#9611)
    https://github.com/openai/codex/pull/9533 silently ignored config if
    untrusted. Instead, we still load it but disable it. Maybe we shouldn't
    try to parse it either...
    
    <img width="939" height="515" alt="Screenshot 2026-01-21 at 14 56 38"
    src="https://github.com/user-attachments/assets/e753cc22-dd99-4242-8ffe-7589e85bef66"
    />
  • Upgrade GitHub Actions for Node 24 compatibility (#9722)
    ## Summary
    
    Upgrade GitHub Actions to their latest versions to ensure compatibility
    with Node 24, as Node 20 will reach end-of-life in April 2026.
    
    ## Changes
    
    | Action | Old Version(s) | New Version | Release | Files |
    |--------|---------------|-------------|---------|-------|
    | `actions/cache` | [`v4`](https://github.com/actions/cache/releases/tag/v4) | [`v5`](https://github.com/actions/cache/releases/tag/v5) | [Release](https://github.com/actions/cache/releases/tag/v5) | bazel.yml |
    
    ## Context
    
    Per [GitHub's
    announcement](https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/),
    Node 20 is being deprecated and runners will begin using Node 24 by
    default starting March 4th, 2026.
    
    ### Why this matters
    
    - **Node 20 EOL**: April 2026
    - **Node 24 default**: March 4th, 2026
    - **Action**: Update to latest action versions that support Node 24
    
    ### Security Note
    
    Actions that were previously pinned to commit SHAs remain pinned to SHAs
    (updated to the latest release SHA) to maintain the security benefits of
    immutable references.
    
    ### Testing
    
    These changes only affect CI/CD workflow configurations and should not
    impact application functionality. The workflows should be tested by
    running them on a branch before merging.
    
    Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
  • fix(exec): skip git repo check when --yolo flag is used (#9590)
    ## Summary
    
    Fixes #7522
    
    The `--yolo` (`--dangerously-bypass-approvals-and-sandbox`) flag is
    documented to skip all confirmation prompts and execute commands without
    sandboxing, intended solely for running in environments that are
    externally sandboxed. However, it was not bypassing the trusted
    directory (git repo) check, requiring users to also specify
    `--skip-git-repo-check`.
    
    This change makes `--yolo` also skip the git repo check, matching the
    documented behavior and user expectations.
    
    ## Changes
    
    - Modified `codex-rs/exec/src/lib.rs` to check for
    `dangerously_bypass_approvals_and_sandbox` flag in addition to
    `skip_git_repo_check` when determining whether to skip the git repo
    check
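    
    As a rough sketch of the condition described above (the `ExecFlags`
    struct and the helper name are hypothetical; only the two flag names
    come from this description):
    
    ```rust
    /// Hypothetical stand-in for the parsed CLI options.
    struct ExecFlags {
        skip_git_repo_check: bool,
        dangerously_bypass_approvals_and_sandbox: bool,
    }
    
    /// The git repo check is skipped when either flag is set.
    fn should_skip_git_repo_check(flags: &ExecFlags) -> bool {
        flags.skip_git_repo_check || flags.dangerously_bypass_approvals_and_sandbox
    }
    ```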
    
    ## Testing
    
    - Verified the code compiles with `cargo check -p codex-exec`
    - Ran existing tests with `cargo test -p codex-exec` (34 passed, 8
    integration tests failed due to unrelated API connectivity issues)
    
    ---
    🤖 Generated with [Claude Code](https://claude.ai/code)
    
    Co-authored-by: Claude <noreply@anthropic.com>
  • prompt (#9777)
  • Persist text element ranges and attached images across history/resume (#9116)
    **Summary**
    - Backtrack selection now rehydrates `text_elements` and
    `local_image_paths` from the chosen user history cell so Esc‑Esc history
    edits preserve image placeholders and attachments.
    - Composer prefill uses the preserved elements/attachments in both `tui`
    and `tui2`.
    - Extended backtrack selection tests to cover image placeholder elements
    and local image paths.
    
    **Changes**
    - `tui/src/app_backtrack.rs`: Backtrack selection now carries text
    elements + local image paths; composer prefill uses them (removes TODO).
    - `tui2/src/app_backtrack.rs`: Same as above.
    - `tui/src/app.rs`: Updated backtrack test to assert restored
    elements/paths.
    - `tui2/src/app.rs`: Same test updates.
    
    ### The original scope of this PR (threading text elements and image
    attachments through the codex harness thoroughly/persistently) was
    broken into the following PRs other than this one:
    
    The diff of this PR was reduced by changing types in a starter PR:
    https://github.com/openai/codex/pull/9235
    
    Then text element metadata was added to protocol, app server, and core
    in this PR: https://github.com/openai/codex/pull/9331
    
    Then the end-to-end flow was completed by wiring TUI/TUI2 input,
    history, and restore behavior in
    https://github.com/openai/codex/pull/9393
    
    Prompt expansion was supported in this PR:
    https://github.com/openai/codex/pull/9518
    
    TextElement optional placeholder field was protected in
    https://github.com/openai/codex/pull/9545
  • chore: use some raw strings to reduce quoting (#9745)
    Small follow-ups for https://github.com/openai/codex/pull/9565. Mainly
    `r#`, but also added some whitespace for early returns.
  • Fix execpolicy parsing for multiline quoted args (#9565)
    ## What
    Fix bash command parsing to accept double-quoted strings that contain
    literal newlines so execpolicy can match allow rules.
    
    ## Why
    Allow rules like [git, commit] should still match when commit messages
    include a newline in a quoted argument; the parser currently rejects
    these strings and falls back to the outer shell invocation.
    
    ## How
    - Validate double-quoted strings by ensuring all named children are
    string_content and then stripping the outer quotes from the raw node
    text so embedded newlines are preserved.
    - Reuse the helper for concatenated arguments.
    - Ensure large SI suffix formatting uses the caller-provided locale
    formatter for grouping.
    - Add coverage for newline-containing quoted arguments.
    
    Fixes #9541.
    
    ## Tests
    - cargo test -p codex-core
    - just fix -p codex-core
    - cargo test -p codex-protocol
    - just fix -p codex-protocol
    - cargo test --all-features
  • feat: add session source as otel metadata tag (#9720)
    Add session.source and user.account_id as global OTEL metric tags to
    identify client surface and user.
  • Hide mode cycle hint while a task is running (#9730)
    ## Summary
    - hide the “(shift+tab to cycle)” suffix on the collaboration mode label
    while a task is running
    - keep the cycle hint visible when idle
    - add a snapshot to cover the running-task label state
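    
    A minimal sketch of the label logic, assuming a hypothetical
    `mode_label` helper (the real rendering code is not shown in this
    description):
    
    ```rust
    /// Build the collaboration mode label; the cycle hint is shown only
    /// while no task is running.
    fn mode_label(mode_name: &str, task_running: bool) -> String {
        if task_running {
            mode_name.to_string()
        } else {
            format!("{mode_name} (shift+tab to cycle)")
        }
    }
    ```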
  • Change the prompt for planning and reasoning effort (#9733)
    Change the prompt for planning and the reasoning effort preset for a
    better experience.
  • feat(app-server) Expose personality (#9674)
    ### Motivation
    Exposes a per-thread / per-turn `personality` override in the v2
    app-server API so clients can influence model communication style at
    thread/turn start. Ensures the override is passed into the session
    configuration resolution so it becomes effective for subsequent turns
    and headless runners.
    
    ### Testing
    - [x] Add an integration-style test
    `turn_start_accepts_personality_override_v2` in
    `codex-rs/app-server/tests/suite/v2/turn_start.rs` that verifies a
    `/personality` override results in a developer update message containing
    `<personality_spec>` in the outbound model request.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_6971d646b1c08322a689a54d2649f3fe)
  • [connectors] Support connectors part 1 - App server & MCP (#9667)
    To make Codex work with connectors, we add a built-in gateway MCP that
    acts as a transparent proxy between the client and the connectors. The
    gateway MCP collects the actions that are accessible to the user and
    sends them down to the user. When a connector action is chosen to be
    called, the client invokes the action through the gateway MCP as well.
    
     - [x] Add the system built-in gateway MCP to list and run connectors.
     - [x] Add the app server methods and protocol
  • Update models.json (#9726)
    Automated update of models.json.
    
    ---------
    
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
  • use machine scope instead of user scope for dpapi. (#9713)
    This fixes a bug where the elevated sandbox setup encrypts sandbox user
    passwords as an admin user, but normal command execution attempts to
    decrypt them as a different user.
    
    Machine scope allows all users to encrypt/decrypt.
    
    This PR also moves the encrypted file to a different location,
    `.codex/.sandbox-secrets`, which the sandbox users cannot read.
  • TUI: prompt to implement plan and switch to Execute (#9712)
    ## Summary
    - Replace the plan‑implementation prompt with a standard selection
    popup.
    - “Yes” submits a user turn in Execute via a dedicated app event to
    preserve normal transcript behavior.
    - “No” simply dismisses the popup.
    
    <img width="977" height="433" alt="Screenshot 2026-01-22 at 2 00 54 PM"
    src="https://github.com/user-attachments/assets/91fad06f-7b7a-4cd8-9051-f28a19b750b2"
    />
    
    ## Changes
    - Add a plan‑implementation popup using `SelectionViewParams`.
    - Add `SubmitUserMessageWithMode` so “Yes” routes through
    `submit_user_message` (ensures user history + separator state).
    - Track `saw_plan_update_this_turn` so the prompt appears even when only
    `update_plan` is emitted.
    - Suppress the plan popup on replayed turns, when messages are queued,
    or when a rate‑limit prompt is pending.
    - Add `execute_mode` helper for collaboration modes.
    - Add tests for replay/queued/rate‑limit guards and plan update without
    final message.
    - Add snapshots for both the default and “No”‑selected popup states.
  • feat: support proxy for ws connection (#9719)
    Reapply the websocket changes without changing the TLS library.
  • Fix typo in experimental_prompt.md (#9716)
    Simple typo fix in the first sentence of the experimental_prompt.md
    instructions file.
  • feat: fix formatting of codex features list (#9715)
    The formatting of `codex features list` made it hard to follow. This PR
    introduces column width math to make things nice.
    
    Maybe slightly hard to machine-parse (since not a simple `\t`), but we
    should introduce a `--json` option if that's really important.
    
    You can see the before/after in the screenshot:
    
    <img width="1119" height="932" alt="image"
    src="https://github.com/user-attachments/assets/c99dce85-899a-4a2d-b4af-003938f5e1df"
    />
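    
    The "column width math" can be illustrated with a small sketch. The
    `format_columns` helper and the two-space gutter are assumptions, not
    the code from this PR:
    
    ```rust
    /// Render rows as left-aligned columns. Widths are counted in `char`s,
    /// which assumes every character occupies one terminal cell.
    fn format_columns(rows: &[Vec<String>]) -> String {
        let column_count = rows.iter().map(|row| row.len()).max().unwrap_or(0);
        let mut widths = vec![0usize; column_count];
        for row in rows {
            for (i, cell) in row.iter().enumerate() {
                widths[i] = widths[i].max(cell.chars().count());
            }
        }
    
        let mut out = String::new();
        for row in rows {
            for (i, cell) in row.iter().enumerate() {
                if i + 1 == row.len() {
                    // Last cell in the row: no trailing padding.
                    out.push_str(cell);
                } else {
                    // Pad to the widest cell in this column, plus a gutter.
                    out.push_str(&format!("{:<width$}  ", cell, width = widths[i]));
                }
            }
            out.push('\n');
        }
        out
    }
    ```
    
    Because padding depends on the widest cell, the output is not
    tab-separated, which is the machine-parsing trade-off mentioned above.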
  • feat(core) update Personality on turn (#9644)
    ## Summary
    Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
    is explicitly unused by the clients so far, to simplify PRs - app-server
    and tui implementations will be follow-ups.
    
    ## Testing
    - [x] added integration tests
  • Modes label below textarea (#9645)
    # Summary
    - Add a collaboration mode indicator rendered at the bottom-right of the
    TUI composer footer.
    - Style modes per design (Plan in #D72EE1, Execute matching dim context
    style, Pair Programming using the same cyan as text elements).
    - Add shared “(shift+tab to cycle)” hint text for all mode labels and
    align the indicator with the left footer margin.
    
    NOTE: currently this is hidden if the Collaboration Modes feature flag
    is disabled, or in Custom mode. Maybe we should show it in Custom mode
    too? I'll leave that out of this PR though
    
    # UI
    - Mode indicator appears below the textarea, bottom-right of the footer
    line.
    - Includes “(shift+tab to cycle)” and keeps right padding aligned to the
    left footer indent.
    
    <img width="983" height="200" alt="Screenshot 2026-01-21 at 7 17 54 PM"
    src="https://github.com/user-attachments/assets/d1c5e4ed-7d7b-4f6c-9e71-bc3cf6400e0e"
    />
    
    <img width="980" height="200" alt="Screenshot 2026-01-21 at 7 18 53 PM"
    src="https://github.com/user-attachments/assets/d22ff0da-a406-4930-85c5-affb2234e84b"
    />
    
    <img width="979" height="201" alt="Screenshot 2026-01-21 at 7 19 12 PM"
    src="https://github.com/user-attachments/assets/862cb17f-0495-46fa-9b01-a4a9f29b52d5"
    />
  • Support end_turn flag (#9698)
    Experimental flag that signals the end of the turn.
  • Chore: add cmd related info to exec approval request (#9659)
    ### Summary
    We now rely purely on the `item/commandExecution/requestApproval` item
    to render pending approvals in VSCE and the app. With the v2 approach,
    that item does not include the actual command being attempted, so we can
    only use `proposedExecpolicyAmendment` to render it, which can be
    incomplete.
    
    ### Reproduce
    * Add `prefix_rule(pattern=["echo"], decision="prompt")` to your
    `~/.codex/rules.default.rules`.
    * Ask to `Run  echo "approval-test" please` in VSCE or the app.
    * The pending approval portal does show up, but with no content.
    
    #### Example screenshot
    <img width="3434" height="3648" alt="Screenshot 2026-01-21 at 8 23 25 PM"
    src="https://github.com/user-attachments/assets/75644837-21f1-40f8-8b02-858d361ff817"
    />
    
    #### Sample output
    ```
      {"method":"item/commandExecution/requestApproval","id":0,"params":{
        "threadId":"019be439-5a90-7600-a7ea-2d2dcc50302a",
        "turnId":"0",
        "itemId":"call_usgnQ4qEX5U9roNdjT7fPzhb",
        "reason":"`/bin/zsh -lc 'echo \"testing\"'` requires approval by policy",
        "proposedExecpolicyAmendment":null
      }}
    
    ```
    
    ### Fix
    Include the `command` string, `cwd`, and `command_actions` in
    `CommandExecutionRequestApprovalParams` so that consumers can display
    the correct command instead of relying on exec policy output.
  • Fix: Lower log level for closed-channel send (#9653)
    ## What?
    - Downgrade the closed-channel send error log to debug in
    `codex-rs/core/src/codex.rs`.
    
    ## Why?
    - `async_channel::Sender::send` only fails when the channel is closed,
    so the current error-level log is noisy during normal shutdown. See
    issue #9652.
    
    ## How?
    - Replace the error log with a debug log on send failure.
    
    ## Tests
    - `just fmt`
    - `just fix -p codex-core`
    - `cargo test -p codex-core`
  • feat(tui) /permissions flow (#9561)
    ## Summary
    Adds the `/permissions` command, with a (usually) shorter set of
    permissions. `/approvals` still exists, for backwards compatibility.
    
    <img width="863" height="309" alt="Screenshot 2026-01-20 at 4 12 51 PM"
    src="https://github.com/user-attachments/assets/c49b5ba5-bc47-46dd-9067-e1a5670328fe"
    />
    
    
    ## Testing
    - [x] updated unit tests
    - [x] Tested locally
  • chore: tweak AGENTS.md (#9650)
    ## Summary
    Update AGENTS.md to improve testing flow
    
    ## Testing
    - [x] Tested locally, much faster
  • Add UI for skill enable/disable. (#9627)
    "/skill" will now allow you to enable/disable skills:
    <img width="658" height="199" alt="image"
    src="https://github.com/user-attachments/assets/bf8994c8-d6c1-462f-8bbb-f1ee9241caa4"
    />
  • feat(core) ModelInfo.model_instructions_template (#9597)
    ## Summary
    #9555 is the start of a rename, so I'm starting to standardize here.
    Sets up `model_instructions` templating with a strongly-typed object for
    injecting a personality block into the model instructions.
    
    ## Testing
    - [x] Added tests
    - [x] Ran locally
  • feat(tui): retire the tui2 experiment (#9640)
    ## Summary
    - Retire the experimental TUI2 implementation and its feature flag.
    - Remove TUI2-only config/schema/docs so the CLI stays on the
    terminal-native path.
    - Keep docs aligned with the legacy TUI while we focus on redraw-based
    improvements.
    
    ## Customer impact
    - Retires the TUI2 experiment and keeps Codex on the proven
    terminal-native UI while we invest in redraw-based improvements to the
    existing experience.
    
    ## Migration / compatibility
    - If you previously set tui2-related options in config.toml, they are
    now ignored and Codex continues using the existing terminal-native TUI
    (no action required).
    
    ## Context
    - What worked: a transcript-owned viewport delivered excellent resize
    rewrap and high-fidelity copy (especially for code).
    - Why stop: making that experience feel fully native across the
    environment matrix (terminal emulator, OS, input modality, multiplexer,
    font/theme, alt-screen behavior) creates a combinatorial explosion of
    edge cases.
    - What next: we are focusing on redraw-based improvements to the
    existing terminal-native TUI so scrolling, selection, and copy remain
    native while resize/redraw correctness improves.
    
    ## Testing
    - just write-config-schema
    - just fmt
    - cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
    -p codex-core
    - cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
    -p codex-cli
    - cargo check
    - cargo test -p codex-core
    - cargo test -p codex-cli
  • Reduce burst testing flake (#9549)
    ## Summary
    
    - make paste-burst tests deterministic by injecting explicit timestamps
    instead of relying on wall clock timing
    - add time-aware helpers for input/submission paths so tests can drive
    the burst heuristic precisely
    - update burst-related tests to flush using computed timeouts while
    preserving behavior assertions
    - increase timeout slack in
    shell_tools_start_before_response_completed_when_stream_delayed to
    reduce flakiness
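    
    The timestamp-injection idea can be sketched as below. The
    `BurstDetector` type, its method names, and the `BURST_GAP` value are
    hypothetical and only illustrate the pattern of passing `now` in
    instead of reading the clock inside the heuristic:
    
    ```rust
    use std::time::{Duration, Instant};
    
    /// Hypothetical threshold: keys arriving within this gap count as a burst.
    const BURST_GAP: Duration = Duration::from_millis(8);
    
    #[derive(Default)]
    struct BurstDetector {
        last_key_at: Option<Instant>,
    }
    
    impl BurstDetector {
        /// Time-aware entry point: the caller supplies `now`, so tests can
        /// drive the heuristic with exact timestamps.
        fn on_key_at(&mut self, now: Instant) -> bool {
            let in_burst = self
                .last_key_at
                .map_or(false, |prev| now.duration_since(prev) <= BURST_GAP);
            self.last_key_at = Some(now);
            in_burst
        }
    
        /// Production wrapper that reads the wall clock.
        fn on_key(&mut self) -> bool {
            self.on_key_at(Instant::now())
        }
    }
    ```
    
    A test can then call `on_key_at` with `t0`, `t0 + 2ms`, and
    `t0 + 500ms` and get the same result on every run, regardless of
    machine load.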
  • feat: publish config schema on release (#9572)
    Follow up to #8956; publish schema on new release to stable URL.
    
    Also canonicalize schema (sort keys) when writing. This avoids reliance
    on default `schema_rs` behavior and makes the schema easier to read.
  • fix(tui) turn timing incremental (#9599)
    ## Summary
    When we send multiple assistant messages, reset the timer so "Worked for
    2m 36s" is the time since the last time we showed the message, rather
    than an ever-increasing number.
    
    We could instead change the copy so it's more clearly a running counter.
    
    ## Testing
    - [x] ran locally
    
    <img width="903" height="732" alt="Screenshot 2026-01-21 at 1 42 51 AM"
    src="https://github.com/user-attachments/assets/bb4d827b-3a0e-48ba-bd6a-d8cd65d8e892"
    />
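    
    A minimal sketch of the incremental timing, assuming a hypothetical
    `TurnTimer` that is reset each time a message is shown (the exact
    formatting of short durations is also an assumption):
    
    ```rust
    use std::time::{Duration, Instant};
    
    struct TurnTimer {
        segment_started_at: Instant,
    }
    
    impl TurnTimer {
        fn start(now: Instant) -> Self {
            Self { segment_started_at: now }
        }
    
        /// Return the time since the last shown message and restart the
        /// segment, so the next value is not cumulative.
        fn lap(&mut self, now: Instant) -> Duration {
            let elapsed = now.duration_since(self.segment_started_at);
            self.segment_started_at = now;
            elapsed
        }
    }
    
    fn format_worked_for(elapsed: Duration) -> String {
        let secs = elapsed.as_secs();
        format!("Worked for {}m {:02}s", secs / 60, secs % 60)
    }
    ```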
  • feat: better sorting of shell commands (#9629)
    This PR changes how we sort slash commands, ranking matches in this order:
    1. Exact match
    2. Prefix
    3. Fuzzy
    
    As a result, when you type `/ps`, the default command is not `/approvals`.
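    
    The ordering can be sketched as follows. `MatchRank`, `rank`, and
    `sort_commands` are hypothetical names, and the subsequence test is
    only a stand-in for whatever fuzzy matcher the TUI actually uses:
    
    ```rust
    /// Variants are declared best-first, so the derived `Ord` sorts
    /// Exact < Prefix < Fuzzy.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    enum MatchRank {
        Exact,
        Prefix,
        Fuzzy,
    }
    
    /// Stand-in fuzzy test: every char of `query` appears in `name`, in order.
    fn is_subsequence(query: &str, name: &str) -> bool {
        let mut chars = name.chars();
        query.chars().all(|q| chars.any(|c| c == q))
    }
    
    fn rank(query: &str, name: &str) -> Option<MatchRank> {
        if name == query {
            Some(MatchRank::Exact)
        } else if name.starts_with(query) {
            Some(MatchRank::Prefix)
        } else if is_subsequence(query, name) {
            Some(MatchRank::Fuzzy)
        } else {
            None
        }
    }
    
    /// Keep only matching commands, best rank first. The sort is stable, so
    /// commands with the same rank keep their original order.
    fn sort_commands<'a>(query: &str, names: &[&'a str]) -> Vec<&'a str> {
        let mut matches: Vec<(MatchRank, &'a str)> = names
            .iter()
            .filter_map(|&name| rank(query, name).map(|r| (r, name)))
            .collect();
        matches.sort_by_key(|&(r, _)| r);
        matches.into_iter().map(|(_, name)| name).collect()
    }
    ```
    
    Here `approvals` still matches `ps` as a fuzzy hit, but it now sorts
    after the exact match `ps`.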
  • Add layered config.toml support to app server (#9510)
    This PR adds support for chained (layered) config.toml file merging for
    clients that use the app server interface. This feature already exists
    for the TUI, but it does not work for GUI clients.
    
    It does the following:
    * Changes code paths for new thread, resume thread, and fork thread to
    use the effective config based on the cwd.
    * Updates the `config/read` API to accept an optional `cwd` parameter.
    If specified, the API returns the effective config based on that cwd
    path. Also optionally includes all layers including project config
    files. If cwd is not specified, the API falls back on its older behavior
    where it considers only the global (non-project) config files when
    computing the effective config.
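    
    The layering can be sketched roughly as below. This is only an
    illustration: the flat string map, the `read_config` helper, and the
    assumption that project layers take precedence over the global layer
    are not taken from the PR, and real discovery of project config files
    from the cwd is omitted.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// One config layer, flattened to string keys for brevity (real
    /// config.toml values are structured TOML).
    type Layer = BTreeMap<String, String>;
    
    /// Merge layers from lowest to highest precedence; later layers win.
    fn merge_layers(layers: &[Layer]) -> Layer {
        let mut effective = Layer::new();
        for layer in layers {
            for (key, value) in layer {
                effective.insert(key.clone(), value.clone());
            }
        }
        effective
    }
    
    /// Hypothetical `config/read` behavior: with a cwd, the project layers
    /// (assumed already discovered for that cwd) are merged over the global
    /// layer; without one, only the global layer is considered.
    fn read_config(global: &Layer, project_layers: &[Layer], cwd: Option<&str>) -> Layer {
        match cwd {
            Some(_) => {
                let mut layers = vec![global.clone()];
                layers.extend(project_layers.iter().cloned());
                merge_layers(&layers)
            }
            None => global.clone(),
        }
    }
    ```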
    
    The changes in codex_message_processor.rs look deceptively large. They
    mostly just involve moving existing blocks of code to a later point in
    some functions so it can use the cwd to calculate the config.
    
    This PR builds upon #9509 and should be reviewed and merged after that
    PR.
    
    Tested:
    * Verified change with (dependent, as-yet-uncommitted) changes to IDE
    Extension and confirmed correct behavior
    
    The full fix requires additional changes in the IDE Extension code base,
    but they depend on this PR.
  • Add collaboration_mode to TurnContextItem (#9583)
    ## Summary
    - add optional `collaboration_mode` to `TurnContextItem` in rollouts
    - persist the current collaboration mode when recording turn context
    (sampling + compaction)
    
    ## Rationale
    We already persist turn context data for resume logic. Capturing
    collaboration mode in the rollout gives us the mode context for each
    turn, enabling follow‑up work to diff mode instructions correctly on
    resume.
    
    ## Changes
    - protocol: add optional `collaboration_mode` field to `TurnContextItem`
    - core: persist collaboration mode alongside other turn context settings
    in rollouts
  • Chore: update plan mode output in prompt (#9592)
    ### Summary
    * Update plan prompt output
    * Update requestUserInput response to be a single key value pair
    `answer: String`.
  • Add websockets logging (#9633)
    To help with debugging.