Commit Graph

4066 Commits

  • split-debuginfo (#12871)
    Attempt to reduce disk usage in mac ci.
    
    >off - This is the default for platforms with ELF binaries and
    windows-gnu (not Windows MSVC and not macOS). This typically means that
    DWARF debug information can be found in the final artifact in sections
    of the executable. This option is not supported on Windows MSVC. On
    macOS this option prevents the final execution of dsymutil to generate
    debuginfo.
  • Skip history metadata scan for subagents (#12918)
    Summary
    - Skip `history_metadata` scanning when spawning subagents to avoid
    expensive per-spawn history scans.
    - Keeps behavior unchanged for normal sessions.
    
    Testing
    - `cd codex-rs && cargo test -p codex-core`
    - Failing in this environment (pre-existing, and I don't think caused by
    this change):
      - `suite::cli_stream::responses_mode_stream_cli` (SIGKILL + OTEL export
      error to http://localhost:14318/v1/logs)
      - `suite::grep_files::grep_files_tool_collects_matches` (unsupported
      call: grep_files)
      - `suite::grep_files::grep_files_tool_reports_empty_results`
      (unsupported call: grep_files)
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: memories forgetting (#12900)
    Add diff-based memory forgetting.
  • tui: restore visible line numbers for hidden file links (#12870)
    We recently changed file linking so the model uses markdown links when
    it wants something to be clickable.
    
    This works well across the GUI surfaces because they can render markdown
    cleanly and use the full absolute path in the anchor target.
    
    A previous pass hid the absolute path in the TUI (and only showed the
    label), but that also meant we could lose useful location info when the
    model put the line number or range in the anchor target instead of the
    label.
    
    This follow-up keeps the TUI behavior simple while making local file
    links feel closer to the old TUI file reference style.
    
    key changes:
    - Local markdown file links in the TUI keep the old file-ref feel: code
    styling, no underline, no visible absolute path.
    - If the hidden local anchor target includes a location suffix and the
    label does not already include one, we append that suffix to the visible
    label.
    - This works for single lines, line/column references, and ranges.
    - If the label already includes the location, we leave it alone.
    - normal web links keep the old TUI markdown-link behavior
    
    some examples:
    - `[foo.rs](/abs/path/foo.rs)` renders as `foo.rs`
    - `[foo.rs](/abs/path/foo.rs:45)` renders as `foo.rs:45`
    - `[foo.rs](/abs/path/foo.rs:45:3-48:9)` renders as `foo.rs:45:3-48:9`
    - `[foo.rs:45](/abs/path/foo.rs:45)` stays `foo.rs:45`
    - `[docs](https://example.com/docs)` still renders like a normal web
    link
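    
    The label rule above can be sketched as follows. This is a minimal
    illustration, not the TUI renderer's actual code: `location_suffix` and
    `visible_label` are hypothetical names, and treating any target that
    starts with `/` as a local file link is an assumption.
    
    ```rust
    /// Returns a trailing location suffix such as `:45`, `:45:3`, or
    /// `:45:3-48:9` from the final path component, if one is present.
    fn location_suffix(path: &str) -> Option<&str> {
        // Only inspect the last component so directory names are ignored.
        let name_start = path.rfind('/').map_or(0, |i| i + 1);
        let colon = path[name_start..].find(':')? + name_start;
        let suffix = &path[colon..];
        let rest = &suffix[1..];
        let well_formed = rest.starts_with(|c: char| c.is_ascii_digit())
            && rest.chars().all(|c| c.is_ascii_digit() || c == ':' || c == '-');
        well_formed.then_some(suffix)
    }
    
    /// Computes the visible label for a markdown link (hypothetical helper).
    fn visible_label(label: &str, target: &str) -> String {
        // Assumption: only absolute paths count as local file links.
        if !target.starts_with('/') {
            return label.to_string();
        }
        match (location_suffix(label), location_suffix(target)) {
            // The target carries a location and the label does not: append it.
            (None, Some(suffix)) => format!("{label}{suffix}"),
            // Otherwise leave the label alone.
            _ => label.to_string(),
        }
    }
    ```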
    
    how it looks:
    <img width="732" height="813" alt="Screenshot 2026-02-26 at 9 27 55 AM"
    src="https://github.com/user-attachments/assets/d51bf236-653a-4e83-96e4-9427f0804471"
    />
  • core: bundle settings diff updates into one dev/user envelope (#12417)
    ## Summary
    - bundle contextual prompt injection into at most one developer message
    plus one contextual user message in both:
      - per-turn settings updates
      - initial context insertion
    - preserve `<model_switch>` across compaction by rebuilding it through
    canonical initial-context injection, instead of relying on
    strip/reattach hacks
    - centralize contextual user fragment detection in one shared definition
    table and reuse it for parsing/compaction logic
    - keep `AGENTS.md` in its natural serialized format:
      - `# AGENTS.md instructions for {dirname}`
      - `<INSTRUCTIONS>...</INSTRUCTIONS>`
    - simplify related tests/helpers and accept the expected snapshot/layout
    updates from bundled multi-part messages
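    
    As a rough illustration of the "at most one developer message plus one
    contextual user message" shape, the sketch below groups fragments by
    role into multi-part messages. The `Role`, `Message`, and `bundle` names
    are hypothetical and do not mirror the real `ResponseItem` types.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Role {
        Developer,
        User,
    }
    
    /// One envelope holding every fragment for a role (hypothetical type).
    #[derive(Debug, PartialEq)]
    struct Message {
        role: Role,
        parts: Vec<String>,
    }
    
    /// Bundles fragments into at most one developer and one user message,
    /// preserving the relative order of fragments within each role.
    fn bundle(fragments: &[(Role, &str)]) -> Vec<Message> {
        let mut out = Vec::new();
        for role in [Role::Developer, Role::User] {
            let parts: Vec<String> = fragments
                .iter()
                .filter(|(r, _)| *r == role)
                .map(|(_, text)| text.to_string())
                .collect();
            // Skip a role entirely when it has nothing to say.
            if !parts.is_empty() {
                out.push(Message { role, parts });
            }
        }
        out
    }
    ```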
    
    ## Why
    The goal is to converge toward a simpler, more intentional prompt shape
    where contextual updates are consistently represented as one developer
    envelope plus one contextual user envelope, while keeping parsing and
    compaction behavior aligned with that representation.
    
    ## Notable details
    - the temporary `SettingsUpdateEnvelope` wrapper was removed; these
    paths now return `Vec<ResponseItem>` directly
    - local/remote compaction no longer rely on model-switch strip/restore
    helpers
    - contextual user detection is now driven by shared fragment definitions
    instead of ad hoc matcher assembly
    - AGENTS/user instructions are still the same logical context; only the
    synthetic `<user_instructions>` wrapper was replaced by the natural
    AGENTS text format
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-app-server
    codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
    -- --exact`
    - `cargo test -p codex-core
    compact::tests::collect_user_messages_filters_session_prefix_entries
    --lib -- --exact`
    - `cargo test -p codex-core --test all
    'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_developer_instructions_message_in_request' --
    --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_user_instructions_message_in_request' --
    --exact`
    - `cargo test -p codex-core --test all
    'suite::client::resume_includes_initial_messages_and_sends_prior_items'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::review::review_input_isolated_from_parent_history' -- --exact`
    - `cargo test -p codex-exec --test all
    'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
    --exact`
    - `cargo test -p core_test_support
    context_snapshot::tests::full_text_mode_preserves_unredacted_text --
    --exact`
    
    ## Notes
    - I also ran several targeted `compact`, `compact_remote`,
    `prompt_caching`, `model_visible_layout`, and `event_mapping` tests
    while iterating on prompt-shape changes.
    - I have not claimed a clean full-workspace `cargo test` from this
    environment because local sandbox/resource conditions have previously
    produced unrelated failures in large workspace runs.
  • Enforce user input length cap (#12823)
    Currently there is no bound on the length of a user message submitted in
    the TUI or through the app server interface. That means users can paste
    many megabytes of text, which can lead to bad performance, hangs, and
    crashes. In extreme cases, it can lead to a [kernel
    panic](https://github.com/openai/codex/issues/12323).
    
    This PR limits the length of a user input to 2**20 (about 1M)
    characters. This value was chosen because it fills the entire context
    window on the latest models, so accepting longer inputs wouldn't make
    sense anyway.
    
    Summary
    - add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol
    and surface it in TUI and app server code
    - block oversized submissions in the TUI submit flow and emit error
    history cells when validation fails
    - reject oversized app-server requests with JSON-RPC `-32602` and
    structured `input_too_large` data, plus document the behavior
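    
    A minimal sketch of the validation described above. The constant name
    and the `2**20` value come from this PR; the `InputTooLarge` struct, the
    `validate_user_input` function, and counting Unicode scalar values rather
    than bytes are assumptions made for illustration. `-32602` is the
    JSON-RPC 2.0 "Invalid params" code.
    
    ```rust
    /// Shared cap named in the PR: 2**20 characters.
    const MAX_USER_INPUT_TEXT_CHARS: usize = 1 << 20;
    
    /// JSON-RPC 2.0 "Invalid params" error code.
    const INVALID_PARAMS: i64 = -32602;
    
    /// Hypothetical structured error payload for oversized input.
    #[derive(Debug, PartialEq)]
    struct InputTooLarge {
        code: i64,
        max_chars: usize,
        actual_chars: usize,
    }
    
    /// Rejects input longer than the cap; assumes "characters" means `char`s.
    fn validate_user_input(text: &str) -> Result<(), InputTooLarge> {
        let actual_chars = text.chars().count();
        if actual_chars > MAX_USER_INPUT_TEXT_CHARS {
            return Err(InputTooLarge {
                code: INVALID_PARAMS,
                max_chars: MAX_USER_INPUT_TEXT_CHARS,
                actual_chars,
            });
        }
        Ok(())
    }
    ```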
    
    Testing
    - ran the IDE extension with this change and verified that when I
    attempt to paste a user message that's several MB long, it correctly
    reports an error instead of crashing or making my computer hot.
  • Hide local file link destinations in TUI markdown (#12705)
    ## Summary
    - hide appended destinations for local path-style markdown links in the
    TUI renderer
    - keep web links rendering with their visible destination and style link
    labels consistently
    - add markdown renderer tests and a snapshot for the new file-link
    output
    
    ## Testing
    - just fmt
    - cargo test -p codex-tui
    <img width="1120" height="968" alt="image"
    src="https://github.com/user-attachments/assets/490e8eda-ae47-4231-89fa-b254a1f83eed"
    />
  • Reduce js_repl Node version requirement to 22.22.0 (#12857)
    ## Summary
    
    Lower the `js_repl` minimum Node version from `24.13.1` to `22.22.0`.
    
    This updates the enforced minimum in `codex-rs/node-version.txt` and the
    corresponding user-facing `/experimental` description for the JavaScript
    REPL feature.
    
    ## Rationale
    
    The previous `24.13.1` floor was stricter than necessary for `js_repl`.
    I validated the REPL kernel behavior under Node `22.22.0` still works.
    
    ## Why `22.22.0`
    
    `22.22.0` is a current, widely packaged Node 22 release across common
    developer environments and distros, including Homebrew `node@22`, Fedora
    `nodejs22`, Arch `nodejs-lts-jod`, and Debian testing. That makes it a
    better exact floor than guessing at an older `22.x` patch we have not
    validated.
    
    `22.x` is also a maintenance branch that will be supported through April
    2027, whereas the previous maintenance branch, `20.x`, is only supported
    through April of this year.
    
    ## Changes
    
    - Update `codex-rs/node-version.txt` from `24.13.1` to `22.22.0`
    - Update the `/experimental` JavaScript REPL description to say
    `Requires Node >= v22.22.0 installed.`
  • Skip system skills for extra roots (#12744)
    When extra roots are set, do not load system skills.
  • Try fixing windows pipeline (#12848)
  • Clarify device auth login hint (#12813)
    Mention device auth for remote login
  • Disable js_repl when Node is incompatible at startup (#12824)
    ## Summary
    - validate `js_repl` Node compatibility during session startup when the
    experiment is enabled
    - if Node is missing or too old, disable `js_repl` and
    `js_repl_tools_only` for the session before tools and instructions are
    built
    - surface that startup disablement to users through the existing startup
    warning flow instead of only logging it
    - reuse the same compatibility check in js_repl kernel startup so
    startup gating and runtime behavior stay aligned
    - add a regression test that verifies the warning is emitted and that
    the first advertised tool list omits `js_repl` and `js_repl_reset` when
    Node is incompatible
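    
    The startup gate can be pictured roughly as below. This is a sketch
    under stated assumptions: `parse_version` and `js_repl_enabled` are
    hypothetical names, the minimum is passed in rather than read from
    `node-version.txt`, and the detected version is assumed to look like the
    output of `node --version` (for example `v24.13.1`).
    
    ```rust
    /// Parses `v24.13.1` or `24.13.1` into a comparable tuple.
    fn parse_version(raw: &str) -> Option<(u32, u32, u32)> {
        let mut parts = raw.trim().trim_start_matches('v').split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let patch = parts.next()?.parse().ok()?;
        Some((major, minor, patch))
    }
    
    /// Decides once, at session startup, whether `js_repl` is advertised.
    /// `node_version` is `None` when Node is not installed.
    fn js_repl_enabled(feature_flag: bool, node_version: Option<&str>, minimum: &str) -> bool {
        let Some(min) = parse_version(minimum) else {
            return false;
        };
        match node_version.and_then(parse_version) {
            // Tuples compare lexicographically: major, then minor, then patch.
            Some(found) => feature_flag && found >= min,
            // Missing or unparseable Node disables the tool for the session.
            None => false,
        }
    }
    ```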
    
    ## Why
    Today `js_repl` can be advertised based only on the feature flag, then
    fail later when the kernel starts. That makes the available tool list
    inaccurate at the start of a conversation, and users do not get a clear
    explanation for why the tool is unavailable.
    
    This change makes tool availability reflect real startup checks, keeps
    the advertised tool set stable for the lifetime of the session, and
    gives users a visible warning when `js_repl` is disabled.
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-core --test all
    js_repl_is_not_advertised_when_startup_node_is_incompatible`
  • feat: include available decisions in command approval requests (#12758)
    Command-approval clients currently infer which choices to show from
    side-channel fields like `networkApprovalContext`,
    `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
    the request shape harder to evolve, and it forces each client to
    replicate the server's heuristics instead of receiving the exact
    decision list for the prompt.
    
    This PR introduces a mapping between `CommandExecutionApprovalDecision`
    and `codex_protocol::protocol::ReviewDecision`:
    
    ```rust
    impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
        fn from(value: CoreReviewDecision) -> Self {
            match value {
                CoreReviewDecision::Approved => Self::Accept,
                CoreReviewDecision::ApprovedExecpolicyAmendment {
                    proposed_execpolicy_amendment,
                } => Self::AcceptWithExecpolicyAmendment {
                    execpolicy_amendment: proposed_execpolicy_amendment.into(),
                },
                CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
                CoreReviewDecision::NetworkPolicyAmendment {
                    network_policy_amendment,
                } => Self::ApplyNetworkPolicyAmendment {
                    network_policy_amendment: network_policy_amendment.into(),
                },
                CoreReviewDecision::Abort => Self::Cancel,
                CoreReviewDecision::Denied => Self::Decline,
            }
        }
    }
    ```
    
    And updates `CommandExecutionRequestApprovalParams` to have a new field:
    
    ```rust
    available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
    ```
    
    which, if specified, should make it easier for clients to display an
    appropriate list of options in the UI.
    
    This makes it possible for `CoreShellActionProvider::prompt()` in
    `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
    adding support for `ApprovedForSession` when approving a skill script,
    which was previously missing in the TUI.
    
    Note this results in a significant change to `exec_options()` in
    `approval_overlay.rs`, as the displayed options are now derived from
    `available_decisions: &[ReviewDecision]`.
    
    ## What Changed
    
    - Add `available_decisions` to
    [`ExecApprovalRequestEvent`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/protocol/src/approvals.rs#L111-L175),
    including helpers to derive the legacy default choices when older
    senders omit the field.
    - Map `codex_protocol::protocol::ReviewDecision` to app-server
    `CommandExecutionApprovalDecision` and expose the ordered list as
    experimental `availableDecisions` in
    [`CommandExecutionRequestApprovalParams`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/app-server-protocol/src/protocol/v2.rs#L3798-L3807).
    - Thread optional `available_decisions` through the core approval path
    so Unix shell escalation can explicitly request `ApprovedForSession` for
    session-scoped approvals instead of relying on client heuristics.
    [`unix_escalation.rs`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs#L194-L214)
    - Update the TUI approval overlay to build its buttons from the ordered
    decision list, while preserving the legacy fallback when
    `available_decisions` is missing.
    - Update the app-server README, test client output, and generated schema
    artifacts to document and surface the new field.
    
    ## Testing
    
    - Add `approval_overlay.rs` coverage for explicit decision lists,
    including the generic `ApprovedForSession` path and network approval
    options.
    - Update `chatwidget/tests.rs` and app-server protocol tests to populate
    the new optional field and keep older event shapes working.
    
    ## Developers Docs
    
    - If we document `item/commandExecution/requestApproval` on
    [developers.openai.com/codex](https://developers.openai.com/codex), add
    experimental `availableDecisions` as the preferred source of approval
    choices and note that older servers may omit it.
  • Revert "Add skill approval event/response (#12633)" (#12811)
    This reverts commit https://github.com/openai/codex/pull/12633. We no
    longer need that PR because we now favor sending a normal exec command
    approval server request whose `additional_permissions` carries the skill
    permissions instead.
  • Add macOS and Linux direct install script (#12740)
    ## Summary
    - add a direct install script for macOS and Linux at
    `scripts/install/install.sh`
    - stage `install.sh` into `dist/` during release so it is published as a
    GitHub release asset
    - reuse the existing platform npm payload so the installer includes both
    `codex` and `rg`
    
    ## Testing
    - `bash -n scripts/install/install.sh`
    - local macOS `curl | sh` smoke test against a locally served copy of
    the script
  • Remove steer feature flag (#12026)
    All code paths should now assume that steer is enabled.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: scope execve session approvals by approved skill metadata (#12814)
    Prior to this change, `determine_action()` would
    
    1. check if `program` is associated with a skill
    2. if so, check if `program` is in `execve_session_approvals` to see
    whether the user needs to be prompted
    
    This PR flips the order of these checks to try to set us up so that
    "session approvals" are always consulted first (which should soon extend
    to include session approvals derived from `prefix_rule()`s, as well).
    
    Though to make the new ordering work, we need to record any relevant
    metadata to associate with the approval, which in the case of a
    skill-based approval is the `SkillMetadata` so that we can derive the
    `PermissionProfile` to include with the escalation. (Though as noted by
    the `TODO`, this `PermissionProfile` is not honored yet.)
    
    The new `ExecveSessionApproval` struct is used to retain the necessary
    metadata.
    
    ## What Changed
    
    - Replace the `execve_session_approvals` `HashSet` with a map that
    stores an `ExecveSessionApproval` alongside each approved `program`.
    - When a user chooses `ApprovedForSession` for a skill script, capture
    the matched `SkillMetadata` in the session approval entry.
    - Consult that cache before re-running `find_skill()`, and reuse the
    originally approved skill metadata and permission profile when allowing
    later execve callbacks in the same session.
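    
    The move from a set to a map can be sketched like this. The struct name
    `ExecveSessionApproval` comes from this PR, but its fields, the
    `SkillMetadata` shape, and the `SessionApprovals` and `Action` types are
    hypothetical stand-ins for illustration only.
    
    ```rust
    use std::collections::HashMap;
    use std::path::{Path, PathBuf};
    
    /// Placeholder for the real skill metadata (fields are assumptions).
    #[derive(Debug, Clone, PartialEq)]
    struct SkillMetadata {
        name: String,
    }
    
    /// Metadata retained alongside each session-approved program.
    #[derive(Debug, Clone, PartialEq)]
    struct ExecveSessionApproval {
        skill: Option<SkillMetadata>,
    }
    
    #[derive(Debug, PartialEq)]
    enum Action {
        Allow(ExecveSessionApproval),
        Prompt,
    }
    
    #[derive(Default)]
    struct SessionApprovals {
        approvals: HashMap<PathBuf, ExecveSessionApproval>,
    }
    
    impl SessionApprovals {
        /// Records an `ApprovedForSession` decision with the matched skill.
        fn approve_for_session(&mut self, program: &Path, skill: Option<SkillMetadata>) {
            self.approvals
                .insert(program.to_path_buf(), ExecveSessionApproval { skill });
        }
    
        /// Consults the session cache first; only unknown programs prompt.
        fn determine_action(&self, program: &Path) -> Action {
            match self.approvals.get(program) {
                Some(approval) => Action::Allow(approval.clone()),
                None => Action::Prompt,
            }
        }
    }
    ```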
  • Enable request_user_input in Default mode (#12735)
    ## Summary
    - allow `request_user_input` in Default collaboration mode as well as
    Plan
    - update the Default-mode instructions to prefer assumptions first and
    use `request_user_input` only when a question is unavoidable
    - update request_user_input and app-server tests to match the new
    Default-mode behavior
    - refactor collaboration-mode availability plumbing into
    `CollaborationModesConfig` for future mode-related flags
    
    ## Codex author
    `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`
  • Revert "Ensure shell command skills trigger approval (#12697)" (#12721)
    This reverts commit daf0f03ac8.
    
  • only use preambles for realtime (#12806)
    Co-authored-by: Codex <noreply@openai.com>
  • feat(app-server): thread/unsubscribe API (#10954)
    Adds a new v2 app-server API that lets a client unsubscribe from a
    thread:
    - New RPC method: `thread/unsubscribe`
    - New server notification: `thread/closed`
    
    Today clients can start/resume/archive threads, but there wasn’t a way
    to explicitly unload a live thread from memory without archiving it.
    With `thread/unsubscribe`, a client can indicate it is no longer
    actively working with a live Thread. If this is the only client
    subscribed to that given thread, the thread will be automatically closed
    by app-server, at which point the server will send `thread/closed` and
    `thread/status/changed` with `status: notLoaded` notifications.
    
    This gives clients a way to prevent long-running app-server processes
    from accumulating too many thread (and related) objects in memory.
    
    Closed threads will also be removed from `thread/loaded/list`.
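    
    The subscription bookkeeping described above might look roughly like
    this. It is a sketch, not the app-server implementation: the
    `LoadedThreads` and `Notification` types, string thread ids, and integer
    client ids are all assumptions.
    
    ```rust
    use std::collections::{HashMap, HashSet};
    
    /// Notifications sent when a thread is unloaded (hypothetical shape).
    #[derive(Debug, PartialEq)]
    enum Notification {
        ThreadClosed(String),
        StatusNotLoaded(String),
    }
    
    #[derive(Default)]
    struct LoadedThreads {
        subscribers: HashMap<String, HashSet<u32>>,
    }
    
    impl LoadedThreads {
        fn subscribe(&mut self, thread_id: &str, client: u32) {
            self.subscribers
                .entry(thread_id.to_string())
                .or_default()
                .insert(client);
        }
    
        /// Removes one subscriber; closes the thread if it was the last one.
        fn unsubscribe(&mut self, thread_id: &str, client: u32) -> Vec<Notification> {
            let Some(clients) = self.subscribers.get_mut(thread_id) else {
                return Vec::new();
            };
            clients.remove(&client);
            if !clients.is_empty() {
                return Vec::new();
            }
            // Last subscriber left: unload the thread and notify.
            self.subscribers.remove(thread_id);
            vec![
                Notification::ThreadClosed(thread_id.to_string()),
                Notification::StatusNotLoaded(thread_id.to_string()),
            ]
        }
    
        /// Thread ids that would appear in a loaded-threads listing.
        fn loaded_list(&self) -> Vec<&str> {
            let mut ids: Vec<&str> = self.subscribers.keys().map(String::as_str).collect();
            ids.sort();
            ids
        }
    }
    ```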
  • make 5.3-codex visible in cli for api users (#12808)
    5.3-codex has been released in the API, so mark it visible for API users
    via the bundled `models.json`.
  • fix: harden zsh fork tests and keep subcommand approvals deterministic (#12809)
    ## Why
    The prior
    `turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
    assertion was brittle under Bazel: command approval payloads in the test
    could include environment-dependent wrapper/command formatting
    differences, which makes exact command-string matching flaky even when
    behavior is correct.
    
    (This regression was knowingly introduced in
    https://github.com/openai/codex/pull/12800, but it was urgent to land
    that PR.)
    
    ## What changed
    - Hardened
    `turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
    in
    [`turn_start_zsh_fork.rs`](https://github.com/openai/codex/blob/main/codex-rs/app-server/tests/suite/v2/turn_start_zsh_fork.rs):
      - Replaced strict `approval_command.starts_with("/bin/rm")` checks with
      intent-based subcommand matching.
      - Subcommand approvals are now recognized by file-target semantics
      (`first.txt` or `second.txt`) plus `rm` intent.
      - Parent approval recognition is now more tolerant of command-format
      differences while still requiring a definitive parent command context.
      - Uses a defensive loop that waits for all target subcommand decisions
      and the parent approval request.
    - Preserved the existing regression and unit test fixes from earlier
    commits in `unix_escalation.rs` and `skill_approval.rs`.
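    
    Intent-based matching of this kind can be approximated as below. This is
    only a sketch of the idea, not the test's actual helper: the function
    names are hypothetical, and splitting on whitespace with quote trimming
    is a simplification of real shell parsing.
    
    ```rust
    /// Strips surrounding quotes and any directory prefix from a token.
    fn basename(token: &str) -> &str {
        let unquoted = token.trim_matches(|c| c == '\'' || c == '"');
        unquoted.rsplit('/').next().unwrap_or(unquoted)
    }
    
    /// True when the command mentions `rm` and one of the expected targets,
    /// regardless of wrapper prefixes or the exact path to `rm`.
    fn is_rm_subcommand_approval(command: &str) -> bool {
        let tokens: Vec<&str> = command.split_whitespace().collect();
        let has_rm = tokens.iter().any(|t| basename(t) == "rm");
        let has_target = tokens
            .iter()
            .any(|t| matches!(basename(t), "first.txt" | "second.txt"));
        has_rm && has_target
    }
    ```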
    
    ## Verification
    - Ran the zsh fork subcommand decline regression under this change:
      - `turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
    - Confirmed the test is now robust against approval-command-string
    variation instead of hardcoding one expected command shape.
  • Update Codex docs success link (#12805)
    Fix a stale documentation link in the sign-in flow
  • Add simple realtime text logs (#12807)
    Update realtime debug logs to include the actual text payloads in both
    input and output paths.
    
    - In `core/src/realtime_conversation.rs`:
      - `handle_start`: add extracted assistant text output to the
      `[realtime-text]` debug log.
      - `handle_text`: add incoming text input (`params.text`) to the
      `[realtime-text]` debug log.
    
    No tests were run (per request).
  • feat(app-server): add ThreadItem::DynamicToolCall (#12732)
    Previously, clients would call `thread/start` with dynamic_tools set,
    and when a model invokes a dynamic tool, it would just make the
    server->client `item/tool/call` request and wait for the client's
    response to complete the tool call. This works, but it doesn't have an
    `item/started` or `item/completed` event.
    
    Now we are doing this:
    - [new] emit `item/started` with `DynamicToolCall` populated with the
    call arguments
    - send an `item/tool/call` server request
    - [new] once the client responds, emit `item/completed` with
    `DynamicToolCall` populated with the response.
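    
    The new ordering can be sketched as a simple sequence. The
    `ServerMessage` enum and `run_dynamic_tool_call` function are
    hypothetical, and the client round trip is modeled as a plain closure
    rather than a real server-to-client request.
    
    ```rust
    /// Messages the server sends during one dynamic tool call (sketch).
    #[derive(Debug, PartialEq)]
    enum ServerMessage {
        ItemStarted { arguments: String },
        ToolCallRequest { arguments: String },
        ItemCompleted { response: String },
    }
    
    /// Emits `item/started`, issues the tool call, then emits
    /// `item/completed` once the client has responded.
    fn run_dynamic_tool_call(
        arguments: &str,
        client: impl FnOnce(&str) -> String,
    ) -> Vec<ServerMessage> {
        let mut sent = vec![
            ServerMessage::ItemStarted { arguments: arguments.to_string() },
            ServerMessage::ToolCallRequest { arguments: arguments.to_string() },
        ];
        // Block on the client's answer before reporting completion.
        let response = client(arguments);
        sent.push(ServerMessage::ItemCompleted { response });
        sent
    }
    ```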
    
    Also, with `persistExtendedHistory: true`, dynamic tool calls are now
    reconstructable in `thread/read` and `thread/resume` as
    `ThreadItem::DynamicToolCall`.
  • Propagate session ID when compacting (#12802)
    We propagate the session ID when sending requests for inference but we
    don't do the same for compaction requests. This makes it hard to link
    compaction requests to their session for debugging purposes
  • fix: enforce sandbox envelope for zsh fork execution (#12800)
    ## Why
    Zsh fork execution was still able to bypass the `WorkspaceWrite` model
    in edge cases because the fork path reconstructed command execution
    without preserving sandbox wrappers, and command extraction only
    accepted shell invocations in a narrow positional shape. This can allow
    commands to run with broader filesystem access than expected, which
    breaks the sandbox safety model.
    
    ## What changed
    - Preserved the sandboxed `ExecRequest` produced by
    `attempt.env_for(...)` when entering the zsh fork path in
    [`unix_escalation.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs).
    - Updated `CoreShellCommandExecutor` to execute the sandboxed command
    and working directory captured from `attempt.env_for(...)`, instead of
    re-running a freshly reconstructed shell command.
    - Made zsh-fork script extraction robust to wrapped invocations by
    scanning command arguments for `-c`/`-lc` rather than only matching the
    first positional form.
    - Added unit tests in `unix_escalation.rs` to lock in wrapper-tolerant
    parsing behavior and keep unsupported shell forms rejected.
    - Tightened the regression in
    [`skill_approval.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/tests/suite/skill_approval.rs):
      - `shell_zsh_fork_still_enforces_workspace_write_sandbox` now uses an
      explicit `WorkspaceWrite` policy with `exclude_tmpdir_env_var: true` and
      `exclude_slash_tmp: true`.
      - The test attempts to write to `/tmp/...`, which is only reliably
      outside writable roots with those explicit exclusions set.
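    
    Wrapper-tolerant extraction can be illustrated as follows. This is a
    sketch of the scanning idea only; `find_inline_script` is a hypothetical
    name and signature (the PR's tests refer to `extract_shell_script`), and
    the list of recognized shells is an assumption.
    
    ```rust
    /// True when the program's final path component is a known shell.
    fn is_shell(program: &str) -> bool {
        matches!(program.rsplit('/').next(), Some("zsh" | "bash" | "sh"))
    }
    
    /// Scans the whole argument list for `<shell> -c|-lc <script>` instead
    /// of assuming the shell is the first argument.
    fn find_inline_script<'a>(args: &[&'a str]) -> Option<&'a str> {
        args.windows(3).find_map(|w| {
            (is_shell(w[0]) && matches!(w[1], "-c" | "-lc")).then_some(w[2])
        })
    }
    ```
    
    Requiring a shell immediately before the flag keeps forms such as
    `python3 -c ...` rejected while still allowing arbitrary wrapper
    prefixes in front of the shell.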
    
    ## Verification
    - Added and passed the new unit tests around `extract_shell_script`
    parsing behavior with wrapped command shapes.
      - `extract_shell_script_supports_wrapped_command_prefixes`
      - `extract_shell_script_rejects_unsupported_shell_invocation`
    - Verified the regression with the focused integration test:
    `shell_zsh_fork_still_enforces_workspace_write_sandbox`.
    
    ## Manual Testing
    
    Prior to this change, if I ran Codex via:
    
    ```
    just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork
    ```
    
    and asked:
    
    ```
    what is the output of /bin/ps
    ```
    
    it would run it, even though the default sandbox should prevent the
    agent from running `/bin/ps` because it is setuid on macOS.
    
    But with this change, I now see the expected failure because it is
    blocked by the sandbox:
    
    ```
    /bin/ps exited with status 1 and produced no output in this environment.
    ```
  • Handle websocket timeout (#12791)
    Websockets sometimes time out with a 400 error; ensure we retry when
    that happens.
  • Add app-server v2 thread realtime API (#12715)
    Add experimental `thread/realtime/*` v2 requests and notifications, then
    route app-server realtime events through that thread-scoped surface with
    integration coverage.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Promote js_repl to experimental with Node requirement (#12712)
    ## Summary
    
    - Promote `js_repl` to an experimental feature that users can enable
    from `/experimental`.
    - Add `js_repl` experimental metadata, including the Node prerequisite
    and activation guidance.
    - Add regression coverage for the feature metadata and the
    `/experimental` popup.
    
    ## What Changed
    
    - Changed `Feature::JsRepl` from `Stage::UnderDevelopment` to
    `Stage::Experimental`.
    - Added experimental metadata for `js_repl` in `core/src/features.rs`:
      - name: `JavaScript REPL`
      - description: calls out interactive website debugging, inline
      JavaScript execution, and the required Node version (`>= v24.13.1`)
      - announcement: tells users to enable it, then start a new chat or
      restart Codex
    - Added a core unit test that verifies:
      - `js_repl` is experimental
      - `js_repl` is disabled by default
      - the hardcoded Node version in the description matches
      `node-version.txt`
    - Added a TUI test that opens the `/experimental` popup and verifies the
    rendered `js_repl` entry includes the Node requirement text.
    
    ## Testing
    
    - `just fmt`
    - `cargo test -p codex-tui`
    - `cargo test -p codex-core` (unit-test phase passed; stopped during the
    long `tests/all.rs` integration suite)
  • feat(network-proxy): add embedded OTEL policy audit logging (#12046)
    **PR Summary**
    
    This PR adds embedded-only OTEL policy audit logging for
    `codex-network-proxy` and threads audit metadata from `codex-core` into
    managed proxy startup.
    
    ### What changed
    - Added structured audit event emission in `network_policy.rs` with
    target `codex_otel.network_proxy`.
    - Emitted:
      - `codex.network_proxy.domain_policy_decision` once per domain-policy
      evaluation.
      - `codex.network_proxy.block_decision` for non-domain denies.
    - Added required policy/network fields, RFC3339 UTC millisecond
    `event.timestamp`, and fallback defaults (`http.request.method="none"`,
    `client.address="unknown"`).
    - Added non-domain deny audit emission in HTTP/SOCKS handlers for
    mode-guard and proxy-state denies, including unix-socket deny paths.
    - Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported
    unix-socket auditing.
    - Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from
    `lib.rs` and `state.rs`.
    - Added `start_proxy_with_audit_metadata(...)` in core config, with
    `start_proxy()` delegating to default metadata.
    - Wired metadata construction in `codex.rs` from session/auth context,
    including originator sanitization for OTEL-safe tagging.
    - Updated `network-proxy/README.md` with embedded-mode audit schema and
    behavior notes.
    - Refactored HTTP block-audit emission to a small local helper to reduce
    duplication.
    - Preserved existing unix-socket proxy-disabled host/path behavior for
    responses and blocked history while using an audit-only endpoint
    override (`server.address="unix-socket"`, `server.port=0`).
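    
    The fallback defaults mentioned above can be sketched as below. The two
    attribute names and their default values come from this PR; the
    `audit_fields` function and its return shape are hypothetical, and the
    remaining audit fields are omitted.
    
    ```rust
    /// Fills in the documented fallbacks when request details are missing.
    fn audit_fields(
        method: Option<&str>,
        client_address: Option<&str>,
    ) -> Vec<(&'static str, String)> {
        vec![
            ("http.request.method", method.unwrap_or("none").to_string()),
            ("client.address", client_address.unwrap_or("unknown").to_string()),
        ]
    }
    ```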
    
    ### Explicit exclusions
    - No standalone proxy OTEL startup work.
    - No `main.rs` binary wiring.
    - No `standalone_otel.rs`.
    - No standalone docs/tests.
    
    ### Tests
    - Extended `network_policy.rs` tests for event mapping, metadata
    propagation, fallbacks, timestamp format, and target prefix.
    - Extended HTTP tests to assert unix-socket deny block audit events.
    - Extended SOCKS tests to cover deny emission from handler deny
    branches.
    - Added/updated core tests to verify audit metadata threading into
    managed proxy state.
    
    ### Validation run
    - `just fmt`
    - `cargo test -p codex-network-proxy` 
    - `cargo test -p codex-core` ran with one unrelated flaky timeout
    (`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and
    the test passed when rerun directly 
    
    ---------
    
    Co-authored-by: viyatb-oai <viyatb@openai.com>
  • otel: add host.name resource attribute to logs/traces via gethostname (#12352)
    **PR Summary**
    
    This PR adds the OpenTelemetry `host.name` resource attribute to Codex
    OTEL exports so every OTEL log (and trace, via the shared resource)
    carries the machine hostname.
    
    **What changed**
    
    - Added `host.name` to the shared OTEL `Resource` in
    `codex-rs/otel/src/otel_provider.rs`
      - This applies to both:
        - OTEL logs (`SdkLoggerProvider`)
        - OTEL traces (`SdkTracerProvider`)
    - Hostname is now resolved via `gethostname::gethostname()`
    (best-effort)
      - Value is trimmed
      - Empty values are omitted (non-fatal)
    - Added focused unit tests for:
      - including `host.name` when present
      - omitting `host.name` when missing/empty
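    
    The trim-and-omit rule can be sketched as below. `host_name_attribute`
    is a hypothetical helper that takes the raw hostname as a string, so the
    sketch does not depend on the `gethostname` crate or the OTEL SDK types.
    
    ```rust
    /// Returns the `host.name` attribute, or `None` when the hostname is
    /// empty after trimming (the attribute is then simply omitted).
    fn host_name_attribute(raw: &str) -> Option<(&'static str, String)> {
        let trimmed = raw.trim();
        (!trimmed.is_empty()).then(|| ("host.name", trimmed.to_string()))
    }
    ```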
    
    **Why**
    
    - `host.name` is host/process metadata and belongs on the OTEL
    `resource`, not per-event attributes.
    - Attaching it in the shared resource is the smallest change that
    guarantees coverage across all exported OTEL logs/traces.
    
    **Scope / Non-goals**
    
    - No public API changes
    - No changes to metrics behavior (this PR only updates log/trace
    resource metadata)
    
    **Dependency updates**
    
    - Added `gethostname` as a workspace dependency and `codex-otel`
    dependency
    - `Cargo.lock` updated accordingly
    - `MODULE.bazel.lock` unchanged after refresh/check
    
    **Validation**
    
    - `just fmt`
    - `cargo test -p codex-otel`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • feat: adding stream parser (#12666)
    Add a stream parser to extract citations (and other markers) from a
    stream. This supports cases where markers are split across different
    tokens.
    
    Codex never managed to make this code work, so everything was done
    manually. Please review carefully, and do not touch this part of the
    code without a very clear understanding of it.
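    
    For readers unfamiliar with the split-marker problem, here is a small,
    independent sketch. It is not the parser added in this PR: the `<cite>`
    and `</cite>` markers, the `StreamParser` type, and its fields are all
    hypothetical. The idea shown is to hold back any trailing text that
    could still turn out to be the start of a marker.
    
    ```rust
    const OPEN: &str = "<cite>"; // hypothetical opening marker
    const CLOSE: &str = "</cite>"; // hypothetical closing marker
    
    #[derive(Default)]
    struct StreamParser {
        pending: String,        // text not yet classified
        in_citation: bool,      // true between OPEN and CLOSE
        text: String,           // visible text emitted so far
        citations: Vec<String>, // extracted citation bodies
    }
    
    /// Length of the longest suffix of `buf` that is a proper prefix of
    /// `marker`, i.e. text that may still become a marker.
    fn held_back(buf: &str, marker: &str) -> usize {
        (1..marker.len())
            .rev()
            .find(|&n| buf.ends_with(&marker[..n]))
            .unwrap_or(0)
    }
    
    impl StreamParser {
        fn push(&mut self, chunk: &str) {
            self.pending.push_str(chunk);
            loop {
                let marker = if self.in_citation { CLOSE } else { OPEN };
                let Some(pos) = self.pending.find(marker) else { break };
                let before: String = self.pending.drain(..pos).collect();
                self.pending.drain(..marker.len());
                if self.in_citation {
                    self.citations.push(before);
                } else {
                    self.text.push_str(&before);
                }
                self.in_citation = !self.in_citation;
            }
            // Outside a citation, flush everything except a possible
            // partial OPEN marker; inside one, keep buffering until CLOSE.
            if !self.in_citation {
                let cut = self.pending.len() - held_back(&self.pending, OPEN);
                let flushed: String = self.pending.drain(..cut).collect();
                self.text.push_str(&flushed);
            }
        }
    }
    ```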