Commit Graph

1727 Commits

  • turn metadata followups (#11782)
    some trivial simplifications from #11677
  • support app usage analytics (#11687)
    Emit app mentioned and app used events. Dedup by (turn_id, connector_id)
    
    Example event params:
    {
        "event_type": "codex_app_used",
        "connector_id": "asdk_app_xxx",
        "thread_id": "019c5527-36d4-xxx",
        "turn_id": "019c552c-cd17-xxx",
        "app_name": "Slack (OpenAI Internal)",
        "product_client_id": "codex_cli_rs",
        "invoke_type": "explicit",
        "model_slug": "gpt-5.3-codex"
    }
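    
    A minimal sketch of the dedup rule above, assuming an in-memory set keyed
    by `(turn_id, connector_id)`. `AppUsedDeduper` and `should_emit` are
    hypothetical names for illustration, not identifiers from this PR.
    
    ```rust
    use std::collections::HashSet;
    
    /// Hypothetical tracker: remembers which (turn_id, connector_id) pairs
    /// have already produced an event.
    #[derive(Default)]
    pub struct AppUsedDeduper {
        seen: HashSet<(String, String)>,
    }
    
    impl AppUsedDeduper {
        /// Returns true the first time a pair is seen and false on repeats,
        /// because `HashSet::insert` reports whether the value was newly added.
        pub fn should_emit(&mut self, turn_id: &str, connector_id: &str) -> bool {
            self.seen
                .insert((turn_id.to_string(), connector_id.to_string()))
        }
    }
    ```
    
    With this shape, a second use of the same connector in the same turn is
    suppressed, while the same connector in a later turn still emits.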
  • Add js_repl kernel crash diagnostics (#11666)
    ## Summary
    
    This PR improves `js_repl` crash diagnostics so kernel failures are
    debuggable without weakening timeout/reset guarantees.
    
    ## What Changed
    
    - Added bounded kernel stderr capture and truncation logic (line + byte
    caps).
    - Added structured kernel snapshots (`pid`, exit status, stderr tail)
    for failure paths.
    - Enriched model-visible kernel-failure errors with a structured
    diagnostics payload:
      - `js_repl diagnostics: {...}`
      - Included only for likely kernel-failure write/EOF cases.
    - Improved logging around kernel write failures, unexpected exits, and
    kill/wait paths.
    - Added/updated unit tests for:
      - UTF-8-safe truncation
      - stderr tail bounds
      - structured diagnostics shape/truncation
      - conditional diagnostics emission
      - timeout kill behavior
      - forced kernel-failure diagnostics
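    
    The bounded capture described above could look roughly like the sketch
    below. The function names and the order of the caps (lines first, then
    bytes) are assumptions for illustration; the actual logic lives in
    `js_repl/mod.rs` and may differ.
    
    ```rust
    /// Truncate `s` to at most `max_bytes` bytes without splitting a UTF-8
    /// character: back up from the cap until we land on a char boundary.
    pub fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
        if s.len() <= max_bytes {
            return s;
        }
        let mut end = max_bytes;
        while !s.is_char_boundary(end) {
            end -= 1; // index 0 is always a boundary, so this terminates
        }
        &s[..end]
    }
    
    /// Keep only the last `max_lines` lines of stderr, then apply the byte cap.
    pub fn stderr_tail(stderr: &str, max_lines: usize, max_bytes: usize) -> String {
        let lines: Vec<&str> = stderr.lines().collect();
        let start = lines.len().saturating_sub(max_lines);
        let tail = lines[start..].join("\n");
        truncate_utf8(&tail, max_bytes).to_string()
    }
    ```
    
    Slicing a `&str` at a non-boundary index panics in Rust, which is why the
    byte cap has to be rounded down to a character boundary first.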
    
    ## Why
    
    Before this, failures like broken pipe / unexpected kernel exit often
    surfaced as generic errors with little context. This change preserves
    existing behavior but adds actionable diagnostics while keeping output
    bounded.
    
    ## Scope
    
    - Code changes are limited to:
      - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`
    
    ## Validation
    
    - `cargo clippy -p codex-core --all-targets -- -D warnings`
    - Targeted `codex-core` js_repl unit tests (including new
    diagnostics/timeout coverage)
    - Tried starting a long running js_repl command (sleep for 10 minutes),
    verified error output was as expected after killing the node process.
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/11666
    -  `2` https://github.com/openai/codex/pull/10673
    -  `3` https://github.com/openai/codex/pull/10670
  • Update read_path prompt (#11763)
    ## Summary
    
    - Created branch zuxin/read-path-update from main.
    - Copied codex-rs/core/templates/memories/read_path.md from the current
    branch.
    - Committed the content change.
    
    ## Testing
    Not run (content copy + commit only).
  • Report syntax errors in rules file (#11686)
    Currently, if there are syntax errors detected in the starlark rules
    file, the entire policy is silently ignored by the CLI. The app server
    correctly emits a message that can be displayed in a GUI.
    
    This PR changes the CLI (both the TUI and non-interactive exec) to fail
    when the rules file can't be parsed. It then prints out an error message
    and exits with a non-zero exit code. This is consistent with the
    handling of errors in the config file.
    
    This addresses #11603
  • feat(tui): prevent macOS idle sleep while turns run (#11711)
    ## Summary
    - add a shared `codex-core` sleep inhibitor that uses native macOS IOKit
    assertions (`IOPMAssertionCreateWithName` / `IOPMAssertionRelease`)
    instead of spawning `caffeinate`
    - wire sleep inhibition to turn lifecycle in `tui` (`TurnStarted`
    enables; `TurnComplete` and abort/error finalization disable)
    - gate this behavior behind a `/experimental` feature toggle
    (`[features].prevent_idle_sleep`) instead of a dedicated `[tui]` config
    flag
    - expose the toggle in `/experimental` on macOS; keep it under
    development on other platforms
    - keep behavior no-op on non-macOS targets
    
    
    ## Testing
    - `cargo check -p codex-core -p codex-tui`
    - `cargo test -p codex-core sleep_inhibitor::tests -- --nocapture`
    - `cargo test -p codex-core
    tui_config_missing_notifications_field_defaults_to_enabled --
    --nocapture`
    - `cargo test -p codex-core prevent_idle_sleep_is_ -- --nocapture`
    
    ## Semantics and API references
    - This PR targets `caffeinate -i` semantics: prevent *idle system sleep*
    while allowing display idle sleep.
    - `caffeinate -i` mapping in Apple open source (`assertionMap`):
      - `kIdleAssertionFlag -> kIOPMAssertionTypePreventUserIdleSystemSleep`
      - Source: https://github.com/apple-oss-distributions/PowerManagement/blob/PowerManagement-1846.60.12/caffeinate/caffeinate.c#L52-L54
    - Apple IOKit docs for assertion types and API:
      - https://developer.apple.com/documentation/iokit/iopmlib_h/iopmassertiontypes
      - https://developer.apple.com/documentation/iokit/1557092-iopmassertioncreatewithname
      - https://developer.apple.com/library/archive/qa/qa1340/_index.html
    
    ## Codex Electron vs this PR (full stack path)
    - Codex Electron app requests sleep blocking with
    `powerSaveBlocker.start("prevent-app-suspension")`:
      - https://github.com/openai/codex/blob/main/codex/codex-vscode/electron/src/electron-message-handler.ts
    - Electron maps that string to Chromium wake lock type
    `kPreventAppSuspension`:
      - https://github.com/electron/electron/blob/main/shell/browser/api/electron_api_power_save_blocker.cc
    - Chromium macOS backend maps wake lock types to IOKit assertion
    constants and calls IOKit:
      - `kPreventAppSuspension -> kIOPMAssertionTypeNoIdleSleep`
      - `kPreventDisplaySleep / kPreventDisplaySleepAllowDimming ->
      kIOPMAssertionTypeNoDisplaySleep`
      - https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_mac.cc
    
    ## Why this PR uses a different macOS constant name
    - This PR uses `"PreventUserIdleSystemSleep"` directly, via
    `IOPMAssertionCreateWithName`, in
    `codex-rs/core/src/sleep_inhibitor.rs`.
    - Apple’s IOKit header documents `kIOPMAssertionTypeNoIdleSleep` as
    deprecated and recommends `kIOPMAssertPreventUserIdleSystemSleep` /
    `kIOPMAssertionTypePreventUserIdleSystemSleep`:
      - https://github.com/apple-oss-distributions/IOKitUser/blob/IOKitUser-100222.60.2/pwr_mgt.subproj/IOPMLib.h#L1000-L1030
    - So Chromium and this PR are using different constant names, but
    semantically equivalent idle-system-sleep prevention behavior.
    
    ## Future platform support
    The architecture is intentionally set up for multi-platform extensions:
    - UI code (`tui`) only calls `SleepInhibitor::set_turn_running(...)` on
    turn lifecycle boundaries.
    - Platform-specific behavior is isolated in
    `codex-rs/core/src/sleep_inhibitor.rs` behind `cfg(...)` blocks.
    - Feature exposure is centralized in `core/src/features.rs` and surfaced
    via `/experimental`.
    - Adding new OS backends should not require additional TUI wiring; only
    the backend internals and feature stage metadata need to change.
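    
    As a rough illustration of that separation, the sketch below keeps the
    turn-lifecycle entry point (`set_turn_running`) independent of the
    platform backend. The `SleepBackend` trait, `NoopBackend`, and the
    `enabled` flag are hypothetical simplifications; the PR itself isolates
    platform code with `cfg(...)` blocks rather than a trait.
    
    ```rust
    /// Hypothetical backend interface standing in for the per-OS code.
    pub trait SleepBackend {
        fn acquire(&mut self);
        fn release(&mut self);
    }
    
    /// No-op backend, as used on platforms without an implementation.
    #[derive(Default)]
    pub struct NoopBackend;
    
    impl SleepBackend for NoopBackend {
        fn acquire(&mut self) {}
        fn release(&mut self) {}
    }
    
    pub struct SleepInhibitor<B: SleepBackend> {
        backend: B,
        enabled: bool, // assumed to mirror the feature toggle
        active: bool,  // whether an assertion is currently held
    }
    
    impl<B: SleepBackend> SleepInhibitor<B> {
        pub fn new(backend: B, enabled: bool) -> Self {
            Self { backend, enabled, active: false }
        }
    
        /// Called on turn lifecycle boundaries. Idempotent: the backend is
        /// only touched when the desired state actually changes.
        pub fn set_turn_running(&mut self, running: bool) {
            let want = self.enabled && running;
            if want == self.active {
                return;
            }
            if want {
                self.backend.acquire();
            } else {
                self.backend.release();
            }
            self.active = want;
        }
    
        pub fn is_active(&self) -> bool {
            self.active
        }
    }
    ```
    
    The caller only reports whether a turn is running, so adding a new
    backend in this sketch means implementing `acquire` and `release` and
    nothing else.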
    
    Potential follow-up implementations:
    - Windows:
      - Add a backend using Win32 power APIs
      (`SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)` as
      baseline).
      - Optionally move to `PowerCreateRequest` / `PowerSetRequest` /
      `PowerClearRequest` for richer assertion semantics.
    - Linux:
      - Add a backend using logind inhibitors over D-Bus
      (`org.freedesktop.login1.Manager.Inhibit` with `what="sleep"`).
      - Keep a no-op fallback where logind/D-Bus is unavailable.
    
    This PR keeps the cross-platform API surface minimal so future PRs can
    add Windows/Linux support incrementally with low churn.
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • fix: reduce flakiness of compact_resume_after_second_compaction_preserves_history (#11663)
    ## Why
    `compact_resume_after_second_compaction_preserves_history` has been
    intermittently flaky in Windows CI.
    
    The test had two one-shot request matchers in the second compact/resume
    phase that could overlap, and it waited for the first `Warning` event
    after compaction. In practice, that made the test sensitive to
    platform/config-specific prompt shape and unrelated warning timing.
    
    ## What Changed
    - Hardened the second compaction matcher in
    `codex-rs/core/tests/suite/compact_resume_fork.rs` so it accepts
    expected compact-request variants while explicitly excluding the
    `AFTER_SECOND_RESUME` payload.
    - Updated `compact_conversation()` to wait for the specific compaction
    warning (`COMPACT_WARNING_MESSAGE`) rather than any `Warning` event.
    - Added an inline comment explaining why the matcher is intentionally
    broad but disjoint from the follow-up resume matcher.
    
    ## Test Plan
    - `cargo test -p codex-core --test all
    suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
    -- --exact`
    - Repeated the same test in a loop (40 runs) to check for local
    nondeterminism.
  • core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669)
    ## Summary
    - Limit `search_tool_bm25` indexing to `codex_apps` tools only, so
    non-Apps MCP servers are no longer discoverable through this search
    path.
    - Move search-tool discovery guidance into the `search_tool_bm25` tool
    description (via template include) instead of injecting it as a separate
    developer message.
    - Update Apps discovery guidance wording to clarify when to use
    `search_tool_bm25` for Apps-backed systems (for example Slack, Google
    Drive, Jira, Notion) and when to call tools directly.
    - Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and
    `codex_apps_connector_id`) that is no longer used after the
    tool-selection refactor.
    - Update `core` search-tool tests to assert codex-apps-only behavior and
    to validate guidance from the tool description.
    
    ## Validation
    -  `just fmt`
    -  `cargo test -p codex-core search_tool`
    - ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly
    stalled on
    `tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`.
    
    ## Tickets
    - None
  • Fix memories output schema requirements (#11748)
    Summary
    - make the phase1 memories schema require `rollout_slug` while still
    allowing it to be `null`
    - update the corresponding test to check the required fields and
    nullable type list
    
    Testing
    - Not run (not requested)
  • feat: add token usage on memories (#11618)
    Add aggregated token usage metrics on phase 1 of memories
  • chore(core) Restrict model-suggested rules (#11671)
    ## Summary
    If the model suggests a bad rule, don't show it to the user. This does
    not impact the parsing of existing rules, just the ones we show.
    
    ## Testing
    - [x] Added unit tests
    - [x] Ran locally
  • [apps] Fix app loading logic. (#11518)
    When `app/list` is called with `force_refetch=True`, we should seed the
    results with what is already cached instead of starting from an empty
    list. Otherwise when we send app/list/updated events, the client will
    first see an empty list of accessible apps and then get the updated one.
  • chore(approvals) More approvals scenarios (#11660)
    ## Summary
    Add some additional tests to approvals flow
    
    ## Testing
    - [x] these are tests
  • Added a test to verify that feature flags that are enabled by default are stable (#11275)
    We've had a few cases recently where someone enabled a feature flag for
    a feature that's still under development or experimental. This test
    should prevent this.
  • Remove git commands from dangerous command checks (#11510)
    ### Motivation
    
    - Git subcommand matching was being classified as "dangerous" and caused
    benign developer workflows (for example `git push --force-with-lease`)
    to be blocked by the preflight policy.
    - The change aligns behavior with the intent to reserve the dangerous
    checklist for truly destructive shell ops (e.g. `rm -rf`) and avoid
    surprising developer-facing blocks.
    
    ### Description
    
    - Remove git-specific subcommand checks from
    `is_dangerous_to_call_with_exec` in
    `codex-rs/shell-command/src/command_safety/is_dangerous_command.rs`,
    leaving only explicit `rm` and `sudo` passthrough checks.
    - Deleted the git-specific helper logic that classified `reset`,
    `branch`-delete, `push` (force/delete/refspec) and `clean --force` as
    dangerous.
    - Updated unit tests in the same file to assert that various `git
    reset`/`git branch`/`git push`/`git clean` variants are no longer
    classified as dangerous.
    - Kept `find_git_subcommand` (used by safe-command classification)
    intact so safe/unsafe parsing elsewhere remains functional.
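    
    After this change the check reduces to something like the sketch below.
    This is a simplified illustration, not the code in
    `is_dangerous_command.rs`: the function name, the `&[&str]` signature,
    and the exact `rm` flags that are matched are assumptions.
    
    ```rust
    /// Sketch: only forced `rm` is flagged, and `sudo` is unwrapped so the
    /// wrapped command is checked instead. Git subcommands are never flagged.
    pub fn is_dangerous(command: &[&str]) -> bool {
        match command.first().copied() {
            Some("rm") => matches!(command.get(1).copied(), Some("-f" | "-rf")),
            Some("sudo") => is_dangerous(&command[1..]),
            _ => false,
        }
    }
    ```
    
    Under these assumptions `git push --force-with-lease` and
    `git reset --hard` both pass, while `sudo rm -rf /` is still flagged.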
    
    ### Testing
    
    - Ran formatter with `just fmt` successfully.  
    - Ran unit tests with `cargo test -p codex-shell-command` and all tests
    passed (`144 passed; 0 failed`).
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_698d19dedb4883299c3ceb5bbc6a0dcf)
  • Persist complete TurnContextItem state via canonical conversion (#11656)
    ## Summary
    
    This PR delivers the first small, shippable step toward model-visible
    state diffing by making
    `TurnContextItem` more complete and standardizing how it is built.
    
    Specifically, it:
    - Adds persisted network context to `TurnContextItem`.
    - Introduces a single canonical `TurnContext -> TurnContextItem`
    conversion path.
    - Routes existing rollout write sites through that canonical conversion
    helper.
    
    No context injection/diff behavior changes are included in this PR.
    
    ## Why this change
    
    The design goal is to make `TurnContextItem` the canonical source of
    truth for context-diff
    decisions.
    Before this PR:
    - `TurnContextItem` did not include all TurnContext-derived environment
    inputs needed for v1
    completeness.
    - Construction was duplicated at multiple write sites.
    
    This PR addresses both with a minimal, reviewable change.
    
    ## Changes
    
    ### 1) Extend `TurnContextItem` with network state
    - Added `TurnContextNetworkItem { allowed_domains, denied_domains }`.
    - Added `network: Option<TurnContextNetworkItem>` to `TurnContextItem`.
    - Kept backward compatibility by making the new field optional and
    skipped when absent.
    
    Files:
    - `codex-rs/protocol/src/protocol.rs`
    
    ### 2) Canonical conversion helper
    - Added `TurnContext::to_turn_context_item(collaboration_mode)` in core.
    - Added internal helper to derive network fields from
    `config_layer_stack.requirements().network`.
    
    Files:
    - `codex-rs/core/src/codex.rs`
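    
    A simplified sketch of the canonical conversion idea. The real
    `TurnContextItem` has more fields and the real helper takes a
    `collaboration_mode` argument; here `cwd`, `NetworkRequirements`, and
    `network_requirements` are hypothetical stand-ins, and serialization
    attributes are omitted.
    
    ```rust
    #[derive(Debug, Clone, PartialEq)]
    pub struct TurnContextNetworkItem {
        pub allowed_domains: Vec<String>,
        pub denied_domains: Vec<String>,
    }
    
    #[derive(Debug, Clone, PartialEq)]
    pub struct TurnContextItem {
        pub cwd: String, // stand-in for the existing fields
        pub network: Option<TurnContextNetworkItem>,
    }
    
    /// Hypothetical stand-in for the network section of the requirements.
    pub struct NetworkRequirements {
        pub allowed_domains: Vec<String>,
        pub denied_domains: Vec<String>,
    }
    
    pub struct TurnContext {
        pub cwd: String,
        pub network_requirements: Option<NetworkRequirements>,
    }
    
    impl TurnContext {
        /// Single place where a persisted item is built, so every write
        /// site records the same fields.
        pub fn to_turn_context_item(&self) -> TurnContextItem {
            TurnContextItem {
                cwd: self.cwd.clone(),
                network: self.network_requirements.as_ref().map(|req| {
                    TurnContextNetworkItem {
                        allowed_domains: req.allowed_domains.clone(),
                        denied_domains: req.denied_domains.clone(),
                    }
                }),
            }
        }
    }
    ```
    
    Keeping `network` as an `Option` is what lets contexts without network
    requirements produce an item with no network data at all.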
    
    ### 3) Use canonical conversion at rollout write sites
    - Replaced ad hoc `TurnContextItem { ... }` construction with
    `to_turn_context_item(...)` in:
      - sampling request path
      - compaction path
    
    Files:
    - `codex-rs/core/src/codex.rs`
    - `codex-rs/core/src/compact.rs`
    
    ### 4) Update fixtures/tests for new optional field
    - Updated existing `TurnContextItem` literals in tests to include
    `network: None`.
    - Added protocol tests for:
      - deserializing old payloads with no `network`
      - serializing when `network` is present
    
    Files:
    - `codex-rs/core/tests/suite/resume_warning.rs`
    - No replay/diff logic changes.
    - Persisted rollout `TurnContextItem` now carries additional network
    context when available.
    - Older rollout lines without `network` remain readable.
  • Add new apps_mcp_gateway (#11630)
    Adds a new apps_mcp_gateway flag to route Apps MCP calls through
    https://api.openai.com/v1/connectors/mcp/ when enabled, while keeping
    legacy MCP routing as default.
  • [apps] Add is_enabled to app info. (#11417)
    - [x] Add is_enabled to app info and the response of `app/list`.
    - [x] Update TUI to have Enable/Disable button on the app detail page.
  • Add js_repl_tools_only model and routing restrictions (#10671)
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    - 👉 `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • [feat] add seatbelt permission files (#11639)
    Add a seatbelt permission extension abstraction in the form of permission
    files for seatbelt profiles. This should complement our current sandbox
    policy.
  • feat: introduce Permissions (#11633)
    ## Why
    We currently carry multiple permission-related concepts directly on
    `Config` for shell/unified-exec behavior (`approval_policy`,
    `sandbox_policy`, `network`, `shell_environment_policy`,
    `windows_sandbox_mode`).
    
    Consolidating these into one in-memory struct makes permission handling
    easier to reason about and sets up the next step: supporting named
    permission profiles (`[permissions.PROFILE_NAME]`) without changing
    behavior now.
    
    This change is mostly mechanical: it updates existing callsites to go
    through `config.permissions`, but it does not yet refactor those
    callsites to take a single `Permissions` value in places where multiple
    permission fields are still threaded separately.
    
    This PR intentionally **does not** change the on-disk `config.toml`
    format yet and keeps compatibility with legacy config keys.
    
    ## What Changed
    - Introduced `Permissions` in `core/src/config/mod.rs`.
    - Added `Config::permissions` and moved effective runtime permission
    fields under it:
      - `approval_policy`
      - `sandbox_policy`
      - `network`
      - `shell_environment_policy`
      - `windows_sandbox_mode`
    - Updated config loading/building so these effective values are still
    derived from the same existing config inputs and constraints.
    - Updated Windows sandbox helpers/resolution to read/write via
    `permissions`.
    - Threaded the new field through all permission consumers across core
    runtime, app-server, CLI/exec, TUI, and sandbox summary code.
    - Updated affected tests to reference `config.permissions.*`.
    - Renamed the struct/field from
    `EffectivePermissions`/`effective_permissions` to
    `Permissions`/`permissions` and aligned variable naming accordingly.
    
    ## Verification
    - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
    -p codex-exec -p codex-utils-sandbox-summary`
    - `cargo build -p codex-core -p codex-tui -p codex-cli -p
    codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
  • chore(core) Deprecate approval_policy: on-failure (#11631)
    ## Summary
    In an effort to start simplifying our sandbox setup, we're announcing
    this approval_policy as deprecated. In general, it performs worse than
    `on-request`, and we're focusing on making fewer sandbox configurations
    perform much better.
    
    ## Testing
    - [x] Tested locally
    - [x] Existing tests pass
  • add a slash command to grant sandbox read access to inaccessible directories (#11512)
    There is an edge case where a directory is not readable by the sandbox.
    In practice we have seen this very rarely, but it can happen, so this
    slash command unblocks users when it does.
    
    A future idea is to make this a tool that the agent knows about so it
    can be more integrated.
  • Add js_repl host helpers and exec end events (#10672)
    ## Summary
    
    This PR adds host-integrated helper APIs for `js_repl` and updates model
    guidance so the agent can use them reliably.
    
    ### What’s included
    
    - Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call
    normal Codex tools.
    - Keep persistent JS state and scratch-path helpers available:
      - `codex.state`
      - `codex.tmpDir`
    - Wire `js_repl` tool calls through the standard tool router path.
    - Add/align `js_repl` execution completion/end event behavior with
    existing tool logging patterns.
    - Update dynamic prompt injection (`project_doc`) to document:
      - how to call `codex.tool(...)`
      - raw output behavior
    - image flow via `view_image` (`codex.tmpDir` +
    `codex.tool("view_image", ...)`)
    - stdio safety guidance (`console.log` / `codex.tool`, avoid direct
    `process.std*`)
    
    ## Why
    
    - Standardize JS-side tool usage on `codex.tool(...)`
    - Make `js_repl` behavior more consistent with existing tool execution
    and event/logging patterns.
    - Give the model enough runtime guidance to use `js_repl` safely and
    effectively.
    
    ## Testing
    
    - Added/updated unit and runtime tests for:
      - `codex.tool` calls from `js_repl` (including shell/MCP paths)
      - image handoff flow via `view_image`
      - prompt-injection text for `js_repl` guidance
      - execution/end event behavior and related regression coverage
    
    
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    - 👉 `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • feat(app-server): experimental flag to persist extended history (#11227)
    This PR adds an experimental `persist_extended_history` bool flag to
    app-server thread APIs so rollout logs can retain a richer set of
    EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
    on `thread/resume`).
    
    ### Motivation
    Today, our rollout recorder only persists a small subset (e.g. user
    message, reasoning, assistant message) of `EventMsg` types, dropping a
    good number (like command exec, file change, etc.) that are important
    for reconstructing full item history for `thread/resume`, `thread/read`,
    and `thread/fork`.
    
    Some clients want to be able to resume a thread without lossiness. This
    lossiness is primarily a UI thing, since what the model sees are
    `ResponseItem` and not `EventMsg`.
    
    ### Approach
    This change introduces an opt-in `persist_extended_history` flag to
    preserve those events when you start/resume/fork a thread (defaults to
    `false`).
    
    This is done by adding an `EventPersistenceMode` to the rollout
    recorder:
    - `Limited` (existing behavior, default)
    - `Extended` (new opt-in behavior)
    
    In `Extended` mode, persist additional `EventMsg` variants needed for
    non-lossy app-server `ThreadItem` reconstruction. We now store the
    following ThreadItems that we didn't before:
    - web search
    - command execution
    - patch/file changes
    - MCP tool calls
    - image view calls
    - collab tool outcomes
    - context compaction
    - review mode enter/exit
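    
    The two modes can be pictured as a filter in front of the rollout
    writer. In the sketch below, `EventKind` is a hypothetical, heavily
    reduced stand-in for `EventMsg`, and `should_persist` is an illustrative
    name rather than the recorder's actual function.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum EventPersistenceMode {
        Limited,
        Extended,
    }
    
    /// Reduced stand-in for `EventMsg`; only a few variants are modeled.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum EventKind {
        UserMessage,
        AgentMessage,
        Reasoning,
        CommandExecution,
        PatchApply,
        McpToolCall,
        WebSearch,
    }
    
    /// Events in the small core subset are always persisted; everything
    /// else is persisted only when the thread opted into `Extended`.
    pub fn should_persist(kind: EventKind, mode: EventPersistenceMode) -> bool {
        match kind {
            EventKind::UserMessage | EventKind::AgentMessage | EventKind::Reasoning => true,
            _ => mode == EventPersistenceMode::Extended,
        }
    }
    ```
    
    Because `Limited` stays the default, existing rollout files keep their
    current shape unless a client asks for the richer history.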
    
    For **command executions** in particular, we truncate the output using
    the existing `truncate_text` from core to store an upper bound of 10,000
    bytes, which is also the default value for truncating tool outputs shown
    to the model. This keeps the size of the rollout file and command
    execution items returned over the wire reasonable.
    
    We also persist `EventMsg::Error`, which we can now map back to the
    Turn's status and use to populate the Turn's error metadata.
    
    #### Updates to EventMsgs
    To truly make `thread/resume` non-lossy, we also needed to persist the
    `status` on `EventMsg::CommandExecutionEndEvent` and
    `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
    command failed or was declined (similar for apply_patch). These
    EventMsgs were never persisted before so I made it a required field.
  • Parse first order skill/connector mentions (#11547)
    This PR introduces a skill-expansion mechanism for mentions, so that
    nested skill or connector mentions are expanded if present in skills
    invoked by the user. This keeps behavior aligned with existing mention
    handling while extending coverage to deeper scenarios. With these
    changes, users can create skills that invoke connectors, and skills that
    invoke other skills.
    
    Replaces #10863, which is not needed with the addition of
    [search_tool_bm25](https://github.com/openai/codex/issues/10657)
  • Add cwd to memory files (#11591)
    Add cwd to memory files so that the model can better handle memories
    that span multiple working directories.
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • fix(core) model_info preserves slug (#11602)
    ## Summary
    Preserve the specified model slug when we get a prefix-based match
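    
    One way to picture the fix, as a sketch only: look up metadata by
    prefix, but return it under the slug the caller asked for. `ModelInfo`
    here is a reduced hypothetical struct, `find_model_info` is an
    illustrative name, and choosing the longest matching prefix is an
    assumption.
    
    ```rust
    #[derive(Clone, Debug, PartialEq)]
    pub struct ModelInfo {
        pub slug: String,
        pub context_window: u64, // illustrative metadata field
    }
    
    /// Finds the known model whose slug is the longest prefix of
    /// `requested`, then returns its metadata with the requested slug kept.
    pub fn find_model_info(requested: &str, known: &[ModelInfo]) -> Option<ModelInfo> {
        known
            .iter()
            .filter(|m| requested.starts_with(m.slug.as_str()))
            .max_by_key(|m| m.slug.len())
            .map(|m| ModelInfo {
                slug: requested.to_string(),
                ..m.clone()
            })
    }
    ```
    
    Without the slug override, a prefix match would silently report the
    matched entry's slug instead of the one that was specified.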
    
    ## Testing
    - [x] added unit test
    
    ---------
    
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
  • chore: drop and clean from phase 1 (#11605)
    This PR is mostly cleaning and simplifying phase 1 of memories
  • Fix config test on macOS (#11579)
    When running these tests locally, you may have system-wide config or
    requirements files. This makes the tests ignore these files.
  • Ensure list_threads drops stale rollout files (#11572)
    Summary
    - trim `state_db::list_threads_db` results to entries whose rollout
    files still exist, logging and recording a discrepancy for dropped rows
    - delete stale metadata rows from the SQLite store so future calls don’t
    surface invalid paths
    - add regression coverage in `recorder.rs` to verify stale DB paths are
    dropped when the file is missing
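    
    The trimming step might be sketched as below, assuming each DB row
    carries the rollout path. `ThreadRow` and `partition_stale` are
    hypothetical names; the real code also logs the discrepancy and deletes
    the stale rows from SQLite.
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical, reduced view of a thread metadata row.
    pub struct ThreadRow {
        pub id: String,
        pub rollout_path: PathBuf,
    }
    
    /// Splits rows into (live, stale) by whether the rollout file still
    /// exists on disk. Callers would return `live` and delete `stale`.
    pub fn partition_stale(rows: Vec<ThreadRow>) -> (Vec<ThreadRow>, Vec<ThreadRow>) {
        rows.into_iter().partition(|row| row.rollout_path.exists())
    }
    ```
    
    Returning the stale rows separately, rather than just filtering them
    out, keeps the information needed for the cleanup and logging steps.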
  • feat: mem drop cot (#11571)
    Drop CoT and compaction for memory building
  • Fix flaky pre_sampling_compact switch test (#11573)
    Summary
    - address the nondeterministic behavior observed in
    `pre_sampling_compact_runs_on_switch_to_smaller_context_model` so it no
    longer fails intermittently during model switches
    - ensure the surrounding sampling logic consistently handles the
    smaller-context case that the test exercises
    
    Testing
    - Not run (not requested)
  • feat: mem slash commands (#11569)
    Add two slash commands for memories:
    * `/m_drop` deletes all the memories
    * `/m_update` updates the memories with phase 1 and 2
  • Fix test flake (#11448)
    Flaking with
    
    ```
       Nextest run ID 6b7ff5f7-57f6-4c9c-8026-67f08fa2f81f with nextest profile: default
          Starting 3282 tests across 118 binaries (21 tests skipped)
              FAIL [  14.548s] (1367/3282) codex-core::all suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input
        stdout ───
    
          running 1 test
          test suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input ... FAILED
    
          failures:
    
          failures:
              suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input
    
          test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 522 filtered out; finished in 14.41s
    
        stderr ───
    
          thread 'suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input' (15632) panicked at C:\a\codex\codex\codex-rs\core\tests\common\lib.rs:186:14:
          timeout waiting for event: Elapsed(())
          stack backtrace:
          read_output:
          Exit code: 0
          Wall time: 8.5 seconds
          Output:
          line1
          naïve café
          line3
    
          stdout:
          line1
          naïve café
          line3
          patch:
          *** Begin Patch
          *** Add File: target.txt
          +line1
          +naïve café
          +line3
          *** End Patch
          note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
    ```