17 Commits

  • Preserve hook trust bypass in codex exec threads (#26434)
    Addresses #26383 and #26452
    
    ## Summary
    
    `codex exec --dangerously-bypass-hook-trust` printed the bypass warning,
    but valid untrusted hooks still did not run.
    
    Exec applied the flag to its initial config, then lost it when
    app-server reloaded config for the new or resumed thread.
    
    ## Fix
    
    Forward `bypass_hook_trust: true` through the existing thread request
    config override for both start and resume.
    
    The override is omitted when the flag is not enabled, preserving normal
    trust behavior.
    
    ## Testing
    
    Added:
    
    - A test confirming start and resume preserve the override.
    - An end-to-end exec test confirming a `SessionStart` hook runs and
    creates a marker file.
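    
    The forwarding rule described under Fix can be sketched roughly as
    follows. This is a Python illustration rather than the actual
    implementation: `thread_config_overrides` and the request dict shape are
    hypothetical, and only the `bypass_hook_trust` key comes from the
    description above.
    
    ```python
    from typing import Any
    
    
    def thread_config_overrides(bypass_hook_trust: bool) -> dict[str, Any]:
        """Build the per-thread config override (hypothetical helper).
    
        The key is present only when the flag is enabled, so ordinary runs
        keep the normal trust behavior.
        """
        overrides: dict[str, Any] = {}
        if bypass_hook_trust:
            overrides["bypass_hook_trust"] = True
        return overrides
    
    
    # The same override is attached to both start and resume requests.
    start_request = {"method": "start", "config": thread_config_overrides(True)}
    resume_request = {"method": "resume", "config": thread_config_overrides(True)}
    default_request = {"method": "start", "config": thread_config_overrides(False)}
    ```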
  • Route AGENTS.md loading through environment filesystems (#26205)
    ## Why
    
    Workspace-specific `AGENTS.md` loading needs to use the selected
    environment filesystem so remote workspaces and child agents read
    instructions from their actual environment instead of the host
    filesystem. The app-server should report the same instruction sources
    the initialized thread actually loaded, rather than independently
    rescanning configuration and filesystem state.
    
    ## What changed
    
    - Introduce `LoadedAgentsMd` to retain ordered user, project, and
    internal instructions with their provenance.
    - Load and canonicalize workspace `AGENTS.md` paths through the primary
    `EnvironmentManager` environment, then render the loaded instructions
    when constructing turn context.
    - Expose cached loaded instruction sources from initialized threads and
    use them for app-server start, resume, and fork responses.
    - Preserve global `CODEX_HOME` loading and separator behavior while
    excluding empty project files that did not supply model-visible
    instructions.
    - Add integration coverage for CLI injection, selected-environment
    provenance and rendering, empty environment selection, and cached
    sources on loaded-thread resume.
    
    ## Validation
    
    - `just test -p codex-core agents_md`
    - `just test -p codex-core
    selected_environment_sources_match_model_visible_instructions`
    - `just test -p codex-exec agents_md`
    - `just test -p codex-app-server instruction_sources`
    - `just test -p codex-app-server --status-level fail`
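    
    As a rough illustration of the data shape described under "What
    changed", the sketch below keeps instructions in order with their
    provenance and skips empty project files. It is written in Python for
    brevity; apart from the `LoadedAgentsMd` name, every field, method, and
    the separator value is an assumption and not the real type.
    
    ```python
    from dataclasses import dataclass, field
    
    
    @dataclass(frozen=True)
    class InstructionSource:
        kind: str  # "user", "project", or "internal" (assumed labels)
        path: str  # where the instructions were loaded from
        text: str  # the instruction text itself
    
    
    @dataclass
    class LoadedAgentsMd:
        sources: list[InstructionSource] = field(default_factory=list)
    
        def add(self, source: InstructionSource) -> None:
            # Empty project files supply no model-visible instructions,
            # so they are not recorded as sources.
            if source.kind == "project" and not source.text.strip():
                return
            self.sources.append(source)
    
        def render(self, separator: str = "\n\n") -> str:
            # Join instructions in load order; the separator is illustrative.
            return separator.join(s.text for s in self.sources)
    
        def source_paths(self) -> list[str]:
            # What a start/resume/fork response could report as sources.
            return [s.path for s in self.sources]
    
    
    loaded = LoadedAgentsMd()
    loaded.add(InstructionSource("user", "~/.codex/AGENTS.md", "Be concise."))
    loaded.add(InstructionSource("project", "/repo/AGENTS.md", "   "))
    loaded.add(InstructionSource("project", "/repo/pkg/AGENTS.md", "Run tests."))
    ```
    
    Because the reported sources come from the same object that rendered
    the instructions, the two cannot drift apart the way an independent
    rescan could.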
  • Preserve auto-review approval policy in codex exec (#23763)
    ## Why
    
    `codex exec` was forcing headless runs to `approval_policy = "never"`
    even when the resolved reviewer was `auto_review`. That prevented
    unattended exec workflows from reaching the reviewed MCP write path they
    were configured to use.
    
    ## What changed
    
    - Keep the existing headless `never` default for ordinary exec runs.
    - Re-resolve exec config without that synthetic override when the final
    reviewer resolves to `AutoReview`, so configured or requirements-driven
    approval policy is preserved.
    - Add regression coverage for:
      - `auto_review` plus `on-request` from user config
      - requirements-driven `AutoReview`, asserting exec’s final approval
        policy matches the no-override control config exactly
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-exec`
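    
    The two-pass resolution described above could look roughly like this.
    It is a simplified Python sketch: the `reviewer` and `approval_policy`
    dict keys and both function names are assumptions for illustration, not
    the real config API.
    
    ```python
    from typing import Optional
    
    HEADLESS_DEFAULT = "never"
    
    
    def resolve_config(user_config: dict, override: Optional[str]) -> dict:
        """Resolve config, optionally applying a synthetic approval override."""
        resolved = dict(user_config)
        if override is not None:
            resolved["approval_policy"] = override
        return resolved
    
    
    def resolve_exec_config(user_config: dict) -> dict:
        # First pass: apply the headless default used by ordinary exec runs.
        with_override = resolve_config(user_config, HEADLESS_DEFAULT)
        # If the final reviewer is auto-review, resolve again without the
        # synthetic override so the configured policy is preserved.
        if with_override.get("reviewer") == "auto_review":
            return resolve_config(user_config, None)
        return with_override
    
    
    reviewed = resolve_exec_config(
        {"reviewer": "auto_review", "approval_policy": "on-request"}
    )
    ordinary = resolve_exec_config({"approval_policy": "on-request"})
    ```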
  • Support Codex CLI stdin piping for codex exec (#15917)
    # Summary
    
    Claude Code supports a useful prompt-plus-stdin workflow:
    
    ```bash
    echo "complex input..." | claude -p "summarize concisely"
    ```
    
    Codex previously did not support the equivalent `codex exec` form. While
    `codex exec` could read the prompt from stdin, it could not combine
    piped input with an explicit prompt argument.
    
    This change adds that missing workflow:
    
    ```bash
    echo "complex input..." | codex exec "summarize concisely"
    ```
    
    With this change, when `codex exec` receives both a positional prompt
    and piped stdin, the prompt remains the instruction and stdin is passed
    along as structured `<stdin>...</stdin>` context.
    
    Example:
    
    ```bash
    curl https://jsonplaceholder.typicode.com/comments \
      | ./target/debug/codex exec --skip-git-repo-check "format the top 20 items into a markdown table" \
      > table.md
    ```
    
    This PR also adds regression coverage for:
    - prompt argument + piped stdin
    - legacy stdin-as-prompt behavior
    - `codex exec -` forced-stdin behavior
    - empty-stdin error cases
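    
    The cases listed above can be summarized in a small decision function.
    This is a Python sketch under stated assumptions: `build_exec_prompt` is
    a hypothetical name, the exact whitespace around the `<stdin>` wrapper is
    a guess, and treating empty piped input alongside a prompt argument as
    "no stdin" is an assumption.
    
    ```python
    from typing import Optional
    
    
    def build_exec_prompt(prompt_arg: Optional[str], piped_stdin: Optional[str]) -> str:
        """Combine a positional prompt and piped stdin (illustrative only)."""
        # No prompt argument, or an explicit "-", means stdin is the prompt.
        if prompt_arg is None or prompt_arg == "-":
            if piped_stdin is None or not piped_stdin.strip():
                raise ValueError("no prompt provided on stdin")
            return piped_stdin
        # Prompt argument only: nothing to attach (assumed for empty stdin too).
        if not piped_stdin:
            return prompt_arg
        # Prompt argument plus piped stdin: the prompt stays the instruction
        # and stdin is passed along as structured context.
        return f"{prompt_arg}\n\n<stdin>\n{piped_stdin}\n</stdin>"
    
    
    combined = build_exec_prompt("summarize concisely", "complex input...")
    ```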
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Handle required MCP startup failures across components (#10902)
    Summary
    - add a `required` flag for MCP servers everywhere config/CLI data is
    touched so mandatory helpers can be round-tripped
    - have `codex exec` and `codex app-server` thread start/resume fail fast
    when required MCPs fail to initialize
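    
    A minimal Python sketch of the fail-fast rule, assuming a simple server
    record. `McpServer`, `RequiredMcpStartupError`, and `check_mcp_startup`
    are hypothetical names; only the `required` flag comes from the summary
    above.
    
    ```python
    from dataclasses import dataclass
    
    
    @dataclass(frozen=True)
    class McpServer:
        name: str
        required: bool = False
    
    
    class RequiredMcpStartupError(RuntimeError):
        """Raised when a server marked as required fails to initialize."""
    
    
    def check_mcp_startup(servers: list[McpServer], failed: set[str]) -> list[str]:
        """Abort if any required server failed; return tolerated failures."""
        failed_required = [s.name for s in servers if s.required and s.name in failed]
        if failed_required:
            raise RequiredMcpStartupError(
                "required MCP servers failed to initialize: " + ", ".join(failed_required)
            )
        # Optional servers may fail without blocking thread start or resume.
        return [s.name for s in servers if s.name in failed]
    
    
    servers = [McpServer("docs", required=True), McpServer("search")]
    tolerated = check_mcp_startup(servers, failed={"search"})
    ```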
  • feat: Add support for --add-dir to exec and TypeScript SDK (#6565)
    ## Summary
    
    Adds support for specifying additional directories in the TypeScript SDK
    through a new `additionalDirectories` option in `ThreadOptions`.
    
    ## Changes
    
    - Added `additionalDirectories` parameter to `ThreadOptions` interface
    - Updated `CodexExec` to accept and pass through additional directories
    via the `--config` flag for `sandbox_workspace_write.writable_roots`
    - Added comprehensive test coverage for the new functionality
    
    ## Test plan
    
    - Added test case that verifies `additionalDirectories` is correctly
    passed as repeated flags
    - Existing tests continue to pass
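    
    For illustration, the option-to-argument mapping might look like the
    sketch below. It is written in Python, not the SDK's TypeScript, and it
    follows the repeated `--add-dir` form named in the commit title; the
    description above also mentions a `--config` route, so treat the exact
    flag as an assumption. `build_exec_args` is a hypothetical helper.
    
    ```python
    from typing import Optional, Sequence
    
    
    def build_exec_args(additional_directories: Optional[Sequence[str]] = None) -> list[str]:
        """Translate an `additionalDirectories`-style option into CLI args."""
        args = ["exec"]
        # One flag per directory, preserving the caller's order.
        for directory in additional_directories or []:
            args += ["--add-dir", directory]
        return args
    
    
    args = build_exec_args(["../shared", "/tmp/cache"])
    ```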
    
    ---------
    
    Co-authored-by: Claude <noreply@anthropic.com>
  • chore: drop approve all (#5503)
    Not needed anymore
  • Set codex SDK TypeScript originator (#4894)
    ## Summary
    - ensure the TypeScript SDK sets CODEX_INTERNAL_ORIGINATOR_OVERRIDE to
    codex_sdk_ts when spawning the Codex CLI
    - extend the responses proxy test helper to capture request headers for
    assertions
    - add coverage that verifies Codex threads launched from the TypeScript
    SDK send the codex_sdk_ts originator header
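    
    The first bullet amounts to adding one variable to the child process
    environment before spawning the CLI. A Python sketch of that idea
    follows (the SDK itself is TypeScript); `build_child_env` is a
    hypothetical helper, and whether the real SDK overwrites an existing
    value is not stated above.
    
    ```python
    import os
    from typing import Mapping, Optional
    
    
    def build_child_env(base_env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
        """Copy the environment and tag the child as launched by the TS SDK."""
        env = dict(os.environ if base_env is None else base_env)
        env["CODEX_INTERNAL_ORIGINATOR_OVERRIDE"] = "codex_sdk_ts"
        return env
    
    
    child_env = build_child_env({"PATH": "/usr/bin"})
    ```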
    
    ## Testing
    - Not Run (not requested)
    
    
    ------
    https://chatgpt.com/codex/tasks/task_i_68e561b125248320a487f129093d16e7
  • Support CODEX_API_KEY for codex exec (#4615)
    Allows setting the API key per invocation of `codex exec` through
    `CODEX_API_KEY`.
  • Add turn started/completed events and correct exit code on error (#4309)
    Adds new `turn.started` and `turn.completed` events; the latter includes
    usage. Also ensures we return exit code 1 on failures.
    ```
    {
      "type": "session.created",
      "session_id": "019987a7-93e7-7b20-9e05-e90060e411ea"
    }
    {
      "type": "turn.started"
    }
    ...
    {
      "type": "turn.completed",
      "usage": {
        "input_tokens": 78913,
        "cached_input_tokens": 65280,
        "output_tokens": 1099
      }
    }
    ```
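    
    A consumer of these events might total usage across turns as sketched
    below. This Python example assumes the stream carries one JSON object
    per line, which is not confirmed by the pretty-printed sample above;
    `summarize_events` is a hypothetical helper.
    
    ```python
    import json
    from typing import Iterable, Optional
    
    
    def summarize_events(lines: Iterable[str]) -> tuple[Optional[str], dict[str, int]]:
        """Return the session id and token usage summed over completed turns."""
        session_id = None
        usage = {"input_tokens": 0, "cached_input_tokens": 0, "output_tokens": 0}
        for line in lines:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            if event.get("type") == "session.created":
                session_id = event.get("session_id")
            elif event.get("type") == "turn.completed":
                # Missing counters are treated as zero.
                for key in usage:
                    usage[key] += event.get("usage", {}).get(key, 0)
        return session_id, usage
    
    
    sample = [
        '{"type": "session.created", "session_id": "019987a7-93e7-7b20-9e05-e90060e411ea"}',
        '{"type": "turn.started"}',
        '{"type": "turn.completed", "usage": {"input_tokens": 78913, '
        '"cached_input_tokens": 65280, "output_tokens": 1099}}',
    ]
    session_id, usage = summarize_events(sample)
    ```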
  • Add codex exec testing helpers (#4254)
    Add a shortcut to create working directories and run `codex exec`
    against a fake server.
  • Add exec output-schema parameter (#4079)
    Adds structured output to `exec` via the `--output-schema`
    parameter.
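    
    A hedged sketch of how a caller might prepare such a run, using the
    flag name from the commit title. It assumes the flag takes a path to a
    JSON Schema file, which the description above does not state; the
    schema contents and helper names are made up for illustration, and the
    command is only constructed, not executed.
    
    ```python
    import json
    from pathlib import Path
    
    # Hypothetical schema describing the shape of the final response.
    SCHEMA = {
        "type": "object",
        "properties": {"summary": {"type": "string"}},
        "required": ["summary"],
        "additionalProperties": False,
    }
    
    
    def write_schema(directory: Path, schema: dict) -> Path:
        """Write the schema to disk so it can be passed by path."""
        path = directory / "schema.json"
        path.write_text(json.dumps(schema, indent=2))
        return path
    
    
    def build_command(schema_path: Path, prompt: str) -> list[str]:
        # Assumption: the flag accepts a file path.
        return ["codex", "exec", "--output-schema", str(schema_path), prompt]
    ```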
  • enable-resume (#3537)
    Adds the ability to resume conversations.
    We have one verb, `resume`.
    
    Behavior:
    
    `tui`:
    `codex resume`: opens session picker
    `codex resume --last`: continue last conversation
    `codex resume <session id>`: continue conversation with `session id`
    
    `exec`:
    `codex exec resume --last`: continue last conversation
    `codex exec resume <session id>`: continue conversation with `session id`
    
    Implementation:
    - Added a function that finds the session path in `~/.codex/sessions/`
    for a given `UUID`, which is used when resuming by session id.
    - Added the flags mentioned above
    - Added extensive tests
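    
    The lookup in the first implementation bullet could be approximated as
    below. This Python sketch assumes session files embed the UUID in their
    file name somewhere under the sessions directory; that layout and the
    `find_session_path` name are assumptions, not taken from the change
    itself.
    
    ```python
    import uuid
    from pathlib import Path
    from typing import Optional
    
    
    def find_session_path(sessions_root: Path, session_id: str) -> Optional[Path]:
        """Search recursively for a session file whose name contains the UUID."""
        # Normalize and validate; raises ValueError for malformed ids.
        normalized = str(uuid.UUID(session_id))
        matches = sorted(
            p for p in sessions_root.rglob(f"*{normalized}*") if p.is_file()
        )
        return matches[0] if matches else None
    ```
    
    A caller would pass something like `Path.home() / ".codex" / "sessions"`
    as the root and treat a `None` result as "no such session".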
  • [exec] Clean up apply-patch tests (#2648)
    ## Summary
    These tests were getting a bit unwieldy, and they're starting to become
    load-bearing. Let's clean them up, and get them working solidly so we
    can easily expand this harness with new tests.
    
    ## Test Plan
    - [x] Tests continue to pass
  • test: faster test execution in codex-core (#2633)
    This dramatically improves the time to run `cargo test -p codex-core`
    (~25x speedup).
    
    before:
    ```
    cargo test -p codex-core  35.96s user 68.63s system 19% cpu 8:49.80 total
    ```
    
    after:
    ```
    cargo test -p codex-core  5.51s user 8.16s system 63% cpu 21.407 total
    ```
    
    Both runs were measured "hot", i.e. on a 2nd run with no filesystem
    changes, to exclude compile times.
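    
    The quoted speedup can be checked from the two wall-clock totals above.
    The small Python helper below is only an arithmetic aid and is not part
    of the change.
    
    ```python
    def parse_total_seconds(total: str) -> float:
        """Convert a `time`-style total such as "8:49.80" into seconds."""
        seconds = 0.0
        for part in total.split(":"):
            seconds = seconds * 60 + float(part)
        return seconds
    
    
    before = parse_total_seconds("8:49.80")  # 8 min 49.80 s
    after = parse_total_seconds("21.407")
    speedup = before / after  # about 24.7, consistent with "~25x"
    ```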
    
    The approach is inspired by [Delete Cargo Integration
    Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html):
    we move all test cases in tests/ into a single suite so that they build
    into a single binary. Each test binary carries significant overhead, and
    test execution is only parallelized within a single binary.