17 Commits

  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • [codex] reduce module visibility (#16978)
    ## Summary
    - reduce public module visibility across Rust crates, preferring private
    or crate-private modules with explicit crate-root public exports
    - update external call sites and tests to use the intended public crate
    APIs instead of reaching through module trees
    - add the module visibility guideline to AGENTS.md
    
    ## Validation
    - `cargo check --workspace --all-targets --message-format=short` passed
    before the final fix/format pass
    - `just fix` completed successfully
    - `just fmt` completed successfully
    - `git diff --check` passed
  • execpolicy: add host_executable() path mappings (#12964)
    ## Why
    
    `execpolicy` currently keys `prefix_rule()` matching off the literal
    first token. That works for rules like `["/usr/bin/git"]`, but it means
    shared basename rules such as `["git"]` do not help when a caller passes
    an absolute executable path like `/usr/bin/git`.
    
    This PR lays the groundwork for basename-aware matching without changing
    existing callers yet. It adds typed host-executable metadata and an
    opt-in resolution path in `codex-execpolicy`, so a follow-up PR can
    adopt the new behavior in `unix_escalation.rs` and other call sites
    without having to redesign the policy layer first.
    
    ## What Changed
    
    - added `host_executable(name = ..., paths = [...])` to the execpolicy
    parser and validated it with `AbsolutePathBuf`
    - stored host executable mappings separately from prefix rules inside
    `Policy`
    - added `MatchOptions` and opt-in `*_with_options()` APIs that preserve
    existing behavior by default
    - implemented exact-first matching with optional basename fallback,
    gated by `host_executable()` allowlists when present
    - normalized executable names for cross-platform matching so Windows
    paths like `git.exe` can satisfy `host_executable(name = "git", ...)`
    - updated `match` / `not_match` example validation to exercise the
    host-executable resolution path instead of only raw prefix-rule matching
    - preserved source locations for deferred example-validation errors so
    policy load failures still point at the right file and line
    - surfaced `resolvedProgram` on `RuleMatch` so callers can tell when a
    basename rule matched an absolute executable path
    - preserved host executable metadata when requirements policies overlay
    file-based policies in `core/src/exec_policy.rs`
    - documented the new rule shape and CLI behavior in
    `execpolicy/README.md`
    
    ## Verification
    
    - `cargo test -p codex-execpolicy`
    - added coverage in `execpolicy/tests/basic.rs` for parsing, precedence,
    empty allowlists, basename fallback, exact-match precedence, and
    host-executable-backed `match` / `not_match` examples
    - added a regression test in `core/src/exec_policy.rs` to verify
    requirements overlays preserve `host_executable()` metadata
    - verified `cargo test -p codex-core --lib`, including source-rendering
    coverage for deferred validation errors
  • feat(core): persist network approvals in execpolicy (#12357)
    ## Summary
    Persist network approval allow/deny decisions as `network_rule(...)`
    entries in execpolicy (not proxy config)
    
    It adds `network_rule` parsing + append support in `codex-execpolicy`,
    including `decision="prompt"` (parse-only; not compiled into proxy
    allow/deny lists)
    - compile execpolicy network rules into proxy allow/deny lists and
    update the live proxy state on approval
    - preserve requirements execpolicy `network_rule(...)` entries when
    merging with file-based execpolicy
    - reject broad wildcard hosts (for example `*`) for persisted
    `network_rule(...)`
  • fix(core) Deduplicate prefix_rules before appending (#10309)
    ## Summary
    We ideally shouldn't make it to this point in the first place, but if we
    do try to append a rule that already exists, we shouldn't append the
    same rule twice.
    
    ## Testing
    - [x] Added unit test for this case
  • feat: add justification arg to prefix_rule() in *.rules (#8751)
    Adds an optional `justification` parameter to the `prefix_rule()`
    execpolicy DSL so policy authors can attach human-readable rationale to
    a rule. That justification is propagated through parsing/matching and
    can be surfaced to the model (or approval UI) when a command is blocked
    or requires approval.
    
    When a command is rejected (or gated behind approval) due to policy, a
    generic message makes it hard for the model/user to understand what went
    wrong and what to do instead. Allowing policy authors to supply a short
    justification improves debuggability and helps guide the model toward
    compliant alternatives.
    
    Example:
    
    ```python
    prefix_rule(
        pattern = ["git", "push"],
        decision = "forbidden",
        justification = "pushing is blocked in this repo",
    )
    ```
    
    If Codex tried to run `git push origin main`, now the failure would
    include:
    
    ```
    `git push origin main` rejected: pushing is blocked in this repo
    ```
    
    whereas previously, all it was told was:
    
    ```
    execpolicy forbids this command
    ```
  • fix: policy/*.codexpolicy -> rules/*.rules (#7888)
    We decided that `*.rules` is a more fitting (and concise) file extension
    than `*.codexpolicy`, so we are changing the file extension for the
    "execpolicy" effort. We are also changing the subfolder of `$CODEX_HOME`
    from `policy` to `rules` to match.
    
    This PR updates the in-repo docs and we will update the public docs once
    the next CLI release goes out.
    
    Locally, I created `~/.codex/rules/default.rules` with the following
    contents:
    
    ```
    prefix_rule(pattern=["gh", "pr", "view"])
    ```
    
    And then I asked Codex to run:
    
    ```
    gh pr view 7888 --json title,body,comments
    ```
    
    and it was able to!
  • Refactor execpolicy fallback evaluation (#7544)
    ## Refactor of the `execpolicy` crate
    
    To illustrate why we need this refactor, consider an agent attempting to
    run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`.
    Before this PR, `execpolicy` would consider `apple` and `pear` and only
    render one rule match: `Allow`. We would skip any heuristics checks on
    `rm -rf ./` and immediately approve `apple | rm -rf ./` to run.
    
    To fix this, we now thread a `fallback` evaluation function into
    `execpolicy` that runs when no `execpolicy` rules match a given command.
    In our example, we would run `fallback` on `rm -rf ./` and prevent
    `apple | rm -rf ./` from being run without approval.
  • execpolicy helpers (#7032)
    this PR 
    - adds a helper function to amend `.codexpolicy` files with new prefix
    rules
    - adds a utility to `Policy` allowing prefix rules to be added to
    existing `Policy` structs
    
    both additions will be helpful as we thread codexpolicy into the TUI
    workflow
  • execpolicycheck command in codex cli (#7012)
    adding execpolicycheck tool onto codex cli
    
    this is useful for validating policies (can be multiple) against
    commands.
    
    it will also surface errors in policy syntax:
    <img width="1150" height="281" alt="Screenshot 2025-11-19 at 12 46
    21 PM"
    src="https://github.com/user-attachments/assets/8f99b403-564c-4172-acc9-6574a8d13dc3"
    />
    
    this PR also changes output format when there's no match in the CLI.
    instead of returning the raw string `noMatch`, we return
    `{"noMatch":{}}`
    
    this PR is a rewrite of: https://github.com/openai/codex/pull/6932 (due
    to the numerous merge conflicts present in the original PR)
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • test: faster test execution in codex-core (#2633)
    this dramatically improves time to run `cargo test -p codex-core` (~25x
    speedup).
    
    before:
    ```
    cargo test -p codex-core  35.96s user 68.63s system 19% cpu 8:49.80 total
    ```
    
    after:
    ```
    cargo test -p codex-core  5.51s user 8.16s system 63% cpu 21.407 total
    ```
    
    both tests measured "hot", i.e. on a 2nd run with no filesystem changes,
    to exclude compile times.
    
    approach inspired by [Delete Cargo Integration
    Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html),
    we move all test cases in tests/ into a single suite in order to have a
    single binary, as there is significant overhead for each test binary
    executed, and because test execution is only parallelized with a single
    binary.
  • Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
    This PR:
    * Added the clippy.toml to configure allowable expect / unwrap usage in
    tests
    * Removed as many expect/allow lines as possible from tests
    * moved a bunch of allows to expects where possible
    
    Note: in integration tests, non `#[test]` helper functions are not
    covered by this so we had to leave a few lingering `expect(expect_used`
    checks around
  • Disallow expect via lints (#865)
    Adds `expect()` as a denied lint. Same deal applies with `unwrap()`
    where we now need to put `#[expect(...` on ones that we legit want. Took
    care to enable `expect()` in test contexts.
    
    # Tests
    
    ```
    cargo fmt
    cargo clippy --all-features --all-targets --no-deps -- -D warnings
    cargo test
    ```
  • fix: enable clippy on tests (#870)
    https://github.com/openai/codex/pull/855 added the clippy warning to
    disallow `unwrap()`, but apparently we were not verifying that tests
    were "clippy clean" in CI, so I ended up with a lot of local errors in
    VS Code.
    
    This turns on the check in CI and fixes the offenders.
  • Update cargo to 2024 edition (#842)
    Some effects of this change:
    - New formatting changes across many files. No functionality changes
    should occur from that.
    - Calls to `set_env` are considered unsafe, since this only happens in
    tests we wrap them in `unsafe` blocks
  • feat: introduce codex_execpolicy crate for defining "safe" commands (#634)
    As described in detail in `codex-rs/execpolicy/README.md` introduced in
    this PR, `execpolicy` is a tool that lets you define a set of _patterns_
    used to match [`execv(3)`](https://linux.die.net/man/3/execv)
    invocations. When a pattern is matched, `execpolicy` returns the parsed
    version in a structured form that is amenable to static analysis.
    
    The primary use case is to define patterns match commands that should be
    auto-approved by a tool such as Codex. This supports a richer pattern
    matching mechanism that the sort of prefix-matching we have done to
    date, e.g.:
    
    
    https://github.com/openai/codex/blob/5e40d9d2211737f46136610497bcd9a8271009e0/codex-cli/src/approvals.ts#L333-L354
    
    Note we are still playing with the API and the `system_path` option in
    particular still needs some work.