Commit Graph

36 Commits

  • feat: ghost snapshot v2 (#8055)
    This PR updates ghost snapshotting to avoid capturing oversized
    untracked artifacts while keeping undo safe. Snapshot creation now
    builds a temporary index from `git status --porcelain=2 -z`, writes a
    tree and detached commit without touching refs, and records any ignored
    large files/dirs in the snapshot report. Undo uses that metadata to
    preserve large local artifacts while still cleaning up new transient
    files.
  • fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
    Changes the `writable_roots` field of the `WorkspaceWrite` variant of
    the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
    This is helpful because now callers can be sure the value is an absolute
    path rather than a relative one. (Though when using an absolute path in
    a Seatbelt config policy, we still have to _canonicalize_ it first.)
    
    Because `writable_roots` can be read from a config file, it is important
    that we are able to resolve relative paths properly using the parent
    folder of the config file as the base path.
  • fix: race on rx subscription (#7921)
    Fix race where the PTY was sending first chunk before the subscription
    to the broadcast
  • fix: introduce AbsolutePathBuf and resolve relative paths in config.toml (#7796)
    This PR attempts to solve two problems by introducing a
    `AbsolutePathBuf` type with a special deserializer:
    
    - `AbsolutePathBuf` attempts to be a generally useful abstraction, as it
    ensures, by constructing, that it represents a value that is an
    absolute, normalized path, which is a stronger guarantee than an
    arbitrary `PathBuf`.
    - Values in `config.toml` that can be either an absolute or relative
    path should be resolved against the folder containing the `config.toml`
    in the relative path case. This PR makes this easy to support: the main
    cost is ensuring `AbsolutePathBufGuard` is used inside
    `deserialize_config_toml_with_base()`.
    
    While `AbsolutePathBufGuard` may seem slightly distasteful because it
    relies on thread-local storage, this seems much cleaner to me than using
    than my various experiments with
    https://docs.rs/serde/latest/serde/de/trait.DeserializeSeed.html.
    Further, since the `deserialize()` method from the `Deserialize` trait
    is not async, we do not really have to worry about the deserialization
    work being spread across multiple threads in a way that would interfere
    with `AbsolutePathBufGuard`.
    
    To start, this PR introduces the use of `AbsolutePathBuf` in
    `OtelTlsConfig`. Note how this simplifies `otel_provider.rs` because it
    no longer requires `settings.codex_home` to be threaded through.
    Furthermore, this sets us up better for a world where multiple
    `config.toml` files from different folders could be loaded and then
    merged together, as the absolutifying of the paths must be done against
    the correct parent folder.
  • Vendor ConPtySystem (#7656)
    The repo we were depending on is very large and we need very small part
    of it.
    
    ---------
    
    Co-authored-by: Pavel <pavel@krymets.com>
  • Fix unified_exec on windows (#7620)
    Fix unified_exec on windows
    
    Requires removal of PSUEDOCONSOLE_INHERIT_CURSOR flag so child processed
    don't attempt to wait for cursor position response (and timeout).
    
    
    https://github.com/wezterm/wezterm/compare/main...pakrym:wezterm:PSUEDOCONSOLE_INHERIT_CURSOR?expand=1
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • Fix: track only untracked paths in ghost snapshots (#7470)
    # Ghost snapshot ignores
    
    This PR should close #7067, #7395, #7405.
    
    Prior to this change the ghost snapshot task ran `git status
    --ignored=matching` so the report picked up literally every ignored
    file. When a directory only contained entries matched by patterns such
    as `dozens/*.txt`, `/test123/generated/*.html`, or `/wp-includes/*`, Git
    still enumerated them and the large-untracked-dir detection treated the
    parent directory as “large,” even though everything inside was
    intentionally ignored.
    
    By removing `--ignored=matching` we only capture true untracked paths
    now, so those patterns stay out of the snapshot report and no longer
    trigger the “large untracked directories” warning.
    
    ---------
    
    Signed-off-by: lionelchg <lionel.cheng@hotmail.fr>
    Co-authored-by: lionelchg <lionel.cheng@hotmail.fr>
  • chore: add cargo-deny configuration (#7119)
    - add GitHub workflow running cargo-deny on push/PR
    - document cargo-deny allowlist with workspace-dep notes and advisory
    ignores
    - align workspace crates to inherit version/edition/license for
    consistent checks
  • Add the utility to truncate by tokens (#6746)
    - This PR is to make it on path for truncating by tokens. This path will
    be initially used by unified exec and context manager (responsible for
    MCP calls mainly).
    - We are exposing new config `calls_output_max_tokens`
    - Use `tokens` as the main budget unit but truncate based on the model
    family by Introducing `TruncationPolicy`.
    - Introduce `truncate_text` as a router for truncation based on the
    mode.
    
    In next PRs:
    - remove truncate_with_line_bytes_budget
    - Add the ability to the model to override the token budget.
  • Update defaults to gpt-5.1 (#6652)
    ## Summary
    - update documentation, example configs, and automation defaults to
    reference gpt-5.1 / gpt-5.1-codex
    - bump the CLI and core configuration defaults, model presets, and error
    messaging to the new models while keeping the model-family/tool coverage
    for legacy slugs
    - refresh tests, fixtures, and TUI snapshots so they expect the upgraded
    defaults
    
    ## Testing
    - `cargo test -p codex-core
    config::tests::test_precedence_fixture_with_gpt5_profile`
    
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)
  • Reasoning level update (#6586)
    Automatically update reasoning levels when migrating between models
  • Use codex-linux-sandbox in unified exec (#6480)
    Unified exec isn't working on Linux because we don't provide the correct
    arg0.
    
    The library we use for pty management doesn't allow setting arg0
    separately from executable. Use the same aliasing strategy we use for
    `apply_patch` for `codex-linux-sandbox`.
    
    Use `#[ctor]` hack to dispatch codex-linux-sandbox calls.
    
    
    Addresses https://github.com/openai/codex/issues/6450
  • [app-server] remove serde(skip_serializing_if = "Option::is_none") annotations (#5939)
    We had this annotation everywhere in app-server APIs which made it so
    that fields get serialized as `field?: T`, meaning if the field as
    `None` we would omit the field in the payload. Removing this annotation
    changes it so that we return `field: T | null` instead, which makes
    codex app-server's API more aligned with the convention of public OpenAI
    APIs like Responses.
    
    Separately, remove the `#[ts(optional_fields = nullable)]` annotations
    that were recently added which made all the TS types become `field?: T |
    null` which is not great since clients need to handle undefined and
    null.
    
    I think generally it'll be best to have optional types be either:
    - `field: T | null` (preferred, aligned with public OpenAI APIs)
    - `field?: T` where we have to, such as types generated from the MCP
    schema:
    https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-06-18/schema.ts
    (see changes to `mcp-types/`)
    
    I updated @etraut-openai's unit test to check that all generated TS
    types are one or the other, not both (so will error if we have a type
    that has `field?: T | null`). I don't think there's currently a good use
    case for that - but we can always revisit.
  • Add missing "nullable" macro to protocol structs that contain optional fields (#5901)
    This PR addresses a current hole in the TypeScript code generation for
    the API server protocol. Fields that are marked as "Optional<>" in the
    Rust code are serialized such that the value is omitted when it is
    deserialized — appearing as `undefined`, but the TS type indicates
    (incorrectly) that it is always defined but possibly `null`. This can
    lead to subtle errors that the TypeScript compiler doesn't catch. The
    fix is to include the `#[ts(optional_fields = nullable)]` macro for all
    protocol structs that contain one or more `Optional<>` fields.
    
    This PR also includes a new test that validates that all TS protocol
    code containing "| null" in its type is marked optional ("?") to catch
    cases where `#[ts(optional_fields = nullable)]` is omitted.
  • chore: merge git crates (#5909)
    Merge `git-apply` and `git-tooling` into `utils/`
  • feat: image resizing (#5446)
    Add image resizing on the client side to reduce load on the API
  • chore: rework tools execution workflow (#5278)
    Re-work the tool execution flow. Read `orchestrator.rs` to understand
    the structure
  • Use assert_matches (#4756)
    assert_matches is soon to be in std but is experimental for now.
  • chore: refactor tool handling (#4510)
    # Tool System Refactor
    
    - Centralizes tool definitions and execution in `core/src/tools/*`:
    specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
    registry/dispatch (`registry.rs`), and shared context (`context.rs`).
    One registry now builds the model-visible tool list and binds handlers.
    - Router converts model responses to tool calls; Registry dispatches
    with consistent telemetry via `codex-rs/otel` and unified error
    handling. Function, Local Shell, MCP, and experimental `unified_exec`
    all flow through this path; legacy shell aliases still work.
    - Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
    make adding tools predictable and testable.
    
    Example: `read_file`
    - Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
    registered by `build_specs`).
    - Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
    1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
    - E2E test: `core/tests/suite/read_file.rs` validates the tool returns
    the requested lines.
    
    ## Next steps:
    - Decompose `handle_container_exec_with_params` 
    - Add parallel tool calls
  • fix: separate codex mcp into codex mcp-server and codex app-server (#4471)
    This is a very large PR with some non-backwards-compatible changes.
    
    Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish
    server that had two overlapping responsibilities:
    
    - Running an MCP server, providing some basic tool calls.
    - Running the app server used to power experiences such as the VS Code
    extension.
    
    This PR aims to separate these into distinct concepts:
    
    - `codex mcp-server` for the MCP server
    - `codex app-server` for the "application server"
    
    Note `codex mcp` still exists because it already has its own subcommands
    for MCP management (`list`, `add`, etc.)
    
    The MCP logic continues to live in `codex-rs/mcp-server` whereas the
    refactored app server logic is in the new `codex-rs/app-server` folder.
    Note that most of the existing integration tests in
    `codex-rs/mcp-server/tests/suite` were actually for the app server, so
    all the tests have been moved with the exception of
    `codex-rs/mcp-server/tests/suite/mod.rs`.
    
    Because this is already a large diff, I tried not to change more than I
    had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses
    the name `McpProcess` for now, but I will do some mechanical renamings
    to things like `AppServer` in subsequent PRs.
    
    While `mcp-server` and `app-server` share some overlapping functionality
    (like reading streams of JSONL and dispatching based on message types)
    and some differences (completely different message types), I ended up
    doing a bit of copypasta between the two crates, as both have somewhat
    similar `message_processor.rs` and `outgoing_message.rs` files for now,
    though I expect them to diverge more in the near future.
    
    One material change is that of the initialize handshake for `codex
    app-server`, as we no longer use the MCP types for that handshake.
    Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an
    `Initialize` variant to `ClientRequest`, which takes the `ClientInfo`
    object we need to update the `USER_AGENT_SUFFIX` in
    `codex-rs/app-server/src/message_processor.rs`.
    
    One other material change is in
    `codex-rs/app-server/src/codex_message_processor.rs` where I eliminated
    a use of the `send_event_as_notification()` method I am generally trying
    to deprecate (because it blindly maps an `EventMsg` into a
    `JSONNotification`) in favor of `send_server_notification()`, which
    takes a `ServerNotification`, as that is intended to be a custom enum of
    all notification types supported by the app server. So to make this
    update, I had to introduce a new variant of `ServerNotification`,
    `SessionConfigured`, which is a non-backwards compatible change with the
    old `codex mcp`, and clients will have to be updated after the next
    release that contains this PR. Note that
    `codex-rs/app-server/tests/suite/list_resume.rs` also had to be update
    to reflect this change.
    
    I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility
    crate to avoid some of the copying between `mcp-server` and
    `app-server`.
  • chore: extract readiness in a dedicated utils crate (#4140)
    Create an `utils` directory for the small utils crates