mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
121 Commits
-
feat(app-server): add history_mode to thread (#29927)
## Description This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`. This will be stored in `SessionMeta` in the JSONL rollout file and as a new column in the SQLite thread_metadata table, and exposed on `thread/start` and on the `Thread` object in app-server. ## What changed - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`, defaulting old and new SessionMeta to `legacy`. - Carried `history_mode` through core session config, ThreadStore stored metadata, local/in-memory stores, rollout metadata extraction, and the existing SQLite `threads` table. - Added experimental `historyMode` to app-server v2 `Thread` and `thread/start`. - Made paginated stored threads metadata-discoverable but unsupported for legacy full-history reads, `load_history`, live resume, and create paths. - Regenerated app-server schema fixtures and added protocol/state/thread-store/app-server coverage for persistence and fail-closed behavior. ## Compatibility floor Because users may be running various versions of Codex binaries on the same machine (TUI, Codex App, etc.), we will need to establish a compatibility floor for upcoming paginated threads, which will change how thread storage reads and writes work. The overall plan here: ``` Release N: - Add historyMode to SessionMeta / Thread / SQLite metadata. - Teach binaries to understand paginated threads. - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread. - Default remains `"legacy"`. Release N+1: - First-party clients start opting into paginated threads where appropriate. - Internal dogfood / staged rollout. - Measure old-client usage and paginated-thread unsupported errors. Release N+2: - Only after Release N+ is overwhelmingly deployed, make paginated the default. - Accept that a small tail of N-1-or-older binaries may not understand paginated threads. ``` The important behavior change is fail-closed handling for a binary that encounters a persisted `paginated` thread before it knows how to fully support paginated history. In app-server, if a thread is `paginated`, we will: - allow metadata-only discovery paths like `thread/list` and `thread/read(includeTurns=false)`, so clients can still see the thread and inspect its `historyMode` - reject legacy full-history/live-thread paths like `thread/read(includeTurns=true)` and `thread/resume` with an unsupported JSON-RPC error - avoid silently treating an unknown or future `historyMode` as `legacy` Under the hood, the ThreadStore layer also rejects legacy operations that would need to load or replay the full thread history for a paginated thread. That gives us the behavior we want for Release N: future paginated threads are visible, but this binary fails closed instead of trying to operate on them as if they were legacy threads.
Owen Lin ·
2026-06-26 09:12:42 -07:00 -
feat: add provider-aware model fallback to thread start (#29942)
## Why Helper threads such as task title generation can request a model ID that is valid for the default OpenAI provider but unavailable from the active provider. With Amazon Bedrock, `gpt-5.4-mini` is rejected while the provider static catalog exposes Bedrock model IDs such as `openai.gpt-5.5` and `openai.gpt-5.4`. This causes repeated background 404s and can surface a misleading turn error even when the main turn succeeds. Clients need an explicit way to ask app-server to resolve an unavailable helper model to the active provider default. That fallback must remain limited to providers with an authoritative static catalog so custom or dynamically discovered model IDs are not rewritten based on an incomplete catalog. Fixes #28741. ## What changed - Add the experimental `allowProviderModelFallback` option to `thread/start`, defaulting to `false` to preserve existing behavior. - Thread the option through thread creation and model selection. - When enabled for a static model manager, preserve requested models present in the catalog and replace unavailable models with the provider default. - Continue preserving explicit model IDs for dynamic model managers without fetching a catalog solely to validate them. - Document the new `thread/start` behavior in the app-server API overview. ## Test Temporary test-client harness: ``` ThreadStartParams { model: Some("gpt-5.4-mini".to_string()), allow_provider_model_fallback: true, ..Default::default() } ``` Command: ``` CODEX_HOME=/tmp/codex-bedrock-thread-start-home \ CODEX_E2E_BEDROCK_THREAD_START_ONLY=1 \ ./target/debug/codex-app-server-test-client \ --codex-bin ./target/debug/codex \ -c 'model_provider="amazon-bedrock"' \ send-message-v2 --experimental-api ignored ``` Relevant output: ``` > "method": "thread/start", > "params": { > "model": "gpt-5.4-mini", > "modelProvider": null, > "allowProviderModelFallback": true, > ... > } < "result": { < "model": "openai.gpt-5.5", < "modelProvider": "amazon-bedrock", < ... < } ```
Celia Chen ·
2026-06-25 18:24:34 +00:00 -
[codex] Add Ultra reasoning effort (#29899)
## Why Ultra should be one user-facing reasoning selection for work that benefits from both maximum reasoning and proactive multi-agent delegation. Without it, clients must coordinate maximum reasoning with the experimental `multiAgentMode` setting, even though the inference backend still expects its existing `max` effort value. This change makes reasoning effort the source of truth: clients select `ultra`, core derives proactive multi-agent behavior when the turn is eligible for multi-agent V2, and inference requests continue to use the backend-compatible `max` value. ## What changed - Add `ultra` as a first-class reasoning effort and preserve model-catalog ordering when exposing it to clients. - Convert `ultra` to `max` at the inference request boundary, including Responses HTTP/WebSocket requests, startup prewarm, compaction, and memory summarization. - Derive effective multi-agent mode per turn from effective reasoning effort: - eligible multi-agent V2 + `ultra` → `proactive` - eligible multi-agent V2 + any other effort → `explicitRequestOnly` - V1 or otherwise ineligible sessions → no multi-agent mode instruction - Keep the derived effective mode in turn context history so successive turns can emit a developer-message update only when the effective mode changes. - Remove selected multi-agent mode from core session configuration, turn construction, thread settings, resume/fork restoration, and subagent spawn plumbing. Subagents inherit reasoning effort and derive their own effective mode. - Retain the experimental app-server `multiAgentMode` fields for wire compatibility while marking them deprecated. Request values are accepted but ignored; compatibility response fields report `explicitRequestOnly`. - Display Ultra in the TUI using the order supplied by `model/list`. ## Validation - `just test -p codex-core ultra_reasoning_uses_max_for_requests` - `just test -p codex-tui model_reasoning_selection_popup`
Shijie Rao ·
2026-06-24 20:13:52 -07:00 -
[codex] Inject agent graph store into ThreadManager (#29736)
Pick up the AgentGraphStore migration. - Inject an explicit optional agent graph store into `ThreadManager` - Move all calls to spawn, close, recursive resume, and subtree/archive/delete/feedback traversal through it - Keep using `LocalAgentGraphStore` when SQLite is available This required some changes to the interface to deal with futures: - The interface now matches `ThreadStore`'s object-safe pattern by returning a boxed `AgentGraphStoreFuture` directly, allowing `ThreadManager` to hold `Arc<dyn AgentGraphStore>` *Slight behavior change!* Unfiltered subtree enumeration now performs a single all-status breadth-first traversal, so a closed grandchild beneath an open edge is included; the previous Open-then-Closed traversals could not cross mixed-status paths and silently omitted it.
Tom ·
2026-06-24 13:24:10 -07:00 -
test: add app-server auto environment helper (#29746)
## Why Start moving towards app-server tests defaulting to running against remote & foreign OS executors. To do so we need a point of indirection similar to core integration tests' `build_with_auto_env`, but with the flexibility of letting tests control environment registration if they need to. ## What This adds: - `TestAppServer::new_with_auto_env()` for constructing an app server with a default environment defined by the test runner (e.g. bazel) - `TestAppServer::auto_env_params()` for tests to easily acquire turn env params tailored to the automatic environment - `TestAppServer::send_thread_start_request_with_auto_env()` to make it easy for tests to start a thread using the automatic environment The above methods all fail if the test calling them has set up an environment where the automatic environment configuration conflicts with test-created state. ## Validation Adds a couple of basic smoke tests to the app-server test suite. Follow-ups will migrate more tests to use it.
Adam Perry @ OpenAI ·
2026-06-24 01:06:29 +00:00 -
core tests: rename automatic environment builder (#29728)
## Why Use a clearer name for what happens when this helper sets up a test environment. ## What - Rename the builder and its harness wrapper to use `auto_env` instead of `remote_env` because the helper will set up a local environment if configured by the build system.
Adam Perry @ OpenAI ·
2026-06-23 21:45:06 +00:00 -
test: branch on target OS instead of runner flavor (#29712)
## Why Core tests should branch on the executor's operating system, not on runner details such as Docker or Wine. This keeps platform behavior stable as new test backends are added and reserves Wine-specific skips for actual runner debt. ## What - Add `TestTargetOs` and target/host-aware skip helpers while keeping `TestEnvironment` internal. - Replace topology enum access with remote predicates and a narrow Docker accessor. - Migrate OS-semantic Wine skips, preserve runner-specific gaps, and document the skip taxonomy. ## Validation - `just test -p core_test_support` - `just test -p codex-core remote_test_env_can_connect_and_use_filesystem` - `bazel test //codex-rs/core:core-all-wine-exec-test --test_output=errors` reached test execution; unrelated existing view-image, path, and timing failures remain. - `just test -p codex-core` and `just test` reached broad test execution; this checkout has unrelated helper, sandbox, and timing failures.
Adam Perry @ OpenAI ·
2026-06-23 14:27:13 -07:00 -
path-uri: clarify host-native path conversion (#29501)
## Why Downstream refactors are producing confusing code with this functionality having a very generic name. Encoding the specific conversion approach in the method name makes it clearer. ## What Rename `PathUri::from_path` to `PathUri::from_host_native_path` and update its Rust call sites.
Adam Perry @ OpenAI ·
2026-06-23 00:02:33 +00:00 -
Expose thread-level multi-agent mode (#28792)
## Why Once multi-agent mode can be selected per turn, clients also need to choose the initial selection when creating a thread and observe that selection through lifecycle and settings APIs. The selected value is intentionally distinct from the effective model-visible value: no client selection is represented as `null`, even though an eligible multi-agent v2 turn derives `explicitRequestOnly` as its effective default. ## What changed - Add the optional experimental `thread/start.multiAgentMode` parameter and pass it through thread creation. - Preserve an omitted initial value as an unset selection rather than eagerly storing `explicitRequestOnly`. - Apply an explicit `thread/start` selection to the first turn through the session configuration established at thread creation. - Restore the latest persisted effective mode as the selected baseline on cold resume when rollout history contains one. - Inherit the optional selected mode from a loaded parent when creating related runtime threads. - Return the current selected `multiAgentMode` from `thread/start`, `thread/resume`, `thread/fork`, and thread settings, using `null` when no mode is selected. - Keep lifecycle reporting independent from model capability and feature eligibility; core turn construction remains responsible for calculating and persisting the effective mode. ## Not covered - Clearing an existing loaded-session selection back to unset through `turn/start`; omitted or `null` currently retains the session's selection. - A TUI control, slash command, or `config.toml` preference. ## Verification - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol` - `CARGO_INCREMENTAL=0 just test -p codex-app-server multi_agent_mode` The focused app-server coverage verifies explicit `thread/start` initialization, first-turn prompting, nullable reporting for an omitted selection, and retention of selections that are not currently runtime-eligible. ## Stack Stacked on #28685. This PR contains only the thread initialization and lifecycle/settings API layer.
Shijie Rao ·
2026-06-19 10:50:44 +02:00 -
core: load AGENTS.md from foreign environments (#28958)
## Why Make it possible to load AGENTS.md from remote exec-servers whose OS is different than app-server. ## What - keep `AGENTS.md` discovery and provenance as `PathUri`, with root-aware parent and ancestor traversal - expose lifecycle instruction sources as legacy app-server path strings in events while retaining `PathUri` internally - preserve and test mixed POSIX and Windows paths in model context and TUI status output - cover remote Windows loading end to end by seeding the Wine prefix through host filesystem APIs - fix bug in `PathUri`'s parent() implementation that would erase Windows drive letters
Adam Perry @ OpenAI ·
2026-06-18 15:06:23 -07:00 -
current time reminders impl for system clock (varlatency 2/n) (#28824)
Stacked on #28822. ## Summary - add a host-injectable current-time provider with a built-in system implementation - record UTC developer reminders in history immediately before due model requests - keep cadence state per session and force a refresh after compaction This does NOT include the app server client <-> server clock logic. This PR is only for the reminder message & system clock that will be used in prod. ## Testing - `just test -p codex-core varlatency_` - `just clippy -p codex-core -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample` - `just fmt`
rka-oai ·
2026-06-18 19:18:42 +00:00 -
Support
openai/formextended form elicitations (#27500)# Summary Allow App Server clients to opt into `openai/form` MCP elicitations.
Gabriel Peal ·
2026-06-18 11:54:49 -07:00 -
[tests] Keep Apps out of generic core test harness (#28508)
## Summary - disable the stable Apps feature in the generic `test_codex()` integration-test harness - keep Apps-specific tests explicit: their builders re-enable Apps and point it at a local mock server ## Why Generic tests that use dummy ChatGPT auth were also enabling the host-owned `codex_apps` MCP server. That made unrelated tests contact `chatgpt.com` and wait for MCP startup, causing the Bazel timeouts observed on #28368. The generic harness should be hermetic and should not start an external service that the test did not request. This is test-only; production Apps behavior is unchanged. The broader optional-MCP startup behavior is being handled separately in #28407. ## Testing - `just test -p codex-core -E 'test(pre_sampling_compact_runs_when_comp_hash_changes) | test(model_switch_to_smaller_model_updates_token_context_window) | test(codex_apps_file_params_upload_local_paths_before_mcp_tool_call)'` - `just fix -p codex-core` - `just fmt`
jif ·
2026-06-16 13:07:43 +02:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
Run core integration tests against a Wine-backed Windows executor (#28401)
## Why We want to exercise a linux app-server against a windows exec-server without having to repeat every test case. This approach has slight precedent in the remote docker test setup. ## What Run the shared `codex-core` integration suite against Windows exec-server behavior from Linux. This makes cross-OS path and shell regressions visible while keeping unsupported cases owned by individual tests. - Add `local`, `docker`, and `wine-exec` test environment selection with legacy Docker compatibility. - Extend `codex_rust_crate` to generate a sharded Wine-exec variant using a cross-built Windows server and pinned Bazel Wine/PowerShell runtimes. - Teach remote-aware helpers about Windows paths and track temporary incompatibilities with source-local `skip_if_wine_exec!` calls and follow-up reasons.
Adam Perry @ OpenAI ·
2026-06-16 00:38:41 +00:00 -
[codex] exec-server honors remote environment cwd and shell (#28122)
## Why Next slice needed to make progress on the `remote_env_windows` test is to support passing a Windows cwd for the remote environment and using that environment's native shell. This lets the test run a real Windows process instead of only recording an early path or shell mismatch. ## What - change `TurnEnvironmentSelection.cwd` from `AbsolutePathBuf` to `PathUri` - convert local cwd values to URIs when constructing selections - preserve a remote primary cwd instead of replacing it with the local legacy fallback - prefer the selected environment's discovered shell for unified exec, falling back to the session shell when unavailable - convert back to a host-native absolute path at current native-only consumer boundaries - reject or deny unsupported foreign cwd values at the existing request-permissions boundary, with TODOs for its future migration - extend the hermetic Wine test to execute Windows PowerShell in `C:\windows` and verify successful process completion - record the current app-server rejection against the same Wine-backed remote Windows fixture when its cwd is supplied as a native Windows path
Adam Perry @ OpenAI ·
2026-06-14 06:07:46 +00:00 -
[codex] make PathUri::from_abs_path infallible (#27976)
## Why `PathUri::from_abs_path` can fail for absolute paths that do not have a normal `file:` URI representation, forcing filesystem call sites to handle a conversion error even though the original path can be preserved losslessly. ## What Make `from_abs_path` infallible and migrate its callers. Unrepresentable paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The leading encoded null reserves a namespace that cannot collide with a real Unix or Windows path, and fallback URIs remain opaque to lexical path operations. ## Validation Added path-URI coverage for Unix null and non-UTF-8 paths, Windows device/verbatim and non-Unicode paths, serialization, malformed fallbacks, opaque lexical operations, invalid native payloads, and literal `/bad/path` collision resistance.
Adam Perry @ OpenAI ·
2026-06-12 16:58:42 -07:00 -
[codex] Load user instructions through an injected provider (#27101)
## Why We want to remove implicit use of `$CODEX_HOME` from `codex-core` and make embedders responsible for supplying user-level instructions. This also ensures user instructions load when no primary environment is selected. ## What changed Stacked on #27415, which makes `codex exec` surface thread-scoped runtime warnings. - Added `UserInstructionsProvider` to `codex-extension-api`, with absolute source attribution and recoverable loading warnings. - Added `codex-home` with the filesystem-backed provider for `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback, trimming, lossy UTF-8 handling, and the existing uncapped global instruction size. - Removed global instruction loading from `Config` and require `ThreadManager` callers to inject a provider. - Load provider instructions once for each fresh root runtime, including runtimes without a primary environment. Running sessions retain their snapshot, while child agents inherit the parent snapshot without invoking the provider. - Keep provider instructions separate while loading project `AGENTS.md`, then assemble the model-visible instructions with the existing ordering, source attribution, warning, and turn-context behavior. - Wired the Codex home provider through the CLI, app server, MCP server, core facade, and thread-manager sample. ## Validation - `just test -p codex-home -p codex-extension-api` - `just test -p codex-core agents_md` - `just test -p codex-core guardian` - `just test -p codex-app-server thread_start_without_selected_environment_includes_only_global_instruction_source` - `just test -p codex-exec warning` - `just bazel-lock-check`
Adam Perry @ OpenAI ·
2026-06-11 19:28:47 +00:00 -
[codex] migrate ExecutorFileSystem paths to PathUri (#27424)
## Why We're moving exec-server to use PathUri for its internal path representations. ## What Move `ExecutorFileSystem` APIs to use `PathUri` instead of `AbsolutePathBuf`. Future changes will convert higher-level parts of exec-server.
Adam Perry @ OpenAI ·
2026-06-11 18:44:18 +00:00 -
Pair thread environment settings (#26687)
## Why Thread cwd and environment selections are a single logical setting in core: updating one without the other can silently desynchronize the next-turn execution context. This change makes that relationship explicit in the internal thread settings flow while preserving the existing app-server public API shape. ## What changed - Moved the cwd/environment pair through internal `ThreadSettingsOverrides.environment_settings` instead of a top-level internal `cwd` field. - Kept `thread/settings/update` public params unchanged, with app-server translating top-level `cwd` into the paired internal settings shape. - Moved `Op::UserInput` environment overrides into thread settings so user turns and settings updates use the same core path. - Updated core, app-server, MCP, memories, sample, and test callsites to construct the paired settings shape. ## Verification - `git diff --check` - Local test run starting after PR creation.
pakrym-oai ·
2026-06-08 13:55:15 -07:00 -
[codex] Use standalone tools for Responses Lite (#26490)
## Summary Responses Lite does not execute hosted Responses tools, so models using it must route web search and image generation through Codex-owned executors & standalone Response's API endpoints. This PR is stacked on #26487. ## Validation - `cargo test -p codex-core responses_lite_ --lib` - `cargo test -p codex-core standalone_executors_remain_hidden_without_flags_or_responses_lite --lib` - `cargo test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates --lib` - `cargo test -p codex-web-search-extension -p codex-image-generation-extension` - `cargo test -p codex-app-server --test all standalone_` - `cargo fmt --all -- --check`
rka-oai ·
2026-06-06 00:23:40 +00:00 -
Require absolute cwd in thread settings (#26532)
## Why Thread settings cwd overrides are expected to be resolved before they enter core. Keeping this boundary as a plain `PathBuf` made it easy for core/session code to keep fallback normalization and relative-path resolution logic in places that should only receive an already-resolved cwd. This is intentionally the absolute-cwd-only slice: it does not change environment selection stickiness or cwd-to-default-environment fallback behavior. ## What changed - Changes `ThreadSettingsOverrides.cwd`, `CodexThreadSettingsOverrides.cwd`, and `SessionSettingsUpdate.cwd` to use `AbsolutePathBuf`. - Removes core-side cwd normalization/resolution from session settings updates. - Updates affected core/app-server test helpers and callsites to pass existing absolute cwd values or use `abs()` helpers. ## Validation Opening as draft so CI can start while local validation continues.
pakrym-oai ·
2026-06-05 09:29:15 -07:00 -
Switch runtime to cloud config bundle (#24622)
## Summary - Adapts the moved `codex-cloud-config` crate from the legacy cloud requirements endpoint to the new config bundle endpoint. - Switches runtime consumers from `CloudRequirementsLoader` to `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered config and requirements. - Removes the legacy cloud requirements domain loader path. ## Details This intentionally keeps `codex-cloud-config` monolithic for review lineage: the previous PR establishes the crate move, and this PR shows the behavior change against that moved implementation. A follow-up PR splits the module back into focused files. The new bundle path preserves the important cloud requirements loader semantics where intended: account-scoped signed cache, 30 minute TTL, 5 minute refresh cadence, retry/backoff, auth recovery, and fail-closed startup loading. The cached payload changes from a single requirements TOML string to the backend-delivered bundle, and validation rejects malformed config or requirements fragments before cache write/use.
joeflorencio-openai ·
2026-06-02 13:18:59 -07:00 -
Add experimental turn additional context (#24154)
## Summary Adds experimental `additionalContext` support to `turn/start` and `turn/steer` so clients can provide ephemeral external context, such as browser or automation state, without turning that plumbing into a visible user prompt or triggering user-prompt lifecycle behavior. ## API Shape The parameter shape is: ```ts additionalContext?: Record<string, { value: string kind: "untrusted" | "application" }> | null ``` Example: ```json { "additionalContext": { "browser_info": { "value": "Active tab is CI failures.", "kind": "untrusted" }, "automation_info": { "value": "CI rerun is in progress.", "kind": "application" } } } ``` The keys are opaque and caller-defined. ## Context Injection When provided, accepted entries are inserted into model context as hidden contextual message items, not as visible thread user-message items. `kind: "untrusted"` entries are inserted with role `user`: ```text <external_${key}>${value}</external_${key}> ``` `kind: "application"` entries are inserted with role `developer`: ```text <${key}>${value}</${key}> ``` Values are not escaped. Each value is truncated to 1k approximate tokens before wrapping. For `turn/start`, accepted additional context is inserted before normal user input. For `turn/steer`, additional context is merged only when the steer includes non-empty user input; context-only steers still reject as empty input. ## Dedupe Strategy `AdditionalContextStore` lives on session state and stores the latest complete additional-context map. Each `turn/start` or non-empty `turn/steer` treats its `additionalContext` as the current complete set of values. Entries are injected only when the key is new or the exact entry for that key changed, including `value` or `kind`. After merging, the store is replaced with the provided map, so omitted keys are removed from the retained set and can be injected again later if reintroduced. Omitting `additionalContext`, passing `null`, or passing an empty object resets the store to empty and injects nothing. ## What Changed - Threads experimental v2 `additionalContext` through app-server into core turn start and steer handling. - Adds separate contextual fragment types for untrusted user-role context and application developer-role context. - Uses pending response input items so additional context can be combined with normal user input without treating it as prompt text. - Adds integration coverage for start/steer flow, role routing, dedupe/reset behavior, deletion/re-add behavior, hook-blocked input behavior, empty context-only steer rejection, external-fragment marker matching, and truncation.pakrym-oai ·
2026-05-26 13:02:34 -07:00 -
Honor client-resolved service tier defaults (#23537)
## Why Model catalog responses can now advertise a nullable `default_service_tier` for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference: - no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled - explicit `default`, meaning the user intentionally wants standard routing - explicit catalog tier ids such as `priority`, `flex`, or future tiers Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server `thread/start` / `turn/start` updates. ## What Changed - Plumbed `default_service_tier` through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions. - Added the request-only `default` service tier sentinel and normalized legacy config spelling so `fast` in `config.toml` still materializes as the runtime/request id `priority`. - Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change. - Updated app-server thread lifecycle config construction so `serviceTier: null` preserves explicit standard-routing intent by mapping to `default` instead of internal `None`. - Kept core responsible for validating explicit tiers against the current model and stripping `default` before `/v1/responses`, without applying catalog defaults itself. ## Validation - `CARGO_INCREMENTAL=0 cargo build -p codex-cli` - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list` - `cargo test -p codex-tui service_tier` - `cargo test -p codex-protocol service_tier_for_request` - `cargo test -p codex-core get_service_tier` - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier`
Shijie Rao ·
2026-05-20 15:57:50 -07:00 -
Make local environment optional in EnvironmentManager (#23369)
## Summary - make `EnvironmentManager` local environment/runtime paths optional - simplify constructor surface around snapshot materialization - rename local env accessors to `require_local_environment` / `try_local_environment` ## Validation - devbox Bazel build for touched crate surfaces - `//codex-rs/exec-server:exec-server-unit-tests` - `//codex-rs/app-server-client:app-server-client-unit-tests` - filtered touched `//codex-rs/core:core-unit-tests` cases
starr-openai ·
2026-05-19 12:55:34 -07:00 -
[3 of 7] Remove UserTurn (#23075)
**Stack position:** [3 of 7] ## Summary This PR finishes the input-op consolidation by moving the remaining `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`. This touches a lot of files, but it is a low-risk mechanical migration. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) (this PR) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 19:56:00 -07:00 -
[codex] Remove legacy shell output formatting paths (#22706)
## Why The client and tool pipeline still carried compatibility code for legacy structured shell output. Current shell and apply_patch responses are already plain text for model consumption, so keeping a JSON-serialization path plus shell-item rewrite logic makes the request formatter and tests preserve a format we do not need anymore. ## What Changed - Removed the client-side shell output rewrite from `core/src/client_common.rs`. - Removed the structured exec-output formatter and the shell `freeform` switch so tool emitters use one model-facing formatter. - Collapsed apply_patch/shell serialization tests around the remaining plain-text output expectations and removed duplicate one-variant parameterized cases. - Kept the `ApplyPatchModelOutput::ShellCommandViaHeredoc` compatibility input shape, but no longer treats it as a separate output-format mode. ## Validation - `cargo test -p codex-core client_common` - `cargo test -p codex-core shell_serialization` - `cargo test -p codex-core apply_patch_cli` - `just fix -p codex-core` ## Documentation No external Codex documentation update is needed.
pakrym-oai ·
2026-05-18 09:57:54 -07:00 -
chore(features) rm Feature::ApplyPatchFreeform (#22711)
## Summary Removes the feature since this is effectively on by default in all cases where we should use it, or can be configured via models.json. ## Testing - [x] unit tests pass
Dylan Hurd ·
2026-05-14 16:15:56 -07:00 -
Fix remote environment test fixtures (#22572)
## Why The Docker remote-env coverage was failing before it reached the behavior those tests are meant to exercise. The remote-aware test fixture only registered the remote environment, so tests that intentionally select both `local` and `remote` could not start a turn. After that was fixed, two tests exposed stale fixtures: the approval test was auto-approving under workspace-write, and the remote `view_image` test was writing invalid PNG bytes. ## What Changed - Added `EnvironmentManager::create_for_tests_with_local(...)` so tests can keep the provider default while also selecting `local` explicitly. - Updated `build_remote_aware()` to use that test-only manager when a remote exec-server URL is present. - Changed the remote apply-patch approval helper to use `SandboxPolicy::new_read_only_policy()` so the test actually exercises approval caching per environment. - Replaced the hardcoded remote `view_image` PNG blob with the existing `png_bytes(...)` helper so the test uses a valid image fixture. ## Validation Ran these isolated Docker remote-env tests on the devbox with `$remote-tests` setup: - `suite::remote_env::apply_patch_freeform_routes_to_selected_remote_environment` - `suite::remote_env::apply_patch_approvals_are_remembered_per_environment` - `suite::remote_env::apply_patch_intercepted_exec_command_routes_to_selected_remote_environment` - `suite::remote_env::exec_command_routes_to_selected_remote_environment` - `suite::view_image::view_image_routes_to_selected_remote_environment` All five pass.
starr-openai ·
2026-05-14 12:40:01 -07:00 -
[codex] Remove unused legacy shell tools (#22246)
## Why Recent session history showed no active use of the raw `shell`, `local_shell`, or `container.exec` execution surfaces. Keeping those handlers/specs wired into core leaves duplicate shell execution paths alongside the supported `shell_command` and unified exec tools. ## What changed - Removed the raw `shell` handler/spec and its `ShellToolCallParams` protocol helper. - Removed the legacy `local_shell` and `container.exec` handler/spec plumbing while preserving persisted-history compatibility for old response items. - Normalized model/config `default` and `local` shell selections to `shell_command`. - Pruned tests that exercised removed raw-shell/local-shell/apply-patch variants and kept coverage on `shell_command`, unified exec, and freeform `apply_patch`. ## Verification - `git diff --check` - `cargo test -p codex-protocol` - `cargo test -p codex-tools` - `cargo test -p codex-core tools::handlers::shell` - `cargo test -p codex-core tools::spec` - `cargo test -p codex-core tools::router` - `cargo test -p codex-core active_call_preserves_triggering_command_context` - `cargo test -p codex-core guardian_tests` - `cargo test -p codex-core --test all shell_serialization` - `cargo test -p codex-core --test all apply_patch_cli` - `cargo test -p codex-core --test all shell_command_` - `cargo test -p codex-core --test all local_shell` - `cargo test -p codex-core --test all otel::` - `cargo test -p codex-core --test all hooks::` - `just fix -p codex-core` - `just fix -p codex-tools`
pakrym-oai ·
2026-05-13 16:43:25 +00:00 -
extension: wire extension registries into sessions (#21737)
## Why [#21736](https://github.com/openai/codex/pull/21736) introduces the typed extension API, but the runtime does not yet carry a registry through thread/session startup or give contributors host-owned stores to read from. This PR wires that host-side path so later feature migrations can move product-specific behavior behind typed contributions without adding another bespoke seam directly to `codex-core`. ## What changed - Thread `ExtensionRegistry<Config>` through `ThreadManager`, `CodexSpawnArgs`, `Session`, and sub-agent spawn paths. - Wire `ThreadStartContributor` and `ContextContributor` - Expose the small supporting surface needed by non-core callers that construct threads directly, including `empty_extension_registry()` through `codex-core-api`. This PR lands the host plumbing only: the app-server registry is still empty, and concrete feature migrations are intended to follow separately.
jif-oai ·
2026-05-11 11:38:18 +02:00 -
tests: cover sandbox link write behavior (#21819)
## Why [PR #1705](https://github.com/openai/codex/pull/1705) moved `apply_patch` execution under the configured sandbox and called out the need for integration coverage. We already covered textual `../` escapes, but did not have coverage for link aliases that live inside a writable workspace while pointing at, or aliasing, files visible outside it. This PR locks in the current sandbox boundary without changing production write semantics. Symlink escapes into a read-only outside root should fail and leave the outside file unchanged. Existing hard links are characterized separately: if a user-created hard link already exists inside the writable root, sandboxed writes preserve normal hard-link semantics rather than replacing the link and silently breaking that relationship. ## What Changed - Added `apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace` to verify `apply_patch` cannot update a symlink that targets a file outside the writable workspace. - Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace` to verify `apply_patch` intentionally writes through an existing hard link and does not unlink or replace it. - Added `file_system_sandboxed_write_preserves_existing_hard_link` to verify sandboxed `fs/writeFile` preserves an existing hard link and writes the shared inode. ## Testing - `cargo test -p codex-exec-server file_system_sandboxed_write` - `cargo test -p codex-core apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace` - `cargo test -p codex-core apply_patch_cli_preserves_existing_hard_link_outside_workspace` - `just fix -p codex-exec-server -p codex-core` - `just fix -p codex-core` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819). * #21845 * __->__ #21819
Michael Bolin ·
2026-05-09 08:28:15 -07:00 -
[codex] Delete function-style apply_patch (#21651)
## Why `apply_patch` is now a freeform/custom tool. Keeping the old JSON/function-style registration and parsing path left another way for models and tests to invoke `apply_patch`, which made the tool surface harder to reason about. ## What changed - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch` spec, and handler support for function payloads. - Kept `apply_patch_tool_type = freeform` as the supported model metadata path, including Bedrock catalog metadata. - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool calls. ## Verification - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider` - `cargo test -p codex-core tools::handlers::apply_patch --lib` - `cargo test -p codex-core --test all apply_patch_tool_executes_and_emits_patch_events` - `cargo test -p codex-core --test all apply_patch_reports_parse_diagnostics` - `cargo test -p codex-exec test_apply_patch_tool` - `just fix -p codex-core` - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p codex-exec`
pakrym-oai ·
2026-05-08 13:00:57 -07:00 -
[codex] request desktop attestation from app (#20619)
## Summary TL;DR: teaches `codex-rs` / app-server to request a desktop-provided attestation token and attach it as `x-oai-attestation` on the scoped ChatGPT Codex request paths.  ## Details This PR teaches the Codex app-server runtime how to request and attach an attestation token. It does not generate DeviceCheck tokens directly; instead, it relies on the connected desktop app to advertise that it can generate attestation and then asks that app for a fresh header value when needed. The flow is: 1. The Codex desktop app connects to app-server. 2. During `initialize`, the app can advertise that it supports `requestAttestation`. 3. Before app-server calls selected ChatGPT Codex endpoints, it sends the internal server request `attestation/generate` to the app. 4. app-server receives a pre-encoded header value back. 5. app-server forwards that value as `x-oai-attestation` on the scoped outbound requests. The code in this repo is mostly protocol and runtime plumbing: it adds the app-server request/response shape, introduces an attestation provider in core, wires that provider into Responses / compaction / realtime setup paths, and covers the intended scoping with tests. The signed macOS DeviceCheck generation remains owned by the desktop app PR. ## Related PR - Codex desktop app implementation: https://github.com/openai/openai/pull/878649 ## Validation <details> <summary>Tests run</summary> ```sh cargo test -p codex-app-server-protocol cargo test -p codex-core attestation --lib cargo test -p codex-app-server --lib attestation ``` Also ran: ```sh just fix -p codex-core just fix -p codex-app-server just fix -p codex-app-server-protocol just fmt just write-app-server-schema ``` </details> <details> <summary>E2E DeviceCheck validation</summary> First validated the signed desktop app boundary directly: launched a packaged signed `Codex.app`, sent `attestation/generate`, decoded the returned `v1.` attestation header, and validated the extracted DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using bundle ID `com.openai.codex`. Apple returned `status_code: 200` and `is_ok: true`. Then ran the fuller app + app-server flow. The packaged `Codex.app` launched a current-branch app-server via `CODEX_CLI_PATH`, and a local MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server requested `attestation/generate` from the real Electron app process, and the intercepted `/backend-api/codex/responses` traffic included `x-oai-attestation` on both routes: ```text GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present ``` The captured header decoded to a DeviceCheck token that also validated with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`, team `2DC432GLL2`). </details> --------- Co-authored-by: Codex <noreply@openai.com>
Jiaming Zhang ·
2026-05-08 12:36:02 -07:00 -
Add CODEX_HOME environments TOML provider (#20666)
## Why After stdio transports and provider-owned defaults exist, Codex needs a config-backed provider that can describe more than the single legacy `CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without activating it in product entrypoints yet, keeping parser/validation review separate from runtime wiring. **Stack position:** this is PR 4 of 5. It builds on PR 3's provider/default model and adds the `environments.toml` provider used by PR 5. ## What Changed - Add `environment_toml.rs` as the TOML-specific home for parsing, validation, and provider construction. - Keep the TOML schema/provider structs private; the public constructor added here is `EnvironmentManager::from_codex_home(...)`. - Add `TomlEnvironmentProvider`, including validation for: - reserved ids such as `local` and `none` - duplicate ids - unknown explicit defaults - empty programs or URLs - exactly one of `url` or `program` per configured environment - Support websocket environments with `url = "ws://..."` / `wss://...`. - Support stdio-command environments with `program = "..."`. - Add helpers to load `environments.toml` from `CODEX_HOME`, but do not wire entrypoints to call them yet. - Add the `toml` dependency for parsing. ## Stack - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server listener - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server client transport - 3. https://github.com/openai/codex/pull/20665 - Make environment providers own default selection - **4. This PR:** https://github.com/openai/codex/pull/20666 - Add CODEX_HOME environments TOML provider - 5. https://github.com/openai/codex/pull/20667 - Load configured environments from CODEX_HOME Split from original draft: https://github.com/openai/codex/pull/20508 ## Validation Not run locally; this was split out of the original draft stack. ## Documentation This introduces the config shape for `environments.toml`; user-facing documentation should be added before this stack is treated as a documented public workflow. --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-05-08 01:37:47 +00:00 -
Revert state DB injection and agent graph store (#21481)
## Why Reverts #20689 to restore the previous optional state DB plumbing. The conflict resolution keeps the newer installation ID and session/thread identity changes that landed after #20689, while removing the mandatory state DB and agent graph store dependency from ThreadManager construction. ## What changed - Restored `Option<StateDbHandle>` through app-server, MCP server, prompt debug, and test entry points. - Removed the `codex-core` dependency on `codex-agent-graph-store` and reverted descendant lookup back to the existing state DB path when available. - Kept newer `installation_id` forwarding by passing it beside the optional DB handle. - Kept local thread-name updates working when the optional state DB handle is absent. ## Validation - `git diff --check` - `cargo test -p codex-thread-store` - `cargo test -p codex-state -p codex-rollout -p codex-app-server-protocol` - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p codex-app-server -p codex-app-server-client -p codex-mcp-server -p codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607 2026-01-19)` on `aarch64-apple-darwin`.
pakrym-oai ·
2026-05-06 22:48:29 -07:00 -
2- Use string service tiers in session protocol (#20971)
## Summary - break service tier session/op/app-server protocol fields from the closed enum to string tier ids - send the service tier string directly through model requests, prewarm, compaction, memories, and TUI/app-server turn starts - regenerate app-server protocol JSON/TypeScript schemas, removing the standalone ServiceTier TS enum ## Verification - just fmt - cargo check -p codex-core -p codex-app-server -p codex-tui - just write-app-server-schema --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-05-06 18:00:21 +03:00 -
Move installation ID resolution out of core startup (#21182)
## Summary - resolve or inject the installation ID before core startup and pass it through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain `String` - keep child sessions on the parent installation ID instead of rediscovering it inside core - propagate installation ID startup failures in `mcp-server` instead of panicking ## Why Core was still touching the filesystem on the session startup path to discover `installation_id`. This moves that work to the outer host boundary so core no longer depends on `codex_home` reads during session construction. --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-06 10:48:54 +00:00 -
Inject state DB, agent graph store (#20689)
## Why We want the agent graph store to be passed down the stack as a real dependency, the same way we already treat the thread store. This will let us inject the agent graph store as a real dependency and support implementations other than the local SQLite-backed one. Right now most code instantiates a state DB and an agent graph store just-in-time. Ideally, we would not depend on the state DB directly but only read through the higher-level interfaces. This change makes the dependency boundaries explicit and moves state DB initialization to process bootstrap instead of hiding it inside local store implementations. ## What changed - `ThreadManager` now requires a `StateDbHandle` and an `AgentGraphStore` at construction time instead of treating them as optional internals. - The local store constructors no longer lazily initialize SQLite. Callers now initialize the state DB once per process and use that shared handle to build: - `LocalThreadStore` - `LocalAgentGraphStore` - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the thread-manager sample) now initialize the state DB up front and inject the resulting handle down the stack. - `app-server` now consistently uses its process-scoped state DB handle instead of reopening SQLite or trying to recover it from loaded threads. - Device-key storage now reuses the shared state DB handle instead of maintaining its own lazy opener. - The thread archive / descendant traversal paths now use the injected `AgentGraphStore` instead of reaching through local thread-store-specific state. ## Verification - `cargo check -p codex-core -p codex-thread-store -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample --tests` - `cargo test -p codex-thread-store` - `cargo test -p codex-core thread_manager_accepts_separate_agent_graph_store_and_thread_store -- --nocapture` - `cargo test -p codex-app-server thread_archive_archives_spawned_descendants -- --nocapture`
Rasmus Rygaard ·
2026-05-05 21:45:29 +00:00 -
state: pass state db handles through consumers (#20561)
## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.
Ruslan Nigmatullin ·
2026-05-04 11:46:03 -07:00 -
Make thread store process-scoped (#19474)
- Build one app-server process ThreadStore from startup config and share it with ThreadManager and CodexMessageProcessor. - Remove per-thread/fork store reconstruction so effective thread config cannot switch the persistence backend. - Add params to ThreadStore create/resume for specifying thread metadata, since otherwise the metadata from store creation would be used (incorrectly).
Tom ·
2026-04-30 21:24:59 -07:00 -
Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager construction and ModelProvider
pakrym-oai ·
2026-04-29 17:22:41 -07:00 -
Add ThreadManager sample crate (#20141)
Summary: - Add codex-thread-manager-sample, a one-shot binary that starts a ThreadManager thread, submits a prompt, and prints the final assistant output. - Pass ThreadStore into ThreadManager::new and expose thread_store_from_config for existing callsites. - Build the sample Config directly with only --model and prompt inputs. Verification: - just fmt - cargo check -p codex-thread-manager-sample -p codex-app-server -p codex-mcp-server - git diff --check Tests: Not run per request.
pakrym-oai ·
2026-04-29 11:21:06 -07:00 -
Add environment provider snapshot (#20058)
## Summary - Change `EnvironmentProvider` to return concrete `Environment` instances instead of `EnvironmentConfigurations`. - Make `DefaultEnvironmentProvider` provide the provider-visible `local` environment plus optional `remote` environment from `CODEX_EXEC_SERVER_URL`. - Keep `EnvironmentManager` as the concrete cache while exposing its own explicit local environment for `local_environment()` fallback paths. ## Validation - `just fmt` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com>
starr-openai ·
2026-04-28 20:05:18 -07:00 -
core tests: build user turns from permission profiles (#20011)
## Summary - Add `turn_permission_fields()` so tests that construct `Op::UserTurn` directly can provide a canonical `PermissionProfile` while still filling the required legacy `sandbox_policy` compatibility field. - Migrate direct user-turn construction in core integration tests from `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`. - Continue reducing direct `SandboxPolicy` usage in `codex-rs/core/tests`, from 41 files after #20010 to 32 files in this PR. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`
Michael Bolin ·
2026-04-28 17:03:20 -07:00 -
core tests: submit turns with permission profiles (#20010)
## Summary - Add `PermissionProfile`-based turn submission helpers to `core_test_support`, while keeping the legacy `SandboxPolicy` helper for tests that intentionally exercise legacy fallback behavior. - Switch the default `TestCodex::submit_turn()` path to send a real `PermissionProfile` plus the required legacy compatibility projection in `Op::UserTurn`. - Migrate straightforward app/search/shell/truncation tests from `SandboxPolicy::{DangerFullAccess, ReadOnly}` to `PermissionProfile::{Disabled, read_only}`. - Add a TUI compatibility projection helper for legacy app-server fields so non-legacy writable roots are preserved instead of being downgraded to read-only. - Fix remote start/resume/fork sandbox-mode projection to classify any managed profile with writable roots as workspace-write, not only profiles that can write `cwd`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47 files to 41 files without changing production behavior. ## Testing - `cargo check -p codex-core --tests` - `cargo test -p codex-tui compatibility_profile_preserves_unbridgeable_write_roots` - `cargo test -p codex-tui sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`Michael Bolin ·
2026-04-28 23:01:40 +00:00 -
permissions: add built-in default profiles (#19900)
## Why The migration away from `SandboxPolicy` needs new configs to start from permissions profiles instead of deriving profiles from legacy sandbox modes. Existing users can have empty `config.toml` files, and we should not rewrite user-owned config files that may live in shared repositories. This PR introduces built-in profile names so an empty config can resolve to a canonical `PermissionProfile`, while explicit named `[permissions]` profiles still behave predictably. ## What changed - Adds built-in `default_permissions` profile names: - `:read-only` maps to `PermissionProfile::read_only()`. - `:workspace` maps to the workspace-write profile, including project-root metadata carveouts. - `:danger-no-sandbox` maps to `PermissionProfile::Disabled`, preserving the distinction between no sandbox and a broad managed sandbox. - Reserves the `:` prefix for built-in profiles so user-defined `[permissions]` profiles cannot collide with future built-ins. - Allows `default_permissions` to reference a built-in profile without requiring a `[permissions]` table. - Makes an otherwise empty config choose a built-in profile by trust/platform context: trusted or untrusted project roots use `:workspace` when the platform supports that sandbox, while roots without a trust decision use `:read-only`. - Keeps legacy `sandbox_mode` configs on the legacy path, and still rejects user-defined `[permissions]` profiles that omit `default_permissions` so we do not silently guess among custom profiles. - Preserves compatibility behavior for implicit defaults: bare `network.enabled = true` allows runtime network without starting the managed proxy, explicit profile proxy policy still starts the proxy, and implicit workspace/add-dir roots keep legacy metadata carveouts. ## Verification - `cargo test -p codex-core builtin --lib` - `cargo test -p codex-core profile_network_proxy_config` - `cargo test -p codex-core implicit_builtin_workspace_profile_preserves_add_dir_metadata_carveouts` - `cargo test -p codex-core permissions_profiles_network_enabled_allows_runtime_network_without_proxy` - `cargo test -p codex-core permissions_profiles_proxy_policy_starts_managed_network_proxy` ## Documentation Public Codex config docs should mention these built-in names when the `[permissions]` config format is ready to document as stable. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19900). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * #20016 * #20015 * #20013 * #20011 * #20010 * #20008 * __->__ #19900
Michael Bolin ·
2026-04-28 11:21:39 -07:00 -
tui: carry permission profiles on user turns (#18285)
## Why Per-turn permission overrides should use the same canonical profile abstraction as session configuration. That lets TUI submissions preserve exact configured permissions without round-tripping through legacy sandbox fields. ## What changed This adds `permission_profile` to user-turn operations, threads it through TUI/app-server submission paths, fills the new field in existing test fixtures, and adds coverage that composer submission includes the configured profile. ## Verification - `cargo test -p codex-tui permissions -- --nocapture` - `cargo test -p codex-core --test all permissions_messages -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285). * #18288 * #18287 * #18286 * __->__ #18285
Michael Bolin ·
2026-04-23 11:54:17 -07:00 -
codex: support hooks in config.toml and requirements.toml (#18893)
## Summary Support the existing hooks schema in inline TOML so hooks can be configured from both `config.toml` and enterprise-managed `requirements.toml` without requiring a separate `hooks.json` payload. This gives enterprise admins a way to ship managed hook policy through the existing requirements channel while still leaving script delivery to MDM or other device-management tooling, and it keeps `hooks.json` working unchanged for existing users. This also lays the groundwork for follow-on managed filtering work such as #15937, while continuing to respect project trust gating from #14718. It does **not** implement `allow_managed_hooks_only` itself. NOTE: yes, it's a bit unfortunate that the toml isn't formatted as closely as normal to our default styling. This is because we're trying to stay compatible with the spec for plugins/hooks that we'll need to support & the main usecase here is embedding into requirements.toml ## What changed - moved the shared hook serde model out of `codex-rs/hooks` into `codex-rs/config` so the same schema can power `hooks.json`, inline `config.toml` hooks, and managed `requirements.toml` hooks - added `hooks` support to both `ConfigToml` and `ConfigRequirementsToml`, including requirements-side `managed_dir` / `windows_managed_dir` - treated requirements-managed hooks as one constrained value via `Constrained`, so managed hook policy is merged atomically and cannot drift across requirement sources - updated hook discovery to load requirements-managed hooks first, then per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning when a single layer defines both representations - threaded managed hook metadata through discovered handlers and exposed requirements hooks in app-server responses, generated schemas, and `/debug-config` - added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`, `codex-rs/core/src/config_loader/tests.rs`, and `codex-rs/core/tests/suite/hooks.rs` ## Testing - `cargo test -p codex-config` - `cargo test -p codex-hooks` - `cargo test -p codex-app-server config_api` ## Documentation Companion updates are needed in the developers website repo for: - the hooks guide - the config reference, sample, basic, and advanced pages - the enterprise managed configuration guide --------- Co-authored-by: Michael Bolin <mbolin@openai.com>
Andrei Eternal ·
2026-04-22 21:20:09 -07:00