Follow up to #15357 by making proactive ChatGPT auth refresh depend on
the access token's JWT expiration instead of treating `last_refresh` age
as the primary source of truth.
* Add
`OutgoingMessageSender::send_server_notification_to_connection_and_wait`
which returns only once message is written to websocket (or failed to do
so)
* Use this mechanism to apply back pressure to stdout/stderr streams of
processes spawned by `command/exec`, to limit them to at most one
message in-memory at a time
* Use back pressure signal to also batch smaller chunks into ≈64KiB ones
This should make commands execution more robust over
high-latency/low-throughput networks
### Summary
Make `FileWatcher` a reusable core component which can be built upon.
Extract skills-related logic into a separate `SkillWatcher`.
Introduce a composable `ThrottledWatchReceiver` to throttle filesystem
events, coalescing affected paths among them.
### Testing
Updated existing unit tests.
- Prefer plugin manifest `interface.displayName` for plugin labels.
- Preserve plugin provenance when handling `list_mcp_tools` so connector
`plugin_display_names` are not clobbered.
- Add a TUI test to ensure plugin-owned app mentions are deduped
correctly.
## Summary
- replace the second-compaction test fixtures with a single ordered
`/responses` sequence
- assert against the real recorded request order instead of aggregating
per-mock captures
- realign the second-summary assertion to the first post-compaction user
turn where the summary actually appears
## Root cause
`compact_resume_after_second_compaction_preserves_history` collected
requests from multiple `mount_sse_once_match` recorders. Overlapping
matchers could record the same HTTP request more than once, so the test
indexed into a duplicated synthetic list rather than the true request
stream. That made the summary assertion depend on matcher evaluation
order and platform-specific behavior.
## Impact
- makes the flaky test deterministic by removing duplicate request
capture from the assertion path
- keeps the change scoped to the test only
## Validation
- `just fmt`
- `just argument-comment-lint`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- repeated the same targeted test 10 times
---------
Co-authored-by: Codex <noreply@openai.com>
Make the inter-agent communication start a turn
As part of this, we disable the v2 notifier to prevent some odd
behaviour where the agent restart working while you're talking to it for
example
Updates plugin ordering so installed plugins are listed first, with
alphabetical sorting applied within the installed and uninstalled
groups. The behavior is now consistent across both `tui` and
`tui_app_server`, and related tests/snapshots were updated.
Bumps
[vedantmgoyal9/winget-releaser](https://github.com/vedantmgoyal9/winget-releaser)
from 19e706d4c9121098010096f9c495a70a7518b30f to
7bd472be23763def6e16bd06cc8b1cdfab0e2fd5.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/vedantmgoyal9/winget-releaser/commit/7bd472be23763def6e16bd06cc8b1cdfab0e2fd5"><code>7bd472b</code></a>
docs: add description to inputs (<a
href="https://redirect.github.com/vedantmgoyal9/winget-releaser/issues/335">#335</a>)</li>
<li><a
href="https://github.com/vedantmgoyal9/winget-releaser/commit/a43926ed822e5a076ac8a81e0a794915cbad51d1"><code>a43926e</code></a>
fix: cargo command not found in <code>ubuntu-slim</code> runner (<a
href="https://redirect.github.com/vedantmgoyal9/winget-releaser/issues/334">#334</a>)</li>
<li>See full diff in <a
href="https://github.com/vedantmgoyal9/winget-releaser/compare/19e706d4c9121098010096f9c495a70a7518b30f...7bd472be23763def6e16bd06cc8b1cdfab0e2fd5">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This PR completes the conversion of non-interactive `codex exec` to use
app server rather than directly using core events and methods.
### Summary
- move `codex-exec` off exec-owned `AuthManager` and `ThreadManager`
state
- route exec bootstrap, resume, and auth refresh through existing
app-server paths
- replace legacy `codex/event/*` decoding in exec with typed app-server
notification handling
- update human and JSONL exec output adapters to translate existing
app-server notifications only
- clean up "app server client" layer by eliminating support for legacy
notifications; this is no longer needed
- remove exposure of `authManager` and `threadManager` from "app server
client" layer
### Testing
- `exec` has pretty extensive unit and integration tests already, and
these all pass
- In addition, I asked Codex to put together a comprehensive manual set
of tests to cover all of the `codex exec` functionality (including
command-line options), and it successfully generated and ran these tests
## Problem
Today `codex-network-proxy` rejects a global `*` in
`network.allowed_domains`, so there is no static way to configure a
denylist-only posture for public hosts. Users have to enumerate broad
allowlist patterns instead.
## Approach
- Make global wildcard acceptance field-specific: `allowed_domains` can
use `*`, while `denied_domains` still rejects a global wildcard.
- Keep the existing evaluation order, so explicit denies still win first
and local/private protections still apply unless separately enabled.
- Add coverage for the denylist-only behavior and update the README to
document it.
## Validation
- `just fmt`
- `cargo test -p codex-network-proxy` (full run had one unrelated flaky
telemetry test:
`network_policy::tests::emit_block_decision_audit_event_emits_non_domain_event`;
reran in isolation and it passed)
- `cargo test -p codex-network-proxy
network_policy::tests::emit_block_decision_audit_event_emits_non_domain_event
-- --exact --nocapture`
- `just fix -p codex-network-proxy`
- `just argument-comment-lint`
Show all plugin marketplaces in the /plugins popup by removing the
`openai-curated` marketplace filter, and update plugin popup
copy/tests/snapshots to match the new behavior in both TUI codepaths.
## Summary
- move the pure sandbox policy transform helpers from `codex-core` into
`codex-sandboxing`
- move the corresponding unit tests with the extracted implementation
- update `core` and `app-server` callers to import the moved APIs
directly, without re-exports or proxy methods
## Testing
- cargo test -p codex-sandboxing
- cargo test -p codex-core sandboxing
- cargo test -p codex-app-server --lib
- just fix -p codex-sandboxing
- just fix -p codex-core
- just fix -p codex-app-server
- just fmt
- just argument-comment-lint
## Summary
- update the self-serve business usage-based limit message to direct
users to their admin for additional credits
- add a focused unit test for the self_serve_business_usage_based plan
branch
Added also:
If you are at a rate limit but you still have credits, codex cli would
tell you to switch the model. We shouldnt do this if you have credits so
fixed this.
## Test
- launched the source-built CLI and verified the updated message is
shown for the self-serve business usage-based plan

## Summary
- move macOS permission merging/intersection logic and tests from
`codex-core` into `codex-sandboxing`
- move seatbelt policy builders, permissions logic, SBPL assets, and
their tests into `codex-sandboxing`
- keep `codex-core` owning only the seatbelt spawn wrapper and switch
call sites to import the moved APIs directly
## Notes
- no re-exports added
- moved the seatbelt tests with the implementation so internal helpers
could stay private
- local verification is still finishing while this PR is open
## Summary
- add a new `codex-sandboxing` crate for sandboxing extraction work
- move the pure Linux sandbox argv builders and their unit tests out of
`codex-core`
- keep `core::landlock` as the spawn wrapper and update direct callers
to use `codex_sandboxing::landlock`
## Testing
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core landlock`
- `cargo test -p codex-cli debug_sandbox`
- `just argument-comment-lint`
## Notes
- this is step 1 of the move plan aimed at minimizing per-PR diffs
- no re-exports or no-op proxy methods were added
## Summary
- add `ForkSnapshotMode` to `ThreadManager::fork_thread` so callers can
request either a committed snapshot or an interrupted snapshot
- share the model-visible `<turn_aborted>` history marker between the
live interrupt path and interrupted forks
- update the small set of direct fork callsites to pass
`ForkSnapshotMode::Committed`
Note: this enables /btw to work similarly as Esc to interrupt (hopefully
somewhat in distribution)
---------
Co-authored-by: Codex <noreply@openai.com>
## What changed
- adds a targeted snapshot test for rollback with contextual diffs in
`codex_tests.rs`
- snapshots the exact model-visible request input before the rolled-back
turn and on the follow-up request after rollback
- shows the duplicate developer and environment context pair appearing
again before the follow-up user message
## Why
Rollback currently rewinds the reference context baseline without
rewinding the live session overrides. On the next turn, the same
contextual diff is emitted again and duplicated in the request sent to
the model.
## Impact
- makes the regression visible in a canonical snapshot test
- keeps the snapshot on the shared `context_snapshot` path without
adding new formatting helpers
- gives a direct repro for future fixes to rollback/context
reconstruction
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Adds support for approvals_reviewer to `Op::UserTurn` so we can migrate
`[CodexMessageProcessor::turn_start]` to use Op::UserTurn
## Testing
- [x] Adds quick test for the new field
Co-authored-by: Codex <noreply@openai.com>
Use `serde` to encode the inter agent communication to an assistant
message and use the decode to see if this is such a message
Note: this assume serde on small pattern is fast enough
- add `PreToolUse` hook for bash-like tool execution only at first
- block shell execution before dispatch with deny-only hook behavior
- introduces common.rs matcher framework for matching when hooks are run
example run:
```
› run three parallel echo commands, and the second one should echo "[block-pre-tool-use]" as a test
• Running the three echo commands in parallel now and I’ll report the output directly.
• Running PreToolUse hook: name for demo pre tool use hook
• Running PreToolUse hook: name for demo pre tool use hook
• Running PreToolUse hook: name for demo pre tool use hook
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo "first parallel echo"
PreToolUse hook (blocked)
warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo "third parallel echo"
• Ran echo "first parallel echo"
└ first parallel echo
• Ran echo "third parallel echo"
└ third parallel echo
• Three little waves went out in parallel.
1. printed first parallel echo
2. was blocked before execution because it contained the exact test string [block-pre-tool-use]
3. printed third parallel echo
There was also an unrelated macOS defaults warning around the successful commands, but the echoes
themselves worked fine. If you want, I can rerun the second one with a slightly modified string so
it passes cleanly.
```
## Summary
- route /realtime, Ctrl+C, and deleted realtime meters through the same
realtime stop path
- keep generic transcription placeholder cleanup free of realtime
shutdown side effects
## Testing
- Ran
- Relied on CI for verification; did not run local tests
---------
Co-authored-by: Codex <noreply@openai.com>