codex

Fix #2296 Add "minimal" reasoning effort for GPT 5 models (#2326 )

This pull request resolves #2296; I've confirmed if it works by:

1. Add settings to ~/.codex/config.toml:
```toml
model_reasoning_effort = "minimal"
```

2. Run the CLI:
```
cd codex-rs
cargo build && RUST_LOG=trace cargo run --bin codex
/status
tail -f ~/.codex/log/codex-tui.log
```

Co-authored-by: pakrym-oai <pakrym@openai.com>

Kazuhiro Sera · 2025-08-15 12:59:52 -07:00

dcfdd2faf5

fix: introduce codex-protocol crate (#2355 )

Michael Bolin · 2025-08-15 12:44:40 -07:00

d262244725

tui: skip identical consecutive entries in local composer history (#2352 )

This PR avoids inserting duplicate consecutive messages into the Chat
Composer's local history.

Jeremy Rose · 2025-08-15 10:55:44 -07:00

7c26c8e091

feat: introduce ClientRequest::SendUserTurn (#2345 )

This adds a new request type, `SendUserTurn`, that makes it possible to
submit a `Op::UserTurn` operation (introduced in #2329) to a
conversation. This PR also adds a new integration test that verifies
that changing from `AskForApproval::UnlessTrusted` to
`AskForApproval::Never` mid-conversation ensures that an elicitation is
no longer sent for running `python3 -c print(42)`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345).
* __->__ #2345
* #2329
* #2343
* #2340
* #2338

Michael Bolin · 2025-08-15 10:05:58 -07:00

eda50d8372

feat: introduce Op:UserTurn (#2329 )

This introduces `Op::UserTurn`, which makes it possible to override many
of the fields that were set when the `Session` was originally created
when creating a new conversation turn. This is one way we could support
changing things like `model` or `cwd` in the middle of the conversation,
though we may want to consider making each field optional, or
alternatively having a separate `Op` that mutates the `TurnContext`
associated with a `submission_loop()`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2329).
* #2345
* __->__ #2329
* #2343
* #2340
* #2338

Michael Bolin · 2025-08-15 09:56:05 -07:00

17aa394ae7

feat: introduce TurnContext (#2343 )

This PR introduces `TurnContext`, which is designed to hold a set of
fields that should be constant for a turn of a conversation. Note that
the fields of `TurnContext` were previously governed by `Session`.

Ultimately, we want to enable users to change these values between turns
(changing model, approval policy, etc.), though in the current
implementation, the `TurnContext` is constant for the entire
conversation.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345).
* #2345
* #2329
* __->__ #2343
* #2340
* #2338

Michael Bolin · 2025-08-15 09:40:02 -07:00

13ed67cfc1

tui: align diff display by always showing sign char and keeping fixed gutter (#2353 )

diff lines without a sign char were misaligned.

Jeremy Rose · 2025-08-15 09:32:45 -07:00

45d6c74682

fix: try to fix flakiness in test_shell_command_approval_triggers_elicitation (#2344 )

I still see flakiness in
`test_shell_command_approval_triggers_elicitation()` on occasion where
`MockServer` claims it has not received all of its expected requests.

I recently introduced a similar type of test in #2264,
`test_codex_jsonrpc_conversation_flow()`, which I have not seen flake
(yet!), so this PR pulls over two things I did in that test:

- increased `worker_threads` from `2` to `4`
- added an assertion to make sure the `task_complete` notification is
received

Honestly, I'm still not sure why `MockServer` claims it sometimes does
not receive all its expected requests given that we assert that the
final `JSONRPCResponse` is read on the stream, but let's give this a
shot.

Assuming this fixes things, my hypothesis is that the increase in
`worker_threads` helps because perhaps there are async tasks in
`MockServer` that do not reliably complete fully when there are not
enough threads available? If that is correct, it seems like the test
would still be flaky, though perhaps with lower frequency?

Michael Bolin · 2025-08-15 09:17:20 -07:00

265fd89e31

fix: introduce MutexExt::lock_unchecked() so we stop ignoring unwrap() throughout codex.rs (#2340 )

This way we are sure a dangerous `unwrap()` does not sneak in!

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2340).
* #2345
* #2329
* #2343
* __->__ #2340
* #2338

Michael Bolin · 2025-08-15 09:14:44 -07:00

6730592433

fix: tighten up checks against writable folders for SandboxPolicy (#2338 )

I was looking at the implementation of `Session::get_writable_roots()`,
which did not seem right, as it was a copy of writable roots, which is
not guaranteed to be in sync with the `sandbox_policy` field.

I looked at who was calling `get_writable_roots()` and its only call
site was `apply_patch()` in `codex-rs/core/src/apply_patch.rs`, which
took the roots and forwarded them to `assess_patch_safety()` in
`safety.rs`. I updated `assess_patch_safety()` to take `sandbox_policy:
&SandboxPolicy` instead of `writable_roots: &[PathBuf]` (and replaced
`Session::get_writable_roots()` with `Session::get_sandbox_policy()`).

Within `safety.rs`, it was fairly easy to update
`is_write_patch_constrained_to_writable_paths()` to work with
`SandboxPolicy`, and in particular, it is far more accurate because, for
better or worse, `SandboxPolicy::get_writable_roots_with_cwd()` _returns
an empty vec_ for `SandboxPolicy::DangerFullAccess`, suggesting that
_nothing_ is writable when in reality _everything_ is writable. With
this PR, `is_write_patch_constrained_to_writable_paths()` now does the
right thing for each variant of `SandboxPolicy`.

I thought this would be the end of the story, but it turned out that
`test_writable_roots_constraint()` in `safety.rs` needed to be updated,
as well. In particular, the test was writing to
`std::env::current_dir()` instead of a `TempDir`, which I suspect was a
holdover from earlier when `SandboxPolicy::WorkspaceWrite` would always
make `TMPDIR` writable on macOS, which made it hard to write tests to
verify `SandboxPolicy` in `TMPDIR`. Fortunately, we now have
`exclude_tmpdir_env_var` as an option on
`SandboxPolicy::WorkspaceWrite`, so I was able to update the test to
preserve the existing behavior, but to no longer write to
`std::env::current_dir()`.







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2338).
* #2345
* #2329
* #2343
* #2340
* __->__ #2338

Michael Bolin · 2025-08-15 09:06:15 -07:00

26c8373821

[tools] Add apply_patch tool (#2303 )

## Summary
We've been seeing a number of issues and reports with our synthetic
`apply_patch` tool, e.g. #802. Let's make this a real tool - in my
anecdotal testing, it's critical for GPT-OSS models, but I'd like to
make it the standard across GPT-5 and codex models as well.

## Testing
- [x] Tested locally
- [x] Integration test

Dylan · 2025-08-15 11:55:53 -04:00

6df8e35314

tui: include optional full command line in history display (#2334 )

Add env var to show the raw, unparsed command line under parsed
commands. When we have transcript mode we should show the full command
there, but this is useful for debugging.

Jeremy Rose · 2025-08-14 22:06:42 -07:00

917e29803b

Format multiline commands (#2333 )

<img width="966" height="729" alt="image"
src="https://github.com/user-attachments/assets/fa45b7e1-cd46-427f-b2bc-8501e9e4760b"
/>
<img width="797" height="530" alt="image"
src="https://github.com/user-attachments/assets/6993eec5-e157-4df7-b558-15643ad10d64"
/>

pakrym-oai · 2025-08-14 19:49:42 -07:00

5552688621

Cleanup rust login server a bit more (#2331 )

Remove some extra abstractions.

---------

Co-authored-by: easong-openai <easong@openai.com>

pakrym-oai · 2025-08-14 19:42:14 -07:00

76df07350a

re-implement session id in status (#2332 )

Basically the same thing as https://github.com/openai/codex/pull/2297

easong-openai · 2025-08-15 02:14:46 +00:00

d0b907d399

Added allow-expect-in-tests / allow-unwrap-in-tests (#2328 )

This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible

Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around

Parker Thompson · 2025-08-14 17:59:01 -07:00

a075424437

AGENTS.md more strongly suggests running targeted tests first (#2306 )

Jeremy Rose · 2025-08-15 00:51:32 +00:00

8bdb4521c9

fix: trying to simplify rust-ci.yml (#2327 )

It turns out that https://github.com/openai/codex/pull/2324 did not
quite work as intended. Chat's new idea is to have this catch-all "CI
results" job and update our branch protection rules to require this
instead.

Michael Bolin · 2025-08-14 17:44:10 -07:00

dd63d61a59

Fix AF_UNIX, sockpair, recvfrom in linux sandbox (#2309 )

When using codex-tui on a linux system I was unable to run `cargo
clippy` inside of codex due to:
```
[pid 3548377] socketpair(AF_UNIX, SOCK_SEQPACKET|SOCK_CLOEXEC, 0,  <unfinished ...>
[pid 3548370] close(8 <unfinished ...>
[pid 3548377] <... socketpair resumed>0x7ffb97f4ed60) = -1 EPERM (Operation not permitted)
```
And
```
3611300 <... recvfrom resumed>0x708b8b5cffe0, 8, 0, NULL, NULL) = -1 EPERM (Operation not permitted)
```

This PR:
* Fixes a bug that disallowed AF_UNIX to allow it on `socket()`
* Adds recvfrom() to the syscall allow list, this should be fine since
we disable opening new sockets. But we should validate there is not a
open socket inheritance issue.
* Allow socketpair to be called for AF_UNIX
* Adds tests for AF_UNIX components
* All of which allows running `cargo clippy` within the sandbox on
linux, and possibly other tooling using a fork server model + AF_UNIX
comms.

Parker Thompson · 2025-08-14 17:12:41 -07:00

c26d42ab69

Port login server to rust (#2294 )

Port the login server to rust.

---------

Co-authored-by: pakrym-oai <pakrym@openai.com>

easong-openai · 2025-08-14 17:11:26 -07:00

e9b597cfa3

clear running commands in various places (#2325 )

we have a very unclear lifecycle for the chatwidget—this should only
have to be added in one place! but this fixes the "hanging commands"
issue where the active_exec_cell wasn't correctly cleared when commands
finished.

To repro w/o this PR:
1. prompt "run sleep 10"
2. once the command starts running, press <kbd>Esc</kbd>
3. prompt "run echo hi"

Expected: 

```
✓ Completed
  └ ⌨️ echo hi

codex
hi
```

Actual:

```
⚙︎ Working
  └ ⌨️ echo hi

▌ Ask Codex to do anything
```

i.e. the "Working" never changes to "Completed".

The bug is fixed with this PR.

Jeremy Rose · 2025-08-15 00:01:19 +00:00

afc377bae5

fix: ensure rust-ci always "runs" when a PR is submitted (#2324 )

Our existing path filters for `rust-ci.yml`:


https://github.com/openai/codex/blob/235987843c3d6647c0819c1071f9b9f064673e9c/.github/workflows/rust-ci.yml#L1-L11

made it so that PRs that touch only `README.md` would not trigger those
builds, which is a problem because our branch protection rules are set
as follows:

<img width="1569" height="1883" alt="Screenshot 2025-08-14 at 4 45
59 PM"
src="https://github.com/user-attachments/assets/5a61f8cc-cdaf-4341-abda-7faa7b46dbd4"
/>

With the existing setup, a change to `README.md` would get stuck in
limbo because not all the CI jobs required to merge would get run. It
turns out that we need to "run" all the jobs, but make them no-ops when
the `codex-rs` and `.github` folders are untouched to get the best of
both worlds.

I asked chat how to fix this, as we want CI to be fast for
documentation-only changes. It had two suggestions:

- Use https://github.com/dorny/paths-filter or some other third-party
action.
- Write an inline Bash script to avoid a third-party dependency.

This PR takes the latter approach so that we are clear about what we're
running in CI.

Michael Bolin · 2025-08-14 17:00:19 -07:00

333803ed04

add a timer to running exec commands (#2321 )

sometimes i switch back to codex and i don't know how long a command has
been running.

<img width="744" height="462" alt="Screenshot 2025-08-14 at 3 30 07 PM"
src="https://github.com/user-attachments/assets/bd80947f-5a47-43e6-ad19-69c2995a2a29"
/>

Jeremy Rose · 2025-08-14 19:32:45 -04:00

235987843c

fix: add call_id to ApprovalParams in mcp-server/src/wire_format.rs (#2322 )

Clients still need this field.

Michael Bolin · 2025-08-14 16:09:12 -07:00

6a0f709cff

fix: run python_multiprocessing_lock_works integration test on Mac and Linux (#2318 )

The high-order bit on this PR is that it makes it so `sandbox.rs` tests
both Mac and Linux, as we introduce a general
`spawn_command_under_sandbox()` function with platform-specific
implementations for testing.

An important, and interesting, discovery in porting the test to Linux is
that (for reasons cited in the code comments), `/dev/shm` has to be
added to `writable_roots` on Linux in order for `multiprocessing.Lock`
to work there. Granting write access to `/dev/shm` comes with some
degree of risk, so we do not make this the default for Codex CLI.

Piggybacking on top of #2317, this moves the
`python_multiprocessing_lock_works` test yet again, moving
`codex-rs/core/tests/sandbox.rs` to `codex-rs/exec/tests/sandbox.rs`
because in `codex-rs/exec/tests` we can use `cargo_bin()` like so:

```
let codex_linux_sandbox_exe = assert_cmd::cargo::cargo_bin("codex-exec");
```

which is necessary so we can use `codex_linux_sandbox_exe` and therefore
`spawn_command_under_linux_sandbox` in an integration test.

This also moves `spawn_command_under_linux_sandbox()` out of `exec.rs`
and into `landlock.rs`, which makes things more consistent with
`seatbelt.rs` in `codex-core`.

For reference, https://github.com/openai/codex/pull/1808 is the PR that
made the change to Seatbelt to get this test to pass on Mac.

Michael Bolin · 2025-08-14 15:47:48 -07:00

2ecca79663

fix: move general sandbox tests to codex-rs/core/tests/sandbox.rs (#2317 )

Previous to this PR, `codex-rs/core/tests/sandbox.rs` contained
integration tests that were specific to Seatbelt. This PR moves those
tests to `codex-rs/core/src/seatbelt.rs` and designates
`codex-rs/core/tests/sandbox.rs` to be used as the home for
cross-platform (well, Mac and Linux...) sandbox tests.

To start, this migrates
`python_multiprocessing_lock_works_under_seatbelt()` from #1823 to the
new `sandbox.rs` because this is the type of thing that should work on
both Mac _and_ Linux, though I still need to do some work to clean up
the test so it works on both platforms.

Michael Bolin · 2025-08-14 14:48:38 -07:00

a8c7f5391c

test(core): add seatbelt sem lock tests (#1823 )

## Summary
- add a unit test to ensure the macOS seatbelt policy allows POSIX
semaphores
- add a macOS-only test that runs a Python multiprocessing Lock under
Seatbelt

## Testing
- `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem
--no-fail-fast` (failed: failed to download from
`https://static.crates.io/crates/tokio-stream/0.1.17/download`)
- `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem
--no-fail-fast --offline` (failed: attempting to make an HTTP request,
but --offline was specified)
- `cargo test --all-features --no-fail-fast --offline` (failed:
attempting to make an HTTP request, but --offline was specified)
- `just fmt` (failed: command not found: just)
- `just fix` (failed: command not found: just)

Ran tests locally to confirm it passes on master and failed before my
previous change

------
https://chatgpt.com/codex/tasks/task_i_6890f221e0a4833381cfb53e11499bcc

David Z Hao · 2025-08-14 14:23:06 -07:00

992e81d9b5

fix bash commands being incorrectly quoted in display (#2313 )

The "display format" of commands was sometimes producing incorrect
quoting like `echo foo '>' bar`, which is importantly different from the
actual command that was being run. This refactors ParsedCommand to have
a string in `cmd` instead of a vec, as a `vec` can't accurately capture
a full command.

Jeremy Rose · 2025-08-14 17:08:29 -04:00

7038827bf4

use a central animation loop (#2268 )

instead of each shimmer needing to have its own animation thread, have
render_ref schedule a new frame if it wants one and coalesce to the
earliest next frame. this also makes the animations
frame-timing-independent, based on start time instead of frame count.

Jeremy Rose · 2025-08-14 16:59:47 -04:00

20cd61e2a4

text elements in textarea for pasted content (#2302 )

This improves handling of pasted content in the textarea. It's no longer
possible to partially delete a placeholder (e.g. by ^W or ^D), nor is it
possible to place the cursor inside a placeholder. Also, we now render
placeholders in a different color to make them more clearly
differentiated.


https://github.com/user-attachments/assets/2051b3c3-963d-4781-a610-3afee522ae29

Jeremy Rose · 2025-08-14 20:58:51 +00:00

fd2b059504

fix: do not allow dotenv to create/modify environment variables starting with CODEX_ (#2308 )

This ensures Codex cannot drop a `.env` file with a value of
`CODEX_HOME` that points to a folder that Codex can control.

Michael Bolin · 2025-08-14 13:57:15 -07:00

c25f3ea53e

fix: parallelize logic in Session::new() (#2305 )

#2291 made it so that `Session::new()` is on the critical path to
`Codex::spawn()`, which means it is on the hot path to CLI startup. This
refactors `Session::new()` to run a number of async tasks in parallel
that were previously run serially to try to reduce latency.

Michael Bolin · 2025-08-14 13:29:58 -07:00

8f11652458

remove logs from composer by default (#2307 )

Currently the composer shows `handle_codex_event:<event name>` by
default which feels confusing. Let's make it appear in trace.

aibrahim-oai · 2025-08-14 13:01:15 -07:00

b62c2d9552

remove the · animation (#2271 )

the pulsing dot felt too noisy to me next to the shimmering "Working"
text. we'll bring it back for streaming response text perhaps?

Jeremy Rose · 2025-08-14 19:30:41 +00:00

475ba13479

[context] Store context messages in rollouts (#2243 )

## Summary
Currently, we use request-time logic to determine the user_instructions
and environment_context messages. This means that neither of these
values can change over time as conversations go on. We want to add in
additional details here, so we're migrating these to save these messages
to the rollout file instead. This is simpler for the client, and allows
us to append additional environment_context messages to each turn if we
want

## Testing
- [x] Integration test coverage
- [x] Tested locally with a few turns, confirmed model could reference
environment context and cached token metrics were reasonably high

Dylan · 2025-08-14 14:51:13 -04:00

544980c008

remove "status text" in bottom line (#2279 )

this used to hold the most recent log line, but it was kinda broken and
not that useful.

Jeremy Rose · 2025-08-14 14:10:21 -04:00

b42e679227

HistoryCell is a trait (#2283 )

refactors HistoryCell to be a trait instead of an enum. Also collapse
the many "degenerate" HistoryCell enums which were just a store of lines
into a single PlainHistoryCell type.

The goal here is to allow more ways of rendering history cells (e.g.
expanded/collapsed/"live"), and I expect we will return to more varied
types of HistoryCell as we develop this area.

Jeremy Rose · 2025-08-14 14:10:05 -04:00

585f7b0679

Tag InputItem (#2304 )

Instead of:
```
{ Text: { text: string } }
```

It is now:
```
{ type: "text", data: { text: string } }
```
which makes for cleaner discriminated unions

Gabriel Peal · 2025-08-14 17:58:04 +00:00

cdd33b2c04

exploration: create Session as part of Codex::spawn() (#2291 )

Historically, `Codex::spawn()` would create the instance of `Codex` and
enforce, by construction, that `Op::ConfigureSession` was the first `Op`
submitted via `submit()`. Then over in `submission_loop()`, it would
handle the case for taking the parameters of `Op::ConfigureSession` and
turning it into a `Session`.

This approach has two challenges from a state management perspective:


https://github.com/openai/codex/blob/f968a1327ad39a7786759ea8f1d1c088fe41e91b/codex-rs/core/src/codex.rs#L718

- The local `sess` variable in `submission_loop()` has to be `mut` and
`Option<Arc<Session>>` because it is not invariant that a `Session` is
present for the lifetime of the loop, so there is a lot of logic to deal
with the case where `sess` is `None` (e.g., the `send_no_session_event`
function and all of its callsites).
- `submission_loop()` is written in such a way that
`Op::ConfigureSession` could be observed multiple times, but in
practice, it is only observed exactly once at the start of the loop.

In this PR, we try to simplify the state management by _removing_ the
`Op::ConfigureSession` enum variant and constructing the `Session` as
part of `Codex::spawn()` so that it can be passed to `submission_loop()`
as `Arc<Session>`. The original logic from the `Op::ConfigureSession`
has largely been moved to the new `Session::new()` constructor.

---

Incidentally, I also noticed that the handling of `Op::ConfigureSession`
can result in events being dispatched in addition to
`EventMsg::SessionConfigured`, as an `EventMsg::Error` is created for
every MCP initialization error, so it is important to preserve that
behavior:


https://github.com/openai/codex/blob/f968a1327ad39a7786759ea8f1d1c088fe41e91b/codex-rs/core/src/codex.rs#L901-L916

Though admittedly, I believe this does not play nice with #2264, as
these error messages will likely be dispatched before the client has a
chance to call `addConversationListener`, so we likely need to make it
so `newConversation` automatically creates the subscription, but we must
also guarantee that the "ack" from `newConversation` is returned before
any other conversation-related notifications are sent so the client
knows what `conversation_id` to match on.

Michael Bolin · 2025-08-14 09:55:28 -07:00

cf7a7e63a3

feat: add support for an InterruptConversation request (#2287 )

This adds `ClientRequest::InterruptConversation`, which effectively maps
directly to `Op::Interrupt`.

---

* __->__  #2287
* #2286
* #2285

Michael Bolin · 2025-08-13 23:12:03 -07:00

f968a1327a

fix: add support for exec and apply_patch approvals in the new wire format (#2286 )

Now when `CodexMessageProcessor` receives either a
`EventMsg::ApplyPatchApprovalRequest` or a
`EventMsg::ExecApprovalRequest`, it sends the appropriate request from
the server to the client. When it gets a response, it forwards it on to
the `CodexConversation`.

Note this takes a lot of code from:


https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/conversation_loop.rs

https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/exec_approval.rs

https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/patch_approval.rs

I am copy/pasting for now because I am trying to consolidate around the
new `wire_format.rs`, so I plan to delete these other files soon.

Now that we have requests going both from client-to-server and
server-to-client, I renamed `CodexRequest` to `ClientRequest`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2286).
* #2287
* __->__ #2286
* #2285

Michael Bolin · 2025-08-13 23:00:50 -07:00

539f4b290e

fix: make all fields of Session private (#2285 )

As `Session` needs a bit of work, it will make things easier to move
around if we can start by reducing the extent of its public API. This
makes all the fields private, though adds three `pub(crate)` getters.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2285).
* #2287
* #2286
* __->__ #2285

Michael Bolin · 2025-08-13 22:53:54 -07:00

085f166707

Use enhancement tag for feature requests (#2282 )

Kazuhiro Sera · 2025-08-14 12:08:35 +09:00

6d0eb9128e

Clarify PR/Contribution guidelines and issue templates (#2281 )

Co-authored-by: Dylan <dylan.hurd@openai.com>

Gabriel Peal · 2025-08-13 21:56:29 -04:00

e8ffecd632

Parse reasoning text content (#2277 )

Sometimes COT is returns as text content instead of `ReasoningText`. We
should parse it but not serialize back on requests.

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>

pakrym-oai · 2025-08-13 18:39:58 -07:00

f1be7978cf

fix: verify notifications are sent with the conversationId set (#2278 )

This updates `CodexMessageProcessor` so that each notification it sends
for a `EventMsg` from a `CodexConversation` such that:

- The `params` always has an appropriate `conversationId` field.
- The `method` is now includes the name of the `EventMsg` type rather
than using `codex/event` as the `method` type for all notifications. (We
currently prefix the method name with `codex/event/`, but I think that
should go away once we formalize the notification schema in
`wire_format.rs`.)

As part of this, we update `test_codex_jsonrpc_conversation_flow()` to
verify that the `task_finished` notification has made it through the
system instead of sleeping for 5s and "hoping" the server finished
processing the task. Note we have seen some flakiness in some of our
other, similar integration tests, and I expect adding a similar check
would help in those cases, as well.

Michael Bolin · 2025-08-13 17:54:12 -07:00

a62510e0ae

feat: support traditional JSON-RPC request/response in MCP server (#2264 )

This introduces a new set of request types that our `codex mcp`
supports. Note that these do not conform to MCP tool calls so that
instead of having to send something like this:

```json
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "id": 42,
  "params": {
    "name": "newConversation",
    "arguments": {
      "model": "gpt-5",
      "approvalPolicy": "on-request"
    }
  }
}
```

we can send something like this:


```json
{
  "jsonrpc": "2.0",
  "method": "newConversation",
  "id": 42,
  "params": {
    "model": "gpt-5",
    "approvalPolicy": "on-request"
  }
}
```

Admittedly, this new format is not a valid MCP tool call, but we are OK
with that right now. (That is, not everything we might want to request
of `codex mcp` is something that is appropriate for an autonomous agent
to do.)

To start, this introduces four request types:

- `newConversation`
- `sendUserMessage`
- `addConversationListener`
- `removeConversationListener`

The new `mcp-server/tests/codex_message_processor_flow.rs` shows how
these can be used.

The types are defined on the `CodexRequest` enum, so we introduce a new
`CodexMessageProcessor` that is responsible for dealing with requests
from this enum. The top-level `MessageProcessor` has been updated so
that when `process_request()` is called, it first checks whether the
request conforms to `CodexRequest` and dispatches it to
`CodexMessageProcessor` if so.

Note that I also decided to use `camelCase` for the on-the-wire format,
as that seems to be the convention for MCP.

For the moment, the new protocol is defined in `wire_format.rs` within
the `mcp-server` crate, but in a subsequent PR, I will probably move it
to its own crate to ensure the protocol has minimal dependencies and
that we can codegen a schema from it.



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2264).
* #2278
* __->__ #2264

Michael Bolin · 2025-08-13 17:36:29 -07:00

e7bad650ff

Enable reasoning for codex-prefixed models (#2275 )

## Summary
- enable reasoning for any model slug starting with `codex-`
- provide default model info for `codex-*` slugs
- test that codex models are detected and support reasoning

## Testing
- `just fmt`
- `just fix` *(fails: E0658 `let` expressions in this position are
unstable)*
- `cargo test --all-features` *(fails: E0658 `let` expressions in this
position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_689d13f8705483208a6ed21c076868e1

pakrym-oai · 2025-08-13 17:02:50 -07:00

de2c6a2ce7

fix: skip cargo test for release builds on ordinary CI because it is slow, particularly with --all-features set (#2276 )

I put this PR together because I noticed I have to wait quite a bit
longer on my PRs since we added
https://github.com/openai/codex/pull/2242 to catch more build issues.

I think we should think about reigning in our use of create features,
but this should be good enough to speed things up for now.

Michael Bolin · 2025-08-13 16:27:20 -07:00

3a0656df63

tui: standardize tree prefix glyphs to └ (#2274 )

Replace mixed `⎿` and `L` prefixes with `└` in TUI rendering.

<img width="454" height="659" alt="Screenshot 2025-08-13 at 4 02 03 PM"
src="https://github.com/user-attachments/assets/61c9c7da-830b-4040-bb79-a91be90870ca"
/>

Jeremy Rose · 2025-08-13 19:14:03 -04:00

bb9ce3cb78

832 Commits