codex

Fix EventMsg Optional (#3604 )

dedrisian-oai · 2025-09-15 00:34:33 +00:00

2aa84b8891

Handle resuming/forking after compact (#3533 )

We need to construct the history differently when compaction has
happened: only the history after the compact is considered, and the
compact itself is converted to a response item.

This should switch to `build_compact_history` once #3446 is merged.

Ahmed Ibrahim · 2025-09-14 13:23:31 +00:00

bbea6bbf7e

Review Mode (Core) (#3401 )

## 📝 Review Mode -- Core

This PR introduces the Core implementation for Review mode:

- New op `Op::Review { prompt: String }`: spawns a child review task
with isolated context, a review‑specific system prompt, and a
`Config.review_model`.
- `EnteredReviewMode`: emitted when the child review session starts.
Every event from this point onwards reflects the review session.
- `ExitedReviewMode(Option<ReviewOutputEvent>)`: emitted when the review
finishes or is interrupted, with optional structured findings:

```json
{
  "findings": [
    {
      "title": "<≤ 80 chars, imperative>",
      "body": "<valid Markdown explaining *why* this is a problem; cite files/lines/functions>",
      "confidence_score": <float 0.0-1.0>,
      "priority": <int 0-3>,
      "code_location": {
        "absolute_file_path": "<file path>",
        "line_range": {"start": <int>, "end": <int>}
      }
    }
  ],
  "overall_correctness": "patch is correct" | "patch is incorrect",
  "overall_explanation": "<1-3 sentence explanation justifying the overall_correctness verdict>",
  "overall_confidence_score": <float 0.0-1.0>
}
```
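
As a reading aid, the payload above can be mirrored with plain Rust types. This is only a sketch: the field names come from the JSON shape, but `ReviewFinding`, `CodeLocation`, `LineRange`, and the `describe_exit` helper are hypothetical names chosen for illustration, and the real types presumably derive serde traits that are omitted here.

```rust
/// Line span inside a file (hypothetical type name; fields follow the JSON shape).
#[derive(Debug, Clone, PartialEq)]
pub struct LineRange {
    pub start: u32,
    pub end: u32,
}

/// Hypothetical type name; fields follow the JSON shape.
#[derive(Debug, Clone, PartialEq)]
pub struct CodeLocation {
    pub absolute_file_path: String,
    pub line_range: LineRange,
}

/// Hypothetical type name; fields follow the JSON shape.
#[derive(Debug, Clone, PartialEq)]
pub struct ReviewFinding {
    pub title: String,
    pub body: String,
    pub confidence_score: f32,
    pub priority: u8,
    pub code_location: CodeLocation,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ReviewOutputEvent {
    pub findings: Vec<ReviewFinding>,
    /// Either "patch is correct" or "patch is incorrect".
    pub overall_correctness: String,
    pub overall_explanation: String,
    pub overall_confidence_score: f32,
}

/// Hypothetical client-side helper: turns the optional payload carried by
/// `ExitedReviewMode` into a one-line status message.
pub fn describe_exit(output: Option<&ReviewOutputEvent>) -> String {
    match output {
        // Interrupted reviews may carry no structured findings.
        None => "review ended without structured findings".to_string(),
        Some(out) => format!(
            "{} ({} finding(s))",
            out.overall_correctness,
            out.findings.len()
        ),
    }
}
```

Modelling the payload as an `Option` keeps the interrupted case explicit, so a client cannot forget to handle it.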

## Questions

### Why separate out its own message history?

We want the review thread to match the training of our review models as
much as possible -- that means using a custom prompt, removing user
instructions, and starting a clean chat history.

We also want to make sure the review thread doesn't leak into the parent
thread.

### Why do this as a mode, vs. sub-agents?

1. We want review to be a synchronous task, so it's fine for now to do a
bespoke implementation.
2. We're still unclear about the final structure for sub-agents. We'd
prefer to land this quickly and then refactor into sub-agents without
rushing that implementation.

dedrisian-oai · 2025-09-12 23:25:10 +00:00

90a0fd342f

feat: context compaction (#3446 )

## Compact feature:
1. Stops the model when the context window becomes too large
2. Adds a user turn asking the model to summarize
3. Builds a bridge that contains all the previous user messages plus the
summary, rendered from a template
4. Starts sampling again from a clean conversation with only that bridge
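
The four steps can be sketched roughly as follows. Everything here is an assumption made for illustration: the trigger threshold, the function names, and the bridge layout are invented stand-ins, and the real bridge is rendered from a template that is not shown in this description.

```rust
/// Hypothetical trigger for step 1: compact once usage reaches the window size.
pub fn needs_compaction(tokens_used: u64, context_window: u64) -> bool {
    tokens_used >= context_window
}

/// Step 3: build the "bridge" from all previous user messages plus the
/// model-written summary. The layout is a made-up stand-in for the template.
pub fn build_bridge(user_messages: &[&str], summary: &str) -> String {
    let mut bridge = String::from("Previous user messages:\n");
    for msg in user_messages {
        bridge.push_str("- ");
        bridge.push_str(msg);
        bridge.push('\n');
    }
    bridge.push_str("Summary of the conversation so far:\n");
    bridge.push_str(summary);
    bridge
}

/// Step 4: the new, clean conversation history contains only the bridge.
pub fn compacted_history(user_messages: &[&str], summary: &str) -> Vec<String> {
    vec![build_bridge(user_messages, summary)]
}
```

Step 2 (asking the model for the summary) is a model call and is left out; the sketch only covers the bookkeeping around it.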

jif-oai · 2025-09-12 13:07:10 -07:00

ea225df22e

feat: reasoning effort as optional (#3527 )

Allow the reasoning effort to be optional

jif-oai · 2025-09-12 12:06:33 -07:00

c6fd056aa6

feat: change the behavior of SetDefaultModel RPC so None clears the value. (#3529 )

It turns out that we want slightly different behavior for the
`SetDefaultModel` RPC because some models do not work with reasoning
(like GPT-4.1), so we should be able to explicitly clear this value.

Verified in `codex-rs/mcp-server/tests/suite/set_default_model.rs`.
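
The difference between the two behaviors can be illustrated with a small sketch. The `Defaults` struct and both function names are hypothetical, and it assumes that `None` clears either field; the actual handler lives in the MCP server and is not reproduced here.

```rust
/// Hypothetical stand-in for the persisted defaults in `config.toml`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Defaults {
    pub model: Option<String>,
    pub reasoning_effort: Option<String>,
}

/// Earlier behavior: a field is overwritten only when a value is supplied.
pub fn apply_overwrite_if_set(
    defaults: &mut Defaults,
    model: Option<String>,
    reasoning_effort: Option<String>,
) {
    if let Some(m) = model {
        defaults.model = Some(m);
    }
    if let Some(r) = reasoning_effort {
        defaults.reasoning_effort = Some(r);
    }
}

/// Behavior described in this change: `None` explicitly clears the value.
pub fn apply_none_clears(
    defaults: &mut Defaults,
    model: Option<String>,
    reasoning_effort: Option<String>,
) {
    defaults.model = model;
    defaults.reasoning_effort = reasoning_effort;
}
```

With the second variant, a client switching to a model that does not support reasoning can send `None` for `reasoning_effort` and be sure no stale value remains.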

Michael Bolin · 2025-09-12 11:35:51 -07:00

abdcb40f4c

feat: added SetDefaultModel to JSON-RPC server (#3512 )

This adds `SetDefaultModel`, which takes `model` and `reasoning_effort`
as optional fields. If set, the field will overwrite what is in the
user's `config.toml`.

This reuses logic that was added to support the `/model` command in the
TUI: https://github.com/openai/codex/pull/2799.

Michael Bolin · 2025-09-11 23:44:17 -07:00

c172e8e997

feat: include reasoning_effort in NewConversationResponse (#3506 )

`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.

Michael Bolin · 2025-09-11 21:04:40 -07:00

9bbeb75361

bug: default to image (#3501 )

Default the MIME type to image

jif-oai · 2025-09-11 23:10:24 +00:00

44bb53df1e

Add Compact and Turn Context to the rollout items (#3444 )

Adding compact and turn context to the rollout items

based on #3440

Ahmed Ibrahim · 2025-09-11 18:08:51 +00:00

674e3d3c90

Simplify auth flow and reconcile differences between ChatGPT and API Key auth (#3189 )

This PR does the following:
* Adds the ability to paste or type an API key.
* Removes the `preferred_auth_method` config option. The last login
method is always persisted in auth.json, so this isn't needed.
* If OPENAI_API_KEY env variable is defined, the value is used to
prepopulate the new UI. The env variable is otherwise ignored by the
CLI.
* Adds a new MCP server entry point "login_api_key" so we can implement
this same API key behavior for the VS Code extension.
<img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM"
src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9"
/>
<img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM"
src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9"
/>

Eric Traut · 2025-09-11 09:16:34 -07:00

e13b35ecb0

Change forking to read the rollout from file (#3440 )

This PR changes the get-history op into a get-path op, so forking now
works from a rollout path. This gives us one unified code path for
resuming and forking conversations and helps keep the rollout history in
order. It also fixes a bug where the UI did not appear when resuming
after forking.

Ahmed Ibrahim · 2025-09-10 17:42:54 -07:00

162e1235a8

Unified execution (#3288 )

## Unified PTY-Based Exec Tool

Note: this requires the following flag in the config:
`use_experimental_unified_exec_tool=true`

- Adds a PTY-backed interactive exec feature (“unified_exec”) with
  session reuse via `session_id`, bounded output (128 KiB), and timeout
  clamping (≤ 60 s).
- Protocol: introduces `ResponseItem::UnifiedExec { session_id,
  arguments, timeout_ms }`.
- Tools: exposes `unified_exec` as a function tool (Responses API);
  excluded from Chat Completions payload while still supported in tool
  lists.
- Path handling: resolves commands via PATH (or explicit paths), with
  UTF‑8/newline‑aware truncation (`truncate_middle`).
- Tests: cover command parsing, path resolution, session
  persistence/cleanup, multi‑session isolation, timeouts, and truncation
  behavior.
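
The two limits and the idea behind middle truncation can be sketched as below. This is not the PR's code: the function bodies, the `...` marker, and the constant names are assumptions, and the sketch only respects UTF-8 character boundaries (the newline-aware part mentioned above is not attempted).

```rust
/// Limits quoted in the description above (constant names are hypothetical).
pub const MAX_OUTPUT_BYTES: usize = 128 * 1024; // 128 KiB
pub const MAX_TIMEOUT_MS: u64 = 60_000; // 60 s

/// Clamp a requested timeout so it never exceeds the 60 s ceiling.
pub fn clamp_timeout_ms(requested_ms: u64) -> u64 {
    requested_ms.min(MAX_TIMEOUT_MS)
}

/// Keep the head and tail of `s`, dropping the middle, without ever cutting
/// inside a multi-byte UTF-8 character. The marker is added on top of the
/// budget, so the result can exceed `max_bytes` by the marker length.
pub fn truncate_middle(s: &str, max_bytes: usize) -> String {
    if s.len() <= max_bytes {
        return s.to_string();
    }
    let half = max_bytes / 2;
    // Largest prefix of at most `half` bytes that ends on a char boundary.
    let mut head_end = half;
    while !s.is_char_boundary(head_end) {
        head_end -= 1;
    }
    // Shortest move forward so the suffix starts on a char boundary.
    let mut tail_start = s.len() - half;
    while !s.is_char_boundary(tail_start) {
        tail_start += 1;
    }
    format!("{}...{}", &s[..head_end], &s[tail_start..])
}
```

Keeping both ends is useful for command output, where the first lines show what ran and the last lines usually show the error.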

jif-oai · 2025-09-10 17:38:11 -07:00

c09ed74a16

feat: add UserInfo request to JSON-RPC server (#3428 )

This adds a simple endpoint that provides the email address encoded in
`$CODEX_HOME/auth.json`.

As noted, for now, we do not hit the server to verify this is the user's
true email address.

Michael Bolin · 2025-09-10 17:03:35 -07:00

65f3528cad

Added images to UserMessageEvent (#3400 )

This PR adds an `images` field to the existing `UserMessageEvent` so we
can encode zero or more images associated with a user message. This
allows images to be restored when conversations are restored.

Eric Traut · 2025-09-10 10:18:43 -07:00

39db113cc9

Move initial history to protocol (#3422 )

To fix an edge case of forking then resuming

#3419

Ahmed Ibrahim · 2025-09-10 10:17:24 -07:00

45bd5ca4b9

Do not send reasoning item IDs (#3390 )

The Responses API no longer requires IDs on reasoning items.

Fixes: https://github.com/openai/codex/issues/3292

pakrym-oai · 2025-09-09 14:47:06 -07:00

5bcc9d8b77

feat: add ArchiveConversation to ClientRequest (#3353 )

Adds support for `ArchiveConversation` in the JSON-RPC server that takes
a `(ConversationId, PathBuf)` pair and:

- verifies the `ConversationId` corresponds to the rollout id at the
`PathBuf`
- if so, invokes
`ConversationManager.remove_conversation(ConversationId)`
- if the `CodexConversation` was in memory, send `Shutdown` and wait for
`ShutdownComplete` with a timeout
- moves the `.jsonl` file to `$CODEX_HOME/archived_sessions`
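
A rough sketch of the check-then-move planning follows. It is an illustration only: `plan_archive` is a hypothetical helper, IDs are modelled as plain strings, and it assumes the archived file keeps its original file name, which the description does not state.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: verify that the requested conversation id matches the
/// rollout id found at the path, then compute the destination under
/// `$CODEX_HOME/archived_sessions`. It performs no I/O itself.
pub fn plan_archive(
    requested_id: &str,
    rollout_id: &str,
    codex_home: &Path,
    rollout_path: &Path,
) -> Result<PathBuf, String> {
    if requested_id != rollout_id {
        return Err(format!(
            "conversation id {} does not match rollout id {}",
            requested_id, rollout_id
        ));
    }
    // Assumption: the file keeps its name when it is moved.
    let file_name = rollout_path
        .file_name()
        .ok_or_else(|| "rollout path has no file name".to_string())?;
    Ok(codex_home.join("archived_sessions").join(file_name))
}
```

The actual move could then be done with `std::fs::rename` once any in-memory conversation has been shut down.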

---------

Co-authored-by: Gabriel Peal <gabriel@openai.com>

Michael Bolin · 2025-09-09 11:39:00 -04:00

ace14e8d36

fix: include rollout_path in NewConversationResponse (#3352 )

Adding the `rollout_path` to the `NewConversationResponse` makes it so a
client can perform subsequent operations on a `(ConversationId,
PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.


Michael Bolin · 2025-09-09 00:11:48 -07:00

2a76a08a9e

feat: Run cargo shear during CI (#3338 )

Run cargo shear as part of the CI to ensure no unused dependencies

jif-oai · 2025-09-09 01:05:08 +00:00

a9c68ea270

Generate more typescript types and return conversation id with ConversationSummary (#3219 )

This PR does multiple things that are necessary for conversation resume
to work from the extension. I wanted to make sure everything worked so
these changes wound up in one PR:
1. Generate more ts types
2. Resume rollout history files rather than create a new one every time
it is resumed so you don't see a duplicate conversation in history for
every resume. Chatted with @aibrahim-oai to verify this
3. Return conversation_id in conversation summaries
4. [Cleanup] Use serde and strong types for a lot of the rollout file
parsing

Gabriel Peal · 2025-09-08 17:54:47 -04:00

5eaaf307e1

Format large numbers in a more readable way. (#2046 )

- In the bottom line of the TUI, print the number of tokens to 3 sigfigs
  with an SI suffix, e.g. "1.23K".
- Elsewhere where we print a number, I figure it's worthwhile to print
  the exact number, because e.g. it's a summary of your session. Here we print
  the numbers comma-separated.
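
Both formats can be sketched in a few lines. The function names and the suffix set are assumptions for illustration; the actual TUI helpers may differ.

```rust
/// Format `n` to three significant figures with an SI suffix, e.g. 1234 -> "1.23K".
/// Values below 1000 are printed exactly.
pub fn format_si(n: u64) -> String {
    const SUFFIXES: [&str; 5] = ["", "K", "M", "G", "T"];
    let mut value = n as f64;
    let mut idx = 0;
    // 999.5 would round up to 1000, so move to the next suffix at that point.
    while value >= 999.5 && idx < SUFFIXES.len() - 1 {
        value /= 1000.0;
        idx += 1;
    }
    if idx == 0 {
        return n.to_string();
    }
    // Pick the number of decimals that leaves three significant figures,
    // accounting for values that round up into the next magnitude.
    let decimals: usize = if value < 9.995 {
        2
    } else if value < 99.95 {
        1
    } else {
        0
    };
    format!("{:.*}{}", decimals, value, SUFFIXES[idx])
}

/// Print the exact number with comma thousands separators, e.g. 1234567 -> "1,234,567".
pub fn format_with_commas(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, ch) in digits.chars().enumerate() {
        // Insert a comma whenever the remaining digit count is a multiple of 3.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}
```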

Justin Lebar · 2025-09-08 21:48:48 +00:00

18330c2362

Add a getUserAgent MCP method (#3320 )

This will allow the extension to pass this user agent + a suffix for its
requests

Gabriel Peal · 2025-09-08 13:30:13 -04:00

5c1416d99b

Use ConversationId instead of raw Uuids (#3282 )

We're migrating from `session_id: Uuid` to `conversation_id:
ConversationId`. This gives us more type safety and unifies our
terminology across Codex. With session resuming implemented, a
conversation (which can span multiple sessions) is also the more
appropriate concept.
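
The type-safety argument is the usual newtype pattern. A minimal sketch, with the assumption that a `u128` stands in for the UUID so the example needs no external crate; `from_raw` and `rollout_file_name` are hypothetical names:

```rust
use std::fmt;

/// Newtype wrapper. The real type presumably wraps `uuid::Uuid`; a `u128`
/// stands in here to keep the sketch dependency-free.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ConversationId(u128);

impl ConversationId {
    pub fn from_raw(raw: u128) -> Self {
        Self(raw)
    }
}

impl fmt::Display for ConversationId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // 32 lowercase hex digits; a real UUID would print with hyphens.
        write!(f, "{:032x}", self.0)
    }
}

/// Hypothetical consumer: only a `ConversationId` is accepted, so passing a
/// bare integer or some other id type is a compile error.
pub fn rollout_file_name(id: ConversationId) -> String {
    format!("rollout-{}.jsonl", id)
}
```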

I started this impl on https://github.com/openai/codex/pull/3219 as part
of getting resume working in the extension but it's big enough that it
should be broken out.

Gabriel Peal · 2025-09-07 23:22:25 -04:00

c8fab51372

Move token usage/context information to session level (#3221 )

Move context information into the main loop so it can be used to
interrupt the loop or start auto-compaction.

pakrym-oai · 2025-09-06 15:19:23 +00:00

0269096229

Never store requests (#3212 )

When item IDs are sent to the Responses API, it loads the items from the
database and ignores the provided values. This adds extra latency.

Removing the mode that stores requests also allows us to simplify the
code.

## Breaking change

The `disable_response_storage` configuration option is removed.

pakrym-oai · 2025-09-05 10:41:47 -07:00

5775174ec2

chore: improve serialization of ServerNotification (#3193 )

This PR introduces a new
`OutgoingMessage::AppServerNotification` variant that is designed to
wrap a `ServerNotification`, which makes the serialization more
straightforward compared to
`OutgoingMessage::Notification(OutgoingNotification)`. We still use the
latter for serializing an `Event` as a `JSONRPCMessage::Notification`,
but I will try to get away from that in the near future.

With this change, now the generated TypeScript type for
`ServerNotification` is:

```typescript
export type ServerNotification =
  | { "method": "authStatusChange", "params": AuthStatusChangeNotification }
  | { "method": "loginChatGptComplete", "params": LoginChatGptCompleteNotification };
```

whereas before it was:

```typescript
export type ServerNotification =
  | { type: "auth_status_change"; data: AuthStatusChangeNotification }
  | { type: "login_chat_gpt_complete"; data: LoginChatGptCompleteNotification };
```

Once the `Event`s are migrated to the `ServerNotification` enum in Rust,
it should be considerably easier to work with notifications on the
TypeScript side, as it will be possible to `switch (message.method)` and
check for exhaustiveness.

Though we will probably need to introduce:

```typescript
export type ServerMessage = ServerRequest | ServerNotification;
```

and then we still need to group all of the `ServerResponse` types
together, as well.

Michael Bolin · 2025-09-04 17:49:50 -07:00

3f40fbc0a8

MCP: add session resume + history listing; (#3185 )


Ahmed Ibrahim · 2025-09-04 23:44:18 +00:00

907d3dd348

Correctly calculate remaining context size (#3190 )

We had multiple issues with context size calculation:
1. The `initial_prompt_tokens` calculation based on cache size is not
reliable; cache misses might set it to a much higher value. For now it
is hardcoded to a safer constant.
2. The input context size for GPT-5 is 272k (that's where the 33% came
from).

This PR fixes both.

pakrym-oai · 2025-09-04 23:34:14 +00:00

7df9e9c664

[mcp-server] Update read config interface (#3093 )

## Summary
Follow-up to #3056

This PR updates the mcp-server interface for reading the config settings
saved by the user. At risk of introducing _another_ Config struct, I
think it makes sense to avoid tying our protocol to ConfigToml, as it's
become a bit unwieldy. GetConfigTomlResponse was a de-facto struct for
this already - better to make it explicit, in my opinion.

This is technically a breaking change of the mcp-server protocol, but
given the previous interface was introduced so recently in #2725, and we
have not yet even started to call it, I propose proceeding with the
breaking change - but am open to preserving the old endpoint.

## Testing
- [x] Added additional integration test coverage

Dylan · 2025-09-04 16:26:41 -07:00

82ed7bd285

fix: fix serde_as annotation and verify with test (#3170 )

I didn't do https://github.com/openai/codex/pull/3163 correctly the
first time: now verified with a test.

Michael Bolin · 2025-09-04 10:38:00 -07:00

91708bb031

fix: use a more efficient wire format for ExecCommandOutputDeltaEvent.chunk (#3163 )

When serializing to JSON, the existing solution created an enormous
array of ints, which is far more bytes on the wire than a base64-encoded
string would be.
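
The size difference is easy to see with a small comparison. The encoder below is a hand-rolled standard-alphabet base64 written only for this illustration (the actual change relies on a serde annotation, per the follow-up commit), and both function names are hypothetical.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with `=` padding, three input bytes per four output chars.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Missing input bytes in the final chunk become padding.
        out.push(if chunk.len() > 1 {
            ALPHABET[((n >> 6) & 63) as usize] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[(n & 63) as usize] as char
        } else {
            '='
        });
    }
    out
}

/// What a byte vector looks like when serialized as a JSON array of ints.
pub fn json_int_array(bytes: &[u8]) -> String {
    let items: Vec<String> = bytes.iter().map(|b| b.to_string()).collect();
    format!("[{}]", items.join(","))
}
```

For the five bytes of `hello`, the array form is `[104,101,108,108,111]` (21 characters) while the base64 string is `aGVsbG8=` (8 characters, 10 with JSON quotes); the gap grows with chunk size.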

Michael Bolin · 2025-09-04 08:21:58 -07:00

0a83db5512

Dividing UserMsgs into categories to send it back to the tui (#3127 )

This PR does the following:

- Divides user msgs into 3 categories: plain, user instructions, and
environment context
- Centralizes adding user instructions and environment context to a
degree
- Improves the integration testing

Building on top of #3123

Specifically this
[comment](https://github.com/openai/codex/pull/3123#discussion_r2319885089).
We need to send the user message while ignoring the User Instructions
and Environment Context we attach.

Ahmed Ibrahim · 2025-09-04 05:34:50 +00:00

2b96f9f569

Replay EventMsgs from Response Items when resuming a session with history. (#3123 )

### Overview

This PR introduces the following changes:
1. Adds a unified mechanism to convert ResponseItem into EventMsg.
2. Ensures that when a session is initialized with initial history, a
vector of EventMsg is sent along with the session configuration. This
allows clients to re-render the UI accordingly.
3. Added integration testing
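
The conversion can be pictured with heavily simplified enums. The variants and fields below are illustrative assumptions rather than the real protocol definitions, and `replay_events` is a hypothetical name; the point is only that some items map to a UI event and others are skipped.

```rust
/// Simplified stand-in for the rolled-out items (not the real definition).
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseItem {
    Message { role: String, text: String },
    Reasoning { summary: String },
    FunctionCall { name: String },
}

/// Simplified stand-in for the events sent to clients (not the real definition).
#[derive(Debug, Clone, PartialEq)]
pub enum EventMsg {
    UserMessage(String),
    AgentMessage(String),
    AgentReasoning(String),
}

/// Hypothetical replay helper: derive the UI events a client needs to
/// re-render a resumed session from its initial history.
pub fn replay_events(history: &[ResponseItem]) -> Vec<EventMsg> {
    history
        .iter()
        .filter_map(|item| match item {
            ResponseItem::Message { role, text } => match role.as_str() {
                "user" => Some(EventMsg::UserMessage(text.clone())),
                "assistant" => Some(EventMsg::AgentMessage(text.clone())),
                _ => None,
            },
            ResponseItem::Reasoning { summary } => {
                Some(EventMsg::AgentReasoning(summary.clone()))
            }
            // Tool calls are not replayed in this sketch.
            ResponseItem::FunctionCall { .. } => None,
        })
        .collect()
}
```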

### Caveats

This implementation does not send every EventMsg that was previously
dispatched to clients. The excluded events fall into two categories:
- “Arguably” rolled-out events: examples include tool calls and
apply-patch calls. While these events are conceptually rolled out, we
currently only roll out ResponseItems. These events are already being
handled elsewhere and transformed into EventMsg before being sent.
- Non-rolled-out events: certain events such as TurnDiff, Error, and
TokenCount are not rolled out at all.

### Future Directions

At present, resuming a session involves maintaining two states:
- UI State: clients can replay most of the important UI from the
provided EventMsg history.
- Model State: the model receives the complete session history to
reconstruct its internal state.

This design provides a solid foundation. If, in the future, more precise
UI reconstruction is needed, we have two potential paths:
1. Introduce a third data structure that allows us to derive both
ResponseItems and EventMsgs.
2. Clearly divide responsibilities: the core system ensures the
integrity of the model state, while clients are responsible for
reconstructing the UI.

Ahmed Ibrahim · 2025-09-04 04:47:00 +00:00

f2036572b6

MCP sandbox call (#3128 )

I have read the CLA Document and I hereby sign the CLA

jif-oai · 2025-09-03 17:05:03 -07:00

bea64569c1

chore: Clean up verbosity config (#3056 )

## Summary
It appears that #2108 hit a merge conflict with #2355 - I failed to
notice the path difference when re-reviewing the former. This PR
rectifies that, and consolidates it into the protocol package, in line
with our philosophy of specifying types in one place.

## Testing
- [x] Adds config test for model_verbosity

Dylan · 2025-09-03 12:20:31 -07:00

db5276f8e6

rework message styling (#2877 )

https://github.com/user-attachments/assets/cf07f62b-1895-44bb-b9c3-7a12032eb371

Jeremy Rose · 2025-09-02 17:29:58 +00:00

e442ecedab

Following up on #2371 post commit feedback (#2852 )

- Introduces websearch end to complement the begin
- Moves the logic of adding the websearch tool to
`create_tools_json_for_responses_api`
- Makes it the client's responsibility to toggle the tool on or off
- Addresses other miscellaneous post-commit feedback from #2371
- Shows the query:

<img width="1392" height="151" alt="image"
src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89"
/>

Ahmed Ibrahim · 2025-08-28 19:24:38 -07:00

9dbe7284d2

Custom /prompts (#2696 )

Adds custom `/prompts` to `~/.codex/prompts/<command>.md`.

<img width="239" height="107" alt="Screenshot 2025-08-25 at 6 22 42 PM"
src="https://github.com/user-attachments/assets/fe6ebbaa-1bf6-49d3-95f9-fdc53b752679"
/>

---

Details:

1. Adds `Op::ListCustomPrompts` to core.
2. Returns `ListCustomPromptsResponse` with list of `CustomPrompt`
(name, content).
3. TUI calls the operation on load, and populates the custom prompts
(excluding prompts that collide with builtins).
4. Selecting the custom prompt automatically sends the prompt to the
agent.
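
Step 3's collision rule amounts to a simple filter. In this sketch, `CustomPrompt` mirrors the (name, content) pair mentioned above, while `visible_custom_prompts` and the builtin names used with it are hypothetical.

```rust
/// Mirrors the (name, content) pair returned in `ListCustomPromptsResponse`.
#[derive(Debug, Clone, PartialEq)]
pub struct CustomPrompt {
    pub name: String,
    pub content: String,
}

/// Hypothetical helper: drop custom prompts whose name collides with a
/// builtin command so the builtin always wins.
pub fn visible_custom_prompts(
    prompts: Vec<CustomPrompt>,
    builtins: &[&str],
) -> Vec<CustomPrompt> {
    prompts
        .into_iter()
        .filter(|p| !builtins.contains(&p.name.as_str()))
        .collect()
}
```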

dedrisian-oai · 2025-08-29 02:16:39 +00:00

b8e8454b3f

[mcp-server] Add GetConfig endpoint (#2725 )

## Summary
Adds a GetConfig request to the MCP Protocol, so MCP clients can
evaluate the resolved config.toml settings which the harness is using.

## Testing
- [x] Added an end to end test of the endpoint

Dylan · 2025-08-27 09:59:03 -07:00

0cec0770e2

send context window with task started (#2752 )

- Send context window with task started
- Accounting for changing the model per turn

Ahmed Ibrahim · 2025-08-27 00:04:21 -07:00

d0e06f74e2

Add web search tool (#2371 )

Adds the web_search tool, enabling the model to use the Responses API
web_search tool.
- Disabled by default, enabled by --search flag
- When --search is passed, exposes web_search_request function tool to
the model, which triggers user approval. When approved, the model can
use the web_search tool for the remainder of the turn
<img width="1033" height="294" alt="image"
src="https://github.com/user-attachments/assets/62ac6563-b946-465c-ba5d-9325af28b28f"
/>

---------

Co-authored-by: easong-openai <easong@openai.com>

Reuben Narad · 2025-08-23 22:58:56 -07:00

363636f5eb

send-aggregated output (#2364 )

We want to send an aggregated output of stderr and stdout so that
consumers don't have to combine the two streams themselves, which
sometimes loses the original ordering.
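
The ordering problem shows up as soon as the streams are stored separately. A minimal sketch, assuming chunks arrive tagged with their stream; `Stream`, `ExecOutput`, and `collect_output` are hypothetical names:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stream {
    Stdout,
    Stderr,
}

/// Hypothetical result type: per-stream output plus the interleaved view.
#[derive(Debug, Default, PartialEq)]
pub struct ExecOutput {
    pub stdout: String,
    pub stderr: String,
    pub aggregated: String,
}

/// Append every chunk to its own stream and, in arrival order, to the
/// aggregated buffer, so the interleaving is preserved.
pub fn collect_output(chunks: &[(Stream, &str)]) -> ExecOutput {
    let mut out = ExecOutput::default();
    for (stream, text) in chunks {
        match stream {
            Stream::Stdout => out.stdout.push_str(text),
            Stream::Stderr => out.stderr.push_str(text),
        }
        out.aggregated.push_str(text);
    }
    out
}
```

Concatenating `stdout` and `stderr` afterwards would place the warning after both stdout lines, which is exactly the ordering loss the aggregated field avoids.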

---------

Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>

Ahmed Ibrahim · 2025-08-23 16:54:31 +00:00

957d44918d

fork conversation from a previous message (#2575 )

This can be the underlying logic for starting a conversation from a
previous message. It will need some love in the UI.

Base for building this: #2588

Ahmed Ibrahim · 2025-08-22 17:06:09 -07:00

311ad0ce26

tui: coalesce command output; show unabridged commands in transcript (#2590 )

https://github.com/user-attachments/assets/effec7c7-732a-4b61-a2ae-3cb297b6b19b

Jeremy Rose · 2025-08-22 16:32:31 -07:00

d994019f3f

Move models.rs to protocol (#2595 )

Moving models.rs to protocol so we can use them in `Codex` operations

Ahmed Ibrahim · 2025-08-22 22:18:54 +00:00

097782c775

Add AuthManager and enhance GetAuthStatus command (#2577 )

This PR adds a central `AuthManager` struct that manages the auth
information used across conversations and the MCP server. Prior to this,
each conversation and the MCP server got their own private snapshots of
the auth information, and changes to one (such as a logout or token
refresh) were not seen by others.

This is especially problematic when multiple instances of the CLI are
run. For example, consider the case where you start CLI 1 and log in to
ChatGPT account X and then start CLI 2 and log out and then log in to
ChatGPT account Y. The conversation in CLI 1 is still using account X,
but if you create a new conversation, it will suddenly (and
unexpectedly) switch to account Y.

With the `AuthManager`, auth information is read from disk at the time
the `ConversationManager` is constructed, and it is cached in memory.
All new conversations use this same auth information, as do any token
refreshes.

The `AuthManager` is also used by the MCP server's GetAuthStatus
command, which now returns the auth method currently used by the MCP
server.

This PR also includes an enhancement to the GetAuthStatus command. It
now accepts two new (optional) input parameters: `include_token` and
`refresh_token`. Callers can use this to request the in-use auth token
and can optionally request to refresh the token.

The PR also adds tests for the login and auth APIs that I recently added
to the MCP server.

Eric Traut · 2025-08-22 13:10:11 -07:00

dc42ec0eb4

Parse and expose stream errors (#2540 )

easong-openai · 2025-08-21 01:15:24 -07:00

8ad56be06e

Add a serde tag to ParsedItem (#2546 )

Gabriel Peal · 2025-08-21 01:34:46 -04:00

74683bab91

Added new auth-related methods and events to mcp server (#2496 )

This PR adds the following:
* A getAuthStatus method on the mcp server. This returns the auth method
currently in use (chatgpt or apikey) or none if the user is not
authenticated. It also returns the "preferred auth method" which
reflects the `preferred_auth_method` value in the config.
* A logout method on the mcp server. If called, it logs out the user and
deletes the `auth.json` file, the same behavior as the CLI's `/logout`
command.
* An `authStatusChange` event notification that is sent when the auth
status changes due to successful login or logout operations.
* Logic to pass command-line config overrides to the mcp server at
startup time. This allows use cases like `codex mcp -c
preferred_auth_method=apikey`.

Eric Traut · 2025-08-20 20:36:34 -07:00

dacff9675a

61 Commits