codex

Add AuthManager and enhance GetAuthStatus command (#2577 )

This PR adds a central `AuthManager` struct that manages the auth
information used across conversations and the MCP server. Prior to this,
each conversation and the MCP server got their own private snapshots of
the auth information, and changes to one (such as a logout or token
refresh) were not seen by others.

This is especially problematic when multiple instances of the CLI are
run. For example, consider the case where you start CLI 1 and log in to
ChatGPT account X and then start CLI 2 and log out and then log in to
ChatGPT account Y. The conversation in CLI 1 is still using account X,
but if you create a new conversation, it will suddenly (and
unexpectedly) switch to account Y.

With the `AuthManager`, auth information is read from disk at the time
the `ConversationManager` is constructed, and it is cached in memory.
All new conversations use this same auth information, as do any token
refreshes.

The `AuthManager` is also used by the MCP server's GetAuthStatus
command, which now returns the auth method currently used by the MCP
server.

This PR also includes an enhancement to the GetAuthStatus command. It
now accepts two new (optional) input parameters: `include_token` and
`refresh_token`. Callers can use this to request the in-use auth token
and can optionally request to refresh the token.

The PR also adds tests for the login and auth APIs that I recently added
to the MCP server.

Eric Traut · 2025-08-22 13:10:11 -07:00

dc42ec0eb4

Added new auth-related methods and events to mcp server (#2496 )

This PR adds the following:
* A getAuthStatus method on the mcp server. This returns the auth method
currently in use (chatgpt or apikey) or none if the user is not
authenticated. It also returns the "preferred auth method" which
reflects the `preferred_auth_method` value in the config.
* A logout method on the mcp server. If called, it logs out the user and
deletes the `auth.json` file — the same behavior in the cli's `/logout`
command.
* An `authStatusChange` event notification that is sent when the auth
status changes due to successful login or logout operations.
* Logic to pass command-line config overrides to the mcp server at
startup time. This allows use cases like `codex mcp -c
preferred_auth_method=apikey`.

Eric Traut · 2025-08-20 20:36:34 -07:00

dacff9675a

chore: upgrade to Rust 1.89 (#2465 )

Codex created this PR from the following prompt:

> upgrade this entire repo to Rust 1.89. Note that this requires
updating codex-rs/rust-toolchain.toml as well as the workflows in
.github/. Make sure that things are "clippy clean" as this change will
likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
are sufficient to check for correctness

Note this modifies a lot of lines because it folds nested `if`
statements using `&&`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465).
* #2467
* __->__ #2465

Michael Bolin · 2025-08-19 13:22:02 -07:00

50c48e88f5

Show login options when not signed in with ChatGPT (#2440 )

Motivation: we have users who uses their API key although they want to
use ChatGPT account. We want to give them the chance to always login
with their account.

This PR displays login options when the user is not signed in with
ChatGPT. Even if you have set an OpenAI API key as an environment
variable, you will still be prompted to log in with ChatGPT.

We’ve also added a new flag, `always_use_api_key_signing` false by
default, which ensures you are never asked to log in with ChatGPT and
always defaults to using your API key.



https://github.com/user-attachments/assets/b61ebfa9-3c5e-4ab7-bf94-395c23a0e0af

After ChatGPT sign in:


https://github.com/user-attachments/assets/d58b366b-c46a-428f-a22f-2ac230f991c0

Ahmed Ibrahim · 2025-08-19 03:22:48 +00:00

97f995a749

fix: remove shutdown_flag param to run_login_server() (#2399 )

In practice, this was always passed in as `None`, so eliminated the
param and updated all the call sites.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2399).
* __->__ #2399
* #2398
* #2396
* #2395
* #2394
* #2393
* #2389

Michael Bolin · 2025-08-19 01:15:50 +00:00

2aad3a13b8

fix: async-ify login flow (#2393 )

This replaces blocking I/O with async/non-blocking I/O in a number of
cases. This facilitates the use of `tokio::sync::Notify` and
`tokio::select!` in #2394.









---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2393).
* #2399
* #2398
* #2396
* #2395
* #2394
* __->__ #2393
* #2389

Michael Bolin · 2025-08-18 17:23:40 -07:00

6e8c055fd5

protocol-ts (#2425 )

Michael Bolin · 2025-08-18 13:08:53 -07:00

fc6cfd5ecc

chore: move mcp-server/src/wire_format.rs to protocol/src/mcp_protocol.rs (#2423 )

The existing `wire_format.rs` should share more types with the
`codex-protocol` crate (like `AskForApproval` instead of maintaining a
parallel `CodexToolCallApprovalPolicy` enum), so this PR moves
`wire_format.rs` into `codex-protocol`, renaming it as
`mcp-protocol.rs`. We also de-dupe types, where appropriate.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2423).
* #2424
* __->__ #2423

Michael Bolin · 2025-08-18 09:36:57 -07:00

712bfa04ac

Remove duplicated "Successfully logged in message" (#2357 )

pakrym-oai · 2025-08-15 13:01:27 -07:00

c1156a878b

Cleanup rust login server a bit more (#2331 )

Remove some extra abstractions.

---------

Co-authored-by: easong-openai <easong@openai.com>

pakrym-oai · 2025-08-14 19:42:14 -07:00

76df07350a

Port login server to rust (#2294 )

Port the login server to rust.

---------

Co-authored-by: pakrym-oai <pakrym@openai.com>

easong-openai · 2025-08-14 17:11:26 -07:00

e9b597cfa3

fix: run python_multiprocessing_lock_works integration test on Mac and Linux (#2318 )

The high-order bit on this PR is that it makes it so `sandbox.rs` tests
both Mac and Linux, as we introduce a general
`spawn_command_under_sandbox()` function with platform-specific
implementations for testing.

An important, and interesting, discovery in porting the test to Linux is
that (for reasons cited in the code comments), `/dev/shm` has to be
added to `writable_roots` on Linux in order for `multiprocessing.Lock`
to work there. Granting write access to `/dev/shm` comes with some
degree of risk, so we do not make this the default for Codex CLI.

Piggybacking on top of #2317, this moves the
`python_multiprocessing_lock_works` test yet again, moving
`codex-rs/core/tests/sandbox.rs` to `codex-rs/exec/tests/sandbox.rs`
because in `codex-rs/exec/tests` we can use `cargo_bin()` like so:

```
let codex_linux_sandbox_exe = assert_cmd::cargo::cargo_bin("codex-exec");
```

which is necessary so we can use `codex_linux_sandbox_exe` and therefore
`spawn_command_under_linux_sandbox` in an integration test.

This also moves `spawn_command_under_linux_sandbox()` out of `exec.rs`
and into `landlock.rs`, which makes things more consistent with
`seatbelt.rs` in `codex-core`.

For reference, https://github.com/openai/codex/pull/1808 is the PR that
made the change to Seatbelt to get this test to pass on Mac.

Michael Bolin · 2025-08-14 15:47:48 -07:00

2ecca79663

chore: introduce ConversationManager as a clearinghouse for all conversations (#2240 )

This PR does two things because after I got deep into the first one I
started pulling on the thread to the second:

- Makes `ConversationManager` the place where all in-memory
conversations are created and stored. Previously, `MessageProcessor` in
the `codex-mcp-server` crate was doing this via its `session_map`, but
this is something that should be done in `codex-core`.
- It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded
throughout our code. I think this made sense at one time, but now that
we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event,
I don't think this was quite right, so I removed it. For `codex exec`
and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we
no longer make `Notify` a field of `Codex` or `CodexConversation`.

Changes of note:

- Adds the files `conversation_manager.rs` and `codex_conversation.rs`
to `codex-core`.
- `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`:
other crates must use `CodexConversation` instead (which is created via
`ConversationManager`).
- `core/src/codex_wrapper.rs` has been deleted in favor of
`ConversationManager`.
- `ConversationManager::new_conversation()` returns `NewConversation`,
which is in line with the `new_conversation` tool we want to add to the
MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so
we eliminate checks in cases like `codex-rs/core/tests/client.rs` to
verify `SessionConfiguredEvent` is the first event because that is now
internal to `ConversationManager`.
- Quite a bit of code was deleted from
`codex-rs/mcp-server/src/message_processor.rs` since it no longer has to
manage multiple conversations itself: it goes through
`ConversationManager` instead.
- `core/tests/live_agent.rs` has been deleted because I had to update a
bunch of tests and all the tests in here were ignored, and I don't think
anyone ever ran them, so this was just technical debt, at this point.
- Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I
hope to refactor the blandly-named `util.rs` into more descriptive
files).
- In general, I started replacing local variables named `codex` as
`conversation`, where appropriate, though admittedly I didn't do it
through all the integration tests because that would have added a lot of
noise to this PR.




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240).
* #2264
* #2263
* __->__ #2240

Michael Bolin · 2025-08-13 13:38:18 -07:00

08ed618f72

fix: display canonical command name in help (#2246 )

## Summary
- ensure CLI help uses `codex` as program name regardless of binary
filename

## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_689bd5a731188320814dcbbc546ce22a

pakrym-oai · 2025-08-13 09:39:11 -07:00

e6dc5a6df5

chore: move top-level load_auth() to CodexAuth::from_codex_home() (#1966 )

There are two valid ways to create an instance of `CodexAuth`:
`from_api_key()` and `from_codex_home()`. Now both are static methods of
`CodexAuth` and are listed first in the implementation.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1966).
* #1971
* #1970
* __->__ #1966
* #1965
* #1962

Michael Bolin · 2025-08-07 16:49:37 -07:00

b991c04f86

chore: make CodexAuth::api_key a private field (#1965 )

Force callers to access this information via `get_token()` rather than
messing with it directly.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1965).
* #1971
* #1970
* #1966
* __->__ #1965
* #1962

Michael Bolin · 2025-08-07 16:40:01 -07:00

02c9c2ecad

fix: public load_auth() fn always called with include_env_var=true (#1961 )

Apparently `include_env_var=false` was only used for testing, so clean
up the API a little to make that clear.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1961).
* #1962
* __->__ #1961

Michael Bolin · 2025-08-07 14:19:30 -07:00

7d67159587

Add logout command to CLI and TUI (#1932 )

## Summary
- support `codex logout` via new subcommand and helper that removes the
stored `auth.json`
- expose a `logout` function in `codex-login` and test it
- add `/logout` slash command in the TUI; command list is filtered when
not logged in and the handler deletes `auth.json` then exits

## Testing
- `just fix` *(fails: failed to get `diffy` from crates.io)*
- `cargo test --all-features` *(fails: failed to get `diffy` from
crates.io)*

------
https://chatgpt.com/codex/tasks/task_i_68945c3facac832ca83d48499716fb51

Gabriel Peal · 2025-08-07 04:17:33 -04:00

6d19b73edf

First pass at a TUI onboarding (#1876 )

This sets up the scaffolding and basic flow for a TUI onboarding
experience. It covers sign in with ChatGPT, env auth, as well as some
safety guidance.

Next up:
1. Replace the git warning screen
2. Use this to configure default approval/sandbox modes


Note the shimmer flashes are from me slicing the video, not jank.

https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8

Gabriel Peal · 2025-08-06 18:22:14 -04:00

2d5de795aa

chore: refactor exec.rs: create separate seatbelt.rs and spawn.rs files (#1762 )

At 550 lines, `exec.rs` was a bit large. In particular, I found it hard
to locate the Seatbelt-related code quickly without a file with
`seatbelt` in the name, so this refactors things so:

- `spawn_command_under_seatbelt()` and dependent code moves to a new
`seatbelt.rs` file
- `spawn_child_async()` and dependent code moves to a new `spawn.rs`
file

Michael Bolin · 2025-07-31 13:11:47 -07:00

5a0ad5ab8f

Add codex login --api-key (#1759 )

Allow setting the API key via `codex login --api-key`

pakrym-oai · 2025-07-31 17:48:49 +00:00

549846b29a

Auto format toml (#1745 )

Add recommended extension and configure it to auto format prompt.

pakrym-oai · 2025-07-30 18:37:00 -07:00

51b6bdefbe

Add login status command (#1716 )

Print the current login mode, sanitized key and return an appropriate
status.

pakrym-oai · 2025-07-30 14:09:26 -07:00

301ec72107

Add support for a separate chatgpt auth endpoint (#1712 )

Adds a `CodexAuth` type that encapsulates information about available
auth modes and logic for refreshing the token.
Changes `Responses` API to send requests to different endpoints based on
the auth type.
Updates login_with_chatgpt to support API-less mode and skip the key
exchange.

pakrym-oai · 2025-07-30 19:40:15 +00:00

ea01a5ffe2

replace login screen with a simple prompt (#1713 )

Perhaps there was an intention to make the login screen prettier, but it
feels quite silly right now to just have a screen that says "press q",
so replace it with something that lets the user directly login without
having to quit the app.

<img width="1283" height="635" alt="Screenshot 2025-07-28 at 2 54 05 PM"
src="https://github.com/user-attachments/assets/f19e5595-6ef9-4a2d-b409-aa61b30d3628"
/>

Jeremy Rose · 2025-07-28 17:25:14 -07:00

f66704a88f

fix: move arg0 handling out of codex-linux-sandbox and into its own crate (#1697 )

Michael Bolin · 2025-07-28 08:31:24 -07:00

9102255854

chore: update Codex::spawn() to return a struct instead of a tuple (#1677 )

Also update `init_codex()` to return a `struct` instead of a tuple, as well.

Michael Bolin · 2025-07-27 20:01:35 -07:00

2405c40026

Easily Selectable History (#1672 )

This update replaces the previous ratatui history widget with an
append-only log so that the terminal can handle text selection and
scrolling. It also disables streaming responses, which we'll do our best
to bring back in a later PR. It also adds a small summary of token use
after the TUI exits.

easong-openai · 2025-07-25 01:56:40 -07:00

480e82b00d

Fix flaky test (#1664 )

Co-authored-by: aibrahim-oai <aibrahim@openai.com>

Gabriel Peal · 2025-07-23 18:40:00 -04:00

db84722080

[mcp-server] Add reply tool call (#1643 )

## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.

## Testing
- [x] tested with mcp client

Dylan · 2025-07-21 21:01:56 -07:00

18b2b30841

Add codex apply to apply a patch created from the Codex remote agent (#1528 )

In order to to this, I created a new `chatgpt` crate where we can put
any code that interacts directly with ChatGPT as opposed to the OpenAI
API. I added a disclaimer to the README for it that it should primarily
be modified by OpenAI employees.


https://github.com/user-attachments/assets/bb978e33-d2c9-4d8e-af28-c8c25b1988e8

Gabriel Peal · 2025-07-11 13:30:11 -04:00

bfeb8c92a5

fix: the completion subcommand should assume the CLI is named codex, not codex-cli (#1496 )

Current 0.4.0 release:

```
~/code/codex2/codex-rs$ codex completion | head
_codex-cli() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```

with this change:

```
~/code/codex2/codex-rs$ just codex completion | head
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.82s
     Running `target/debug/codex completion`
_codex() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```

Michael Bolin · 2025-07-09 14:08:35 -07:00

268267b59e

feat: add codex completion to generate shell completions (#1491 )

Once this lands, we can update our brew formula to use
`generate_completions_from_executable()` like so:


https://github.com/Homebrew/homebrew-core/blob/905238ff7f8f48fdc6f4ae2a9fbdd364bf7c7f9d/Formula/h/hgrep.rb#L21-L25

Michael Bolin · 2025-07-08 21:43:27 -07:00

4a15ebc1ca

feat: add support for --sandbox flag (#1476 )

On a high-level, we try to design `config.toml` so that you don't have
to "comment out a lot of stuff" when testing different options.

Previously, defining a sandbox policy was somewhat at odds with this
principle because you would define the policy as attributes of
`[sandbox]` like so:

```toml
[sandbox]
mode = "workspace-write"
writable_roots = [ "/tmp" ]
```

but if you wanted to temporarily change to a read-only sandbox, you
might feel compelled to modify your file to be:

```toml
[sandbox]
mode = "read-only"
# mode = "workspace-write"
# writable_roots = [ "/tmp" ]
```

Technically, commenting out `writable_roots` would not be strictly
necessary, as `mode = "read-only"` would ignore `writable_roots`, but
it's still a reasonable thing to do to keep things tidy.

Currently, the various values for `mode` do not support that many
attributes, so this is not that hard to maintain, but one could imagine
this becoming more complex in the future.

In this PR, we change Codex CLI so that it no longer recognizes
`[sandbox]`. Instead, it introduces a top-level option, `sandbox_mode`,
and `[sandbox_workspace_write]` is used to further configure the sandbox
when when `sandbox_mode = "workspace-write"` is used:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]
```

This feels a bit more future-proof in that it is less tedious to
configure different sandboxes:

```toml
sandbox_mode = "workspace-write"

[sandbox_read_only]
# read-only options here...

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]

[sandbox_danger_full_access]
# danger-full-access options here...
```

In this scheme, you never need to comment out the configuration for an
individual sandbox type: you only need to redefine `sandbox_mode`.

Relatedly, previous to this change, a user had to do `-c
sandbox.mode=read-only` to change the mode on the command line. With
this change, things are arguably a bit cleaner because the equivalent
option is `-c sandbox_mode=read-only` (and now `-c
sandbox_workspace_write=...` can be set separately).

Though more importantly, we introduce the `-s/--sandbox` option to the
CLI, which maps directly to `sandbox_mode` in `config.toml`, making
config override behavior easier to reason about. Moreover, as you can
see in the updates to the various Markdown files, it is much easier to
explain how to configure sandboxing when things like `--sandbox
read-only` can be used as an example.

Relatedly, this cleanup also made it straightforward to add support for
a `sandbox` option for Codex when used as an MCP server (see the changes
to `mcp-server/src/codex_tool_config.rs`).

Fixes https://github.com/openai/codex/issues/1248.

Michael Bolin · 2025-07-07 22:31:30 -07:00

e0c08cea4f

feat: redesign sandbox config (#1373 )

This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:

```toml
# Safest option: full disk is read-only, but writes and network access are disallowed.
[sandbox]
mode = "read-only"

# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = []  # Optional, defaults to the empty list.
network_access = false  # Optional, defaults to false.

# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```

This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:

- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254

Outstanding issue:

- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.

Michael Bolin · 2025-06-24 16:59:47 -07:00

0776d78357

feat: add support for login with ChatGPT (#1212 )

This does not implement the full Login with ChatGPT experience, but it
should unblock people.

**What works**

* The `codex` multitool now has a `login` subcommand, so you can run
`codex login`, which should write `CODEX_HOME/auth.json` if you complete
the flow successfully. The TUI will now read the `OPENAI_API_KEY` from
`auth.json`.
* The TUI should refresh the token if it has expired and the necessary
information is in `auth.json`.
* There is a `LoginScreen` in the TUI that tells you to run `codex
login` if both (1) your model provider expects to use `OPENAI_API_KEY`
as its env var, and (2) `OPENAI_API_KEY` is not set.

**What does not work**

* The `LoginScreen` does not support the login flow from within the TUI.
Instead, it tells you to quit, run `codex login`, and then run `codex`
again.
* `codex exec` does read from `auth.json` yet, nor does it direct the
user to go through the login flow if `OPENAI_API_KEY` is not be found.
* The `maybeRedeemCredits()` function from `get-api-key.tsx` has not
been ported from TypeScript to `login_with_chatgpt.py` yet:


https://github.com/openai/codex/blob/a67a67f3258fc21e147b6786a143fe3e15e6d5ba/codex-cli/src/utils/get-api-key.tsx#L84-L89

**Implementation**

Currently, the OAuth flow requires running a local webserver on
`127.0.0.1:1455`. It seemed wasteful to incur the additional binary cost
of a webserver dependency in the Rust CLI just to support login, so
instead we implement this logic in Python, as Python has a `http.server`
module as part of its standard library. Specifically, we bundle the
contents of a single Python file as a string in the Rust CLI and then
use it to spawn a subprocess as `python3 -c
{{SOURCE_FOR_PYTHON_SERVER}}`.

As such, the most significant files in this PR are:

```
codex-rs/login/src/login_with_chatgpt.py
codex-rs/login/src/lib.rs
```

Now that the CLI may load `OPENAI_API_KEY` from the environment _or_
`CODEX_HOME/auth.json`, we need a new abstraction for reading/writing
this variable, so we introduce:

```
codex-rs/core/src/openai_api_key.rs
```

Note that `std::env::set_var()` is [rightfully] `unsafe` in Rust 2024,
so we use a LazyLock<RwLock<Option<String>>> to store `OPENAI_API_KEY`
so it is read in a thread-safe manner.

Ultimately, it should be possible to go through the entire login flow
from the TUI. This PR introduces a placeholder `LoginScreen` UI for that
right now, though the new `codex login` subcommand introduced in this PR
should be a viable workaround until the UI is ready.

**Testing**

Because the login flow is currently implemented in a standalone Python
file, you can test it without building any Rust code as follows:

```
rm -rf /tmp/codex_home && mkdir /tmp/codex_home
CODEX_HOME=/tmp/codex_home python3 codex-rs/login/src/login_with_chatgpt.py
```

For reference:

* the original TypeScript implementation was introduced in
https://github.com/openai/codex/pull/963
* support for redeeming credits was later added in
https://github.com/openai/codex/pull/974

Michael Bolin · 2025-06-04 08:44:17 -07:00

515b6331bd

feat: add support for -c/--config to override individual config items (#1137 )

This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:

```
codex --config model=o4-mini
```

Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.

Effectively, it means there are four levels of configuration for some
values:

- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)

Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.

Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.

Michael Bolin · 2025-05-27 23:11:44 -07:00

d60f350cf8

fix: forgot to pass codex_linux_sandbox_exe through in cli/src/debug_sandbox.rs (#1095 )

I accidentally missed this in https://github.com/openai/codex/pull/1086.

Michael Bolin · 2025-05-23 11:53:13 -07:00

4bf81373a7

fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086 )

Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:

For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:


https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec.rs#L147-L219

For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:


https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec_linux.rs#L28-L49

While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.

This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:


https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/cli/Cargo.toml#L10-L12

We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.

Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:

* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`

A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
the original command, so do _replace_ the process rather than spawn a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.

Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.

This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.

The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.

Michael Bolin · 2025-05-23 11:37:07 -07:00

89ef4efdcf

feat: add codex_linux_sandbox_exe: Option<PathBuf> field to Config (#1089 )

https://github.com/openai/codex/pull/1086 is a work-in-progress to make
Linux sandboxing work more like Seatbelt where, for the command we want
to sandbox, we build up the command and then hand it, and some sandbox
configuration flags, to another command to set up the sandbox and then
run it.

In the case of Seatbelt, macOS provides this helper binary and provides
it at `/usr/bin/sandbox-exec`. For Linux, we have to build our own and
pass it through (which is what #1086 does), so this makes the new
`codex_linux_sandbox_exe` available on `Config` so that it will later be
available in `exec.rs` when we need it in #1086.

Michael Bolin · 2025-05-22 21:52:28 -07:00

d1de7bb383

feat: introduce support for shell_environment_policy in config.toml (#1061 )

To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.

This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.

Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:

| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array&lt;string&gt; | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |


In particular, note that the default is `inherit = "core"`, so:

* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`

This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.

Though if nothing else, previous to this change:

```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```

But after this change it does not print anything (as desired).

One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.

Michael Bolin · 2025-05-22 09:51:19 -07:00

cb379d7797

feat: add mcp subcommand to CLI to run Codex as an MCP server (#934 )

Previously, running Codex as an MCP server required a standalone binary
in our Cargo workspace, but this PR makes it available as a subcommand
(`mcp`) of the main CLI.

Ran this with:

```
RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --bin codex -- mcp
```

and verified it worked as expected in the inspector at
`http://127.0.0.1:6274/`.

Michael Bolin · 2025-05-14 13:15:41 -07:00

497c5396c0

Disallow expect via lints (#865 )

Adds `expect()` as a denied lint. Same deal applies with `unwrap()`
where we now need to put `#[expect(...` on ones that we legit want. Took
care to enable `expect()` in test contexts.

# Tests

```
cargo fmt
cargo clippy --all-features --all-targets --no-deps -- -D warnings
cargo test
```

jcoens-openai · 2025-05-12 08:45:46 -07:00

f3bd143867

feat: experimental env var: CODEX_SANDBOX_NETWORK_DISABLED (#879 )

When using Codex to develop Codex itself, I noticed that sometimes it
would try to add `#[ignore]` to the following tests:

```
keeps_previous_response_id_between_tasks()
retries_on_early_close()
```

Both of these tests start a `MockServer` that launches an HTTP server on
an ephemeral port and requires network access to hit it, which the
Seatbelt policy associated with `--full-auto` correctly denies. If I
wasn't paying attention to the code that Codex was generating, one of
these `#[ignore]` annotations could have slipped into the codebase,
effectively disabling the test for everyone.

To that end, this PR enables an experimental environment variable named
`CODEX_SANDBOX_NETWORK_DISABLED` that is set to `1` if the
`SandboxPolicy` used to spawn the process does not have full network
access. I say it is "experimental" because I'm not convinced this API is
quite right, but we need to start somewhere. (It might be more
appropriate to have an env var like `CODEX_SANDBOX=full-auto`, but the
challenge is that our newer `SandboxPolicy` abstraction does not map to
a simple set of enums like in the TypeScript CLI.)

We leverage this new functionality by adding the following code to the
aforementioned tests as a way to "dynamically disable" them:

```rust
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
    println!(
        "Skipping test because it cannot execute when network is disabled in a Codex sandbox."
    );
    return;
}
```

We can use the `debug seatbelt --full-auto` command to verify that
`cargo test` fails when run under Seatbelt prior to this change:

```
$ cargo run --bin codex -- debug seatbelt --full-auto -- cargo test
---- keeps_previous_response_id_between_tasks stdout ----

thread 'keeps_previous_response_id_between_tasks' panicked at /Users/mbolin/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wiremock-0.6.3/src/mock_server/builder.rs:107:46:
Failed to bind an OS port for a mock server.: Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    keeps_previous_response_id_between_tasks

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass `-p codex-core --test previous_response_id`
```

Though after this change, the above command succeeds! This means that,
going forward, when Codex operates on Codex itself, when it runs `cargo
test`, only "real failures" should cause the command to fail.

As part of this change, I decided to tighten up the codepaths for
running `exec()` for shell tool calls. In particular, we do it in `core`
for the main Codex business logic itself, but we also expose this logic
via `debug` subcommands in the CLI in the `cli` crate. The logic for the
`debug` subcommands was not quite as faithful to the true business logic
as I liked, so I:

* refactored a bit of the Linux code, splitting `linux.rs` into
`linux_exec.rs` and `landlock.rs` in the `core` crate.
* gating less code behind `#[cfg(target_os = "linux")]` because such
code does not get built by default when I develop on Mac, which means I
either have to build the code in Docker or wait for CI signal
* introduced `macro_rules! configure_command` in `exec.rs` so we can
have both sync and async versions of this code. The synchronous version
seems more appropriate for straight threads or potentially fork/exec.

Michael Bolin · 2025-05-09 18:29:34 -07:00

fde48aaa0d

Workspace lints and disallow unwrap (#855 )

Sets submodules to use workspace lints. Added denying unwrap as a
workspace level lint, which found a couple of cases where we could have
propagated errors. Also manually labeled ones that were fine by my eye.

jcoens-openai · 2025-05-08 09:46:18 -07:00

87cf120873

fix: creating an instance of Codex requires a Config (#859 )

I discovered that I accidentally introduced a change in
https://github.com/openai/codex/pull/829 where we load a fresh `Config`
in the middle of `codex.rs`:


https://github.com/openai/codex/blob/c3e10e180a341e719f61014ea508f6d9dbffe05b/codex-rs/core/src/codex.rs#L515-L522

This is not good because the `Config` could differ from the one that has
the user's overrides specified from the CLI. Also, in unit tests, it
means the `Config` was picking up my personal settings as opposed to
using a vanilla config, which was problematic.

This PR cleans things up by moving the common case where
`Op::ConfigureSession` is derived from `Config` (originally done in
`codex_wrapper.rs`) and making it the standard way to initialize `Codex`
by putting it in `Codex::spawn()`. Note this also eliminates quite a bit
of boilerplate from the tests and relieves the caller of the
responsibility of minting out unique IDs when invoking `submit()`.

Michael Bolin · 2025-05-07 16:33:28 -07:00

cfe50c7107

Update cargo to 2024 edition (#842 )

Some effects of this change:
- New formatting changes across many files. No functionality changes
should occur from that.
- Calls to `set_env` are considered unsafe, since this only happens in
tests we wrap them in `unsafe` blocks

jcoens-openai · 2025-05-07 08:37:48 -07:00

8a89d3aeda

chore: introduce codex-common crate (#843 )

I started this PR because I wanted to share the `format_duration()`
utility function in `codex-rs/exec/src/event_processor.rs` with the TUI.
The question was: where to put it?

`core` should have as few dependencies as possible, so moving it there
would introduce a dependency on `chrono`, which seemed undesirable.
`core` already had this `cli` feature to deal with a similar situation
around sharing common utility functions, so I decided to:

* make `core` feature-free
* introduce `common`
* `common` can have as many "special interest" features as it needs,
each of which can declare their own deps
* the first two features of common are `cli` and `elapsed`

In practice, this meant updating a number of `Cargo.toml` files,
replacing this line:

```toml
codex-core = { path = "../core", features = ["cli"] }
```

with these:

```toml
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
```

Moving `format_duration()` into its own file gave it some "breathing
room" to add a unit test, so I had Codex generate some tests and new
support for durations over 1 minute.

Michael Bolin · 2025-05-06 17:38:56 -07:00

c577e94b67

feat: make cwd a required field of Config so we stop assuming std::env::current_dir() in a session (#800 )

In order to expose Codex via an MCP server, I realized that we should be
taking `cwd` as a parameter rather than assuming
`std::env::current_dir()` as the `cwd`. Specifically, the user may want
to start a session in a directory other than the one where the MCP
server has been started.

This PR makes `cwd: PathBuf` a required field of `Session` and threads
it all the way through, though I think there is still an issue with not
honoring `workdir` for `apply_patch`, which is something we also had to
fix in the TypeScript version: https://github.com/openai/codex/pull/556.

This also adds `-C`/`--cd` to change the cwd via the command line.

To test, I ran:

```
cargo run --bin codex -- exec -C /tmp 'show the output of ls'
```

and verified it showed the contents of my `/tmp` folder instead of
`$PWD`.

Michael Bolin · 2025-05-04 10:57:12 -07:00

421e159888

chore: remove the REPL crate/subcommand (#754 )

@oai-ragona and I discussed it, and we feel the REPL crate has served
its purpose, so we're going to delete the code and future archaeologists
can find it in Git history.

Michael Bolin · 2025-04-30 10:15:50 -07:00

c432d9ef81

58 Commits