Files
codex/codex-rs/core
T
Celia Chen 88694e8417 chore: stop app-server auth refresh storms after permanent token failure (#15530)
built from #14256. PR description from @etraut-openai:

This PR addresses a hole in [PR
11802](https://github.com/openai/codex/pull/11802). The previous PR
assumed that app server clients would respond to token refresh failures
by presenting the user with an error ("you must log in again") and then
not making further attempts to call network endpoints using the expired
token. While they do present the user with this error, they don't
prevent further attempts to call network endpoints and can repeatedly
call `getAuthStatus(refreshToken=true)` resulting in many failed calls
to the token refresh endpoint.

There are three solutions I considered here:
1. Change the getAuthStatus app server call to return a null auth if the
caller specified "refreshToken" on input and the refresh attempt fails.
This will cause clients to immediately log out the user and return them
to the log in screen. This is a really bad user experience. It's also a
breaking change in the app server contract that could break third-party
clients.
2. Augment the getAuthStatus app server call to return an additional
field that indicates the state of "token could not be refreshed". This
is a non-breaking change to the app server API, but it requires
non-trivial changes for all clients to properly handle this new field
properly.
3. Change the getAuthStatus implementation to handle the case where a
token refresh fails by marking the AuthManager's in-memory access and
refresh tokens as "poisoned" so it they are no longer used. This is the
simplest fix that requires no client changes.

I chose option 3.

Here's Codex's explanation of this change:

When an app-server client asks `getAuthStatus(refreshToken=true)`, we
may try to refresh a stale ChatGPT access token. If that refresh fails
permanently (for example `refresh_token_reused`, expired, or revoked),
the old behavior was bad in two ways:

1. We kept the in-memory auth snapshot alive as if it were still usable.
2. Later auth checks could retry refresh again and again, creating a
storm of doomed `/oauth/token` requests and repeatedly surfacing the
same failure.

This is especially painful for app-server clients because they poll auth
status and can keep driving the refresh path without any real chance of
recovery.

This change makes permanent refresh failures terminal for the current
managed auth snapshot without changing the app-server API contract.

What changed:
- `AuthManager` now poisons the current managed auth snapshot in memory
after a permanent refresh failure, keyed to the unchanged `AuthDotJson`.
- Once poisoned, later refresh attempts for that same snapshot fail fast
locally without calling the auth service again.
- The poison is cleared automatically when auth materially changes, such
as a new login, logout, or reload of different auth state from storage.
- `getAuthStatus(includeToken=true)` now omits `authToken` after a
permanent refresh failure instead of handing out the stale cached bearer
token.

This keeps the current auth method visible to clients, avoids forcing an
immediate logout flow, and stops repeated refresh attempts for
credentials that cannot recover.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
88694e8417 ยท 2026-03-24 12:39:58 -07:00
History
..

codex-core

This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.

Dependencies

Note that codex-core makes some assumptions about certain helper utilities being available in the environment. Currently, this support matrix is:

macOS

Expects /usr/bin/sandbox-exec to be present.

When using the workspace-write sandbox policy, the Seatbelt profile allows writes under the configured writable roots while keeping .git (directory or pointer file), the resolved gitdir: target, and .codex read-only.

Network access and filesystem read/write roots are controlled by SandboxPolicy. Seatbelt consumes the resolved policy and enforces it.

Seatbelt also supports macOS permission-profile extensions layered on top of SandboxPolicy:

  • no extension profile provided: keeps legacy default preferences read access (user-preference-read).
  • extension profile provided with no macos_preferences grant: does not add preferences access clauses.
  • macos_preferences = "readonly": enables cfprefs read clauses and user-preference-read.
  • macos_preferences = "readwrite": includes readonly clauses plus user-preference-write and cfprefs shm write clauses.
  • macos_automation = true: enables broad Apple Events send permissions.
  • macos_automation = ["com.apple.Notes", ...]: enables Apple Events send only to listed bundle IDs.
  • macos_launch_services = true: enables LaunchServices lookups and open/launch operations.
  • macos_accessibility = true: enables com.apple.axserver mach lookup.
  • macos_calendar = true: enables com.apple.CalendarAgent mach lookup.
  • macos_contacts = "read_only": enables Address Book read access and Contacts read services.
  • macos_contacts = "read_write": includes the readonly Contacts clauses plus Address Book writes and keychain/temp helpers required for writes.

Linux

Expects the binary containing codex-core to run the equivalent of codex sandbox linux (legacy alias: codex debug landlock) when arg0 is codex-linux-sandbox. See the codex-arg0 crate for details.

Legacy SandboxPolicy / sandbox_mode configs are still supported on Linux. They can continue to use the legacy Landlock path when the split filesystem policy is sandbox-equivalent to the legacy model after cwd resolution.

Split filesystem policies that need direct FileSystemSandboxPolicy enforcement, such as read-only or denied carveouts under a broader writable root, automatically route through bubblewrap. The legacy Landlock path is used only when the split filesystem policy round-trips through the legacy SandboxPolicy model without changing semantics. That includes overlapping cases like /repo = write, /repo/a = none, /repo/a/b = write, where the more specific writable child must reopen under a denied parent.

The Linux sandbox helper prefers /usr/bin/bwrap whenever it is available and supports the required argv-rewrite flags, and falls back to the vendored bubblewrap path compiled into the binary otherwise. When /usr/bin/bwrap is missing or too old to support the required flags, Codex also surfaces a startup warning through its normal notification path instead of printing directly from the sandbox helper.

Windows

Legacy SandboxPolicy / sandbox_mode configs are still supported on Windows.

The elevated setup/runner backend supports legacy ReadOnlyAccess::Restricted for read-only and workspace-write policies. Restricted read access honors explicit readable roots plus the command cwd, and keeps writable roots readable when workspace-write is used.

When include_platform_defaults = true, the elevated Windows backend adds backend-managed system read roots required for basic execution, such as C:\Windows, C:\Program Files, C:\Program Files (x86), and C:\ProgramData. When it is false, those extra system roots are omitted.

The unelevated restricted-token backend still supports the legacy full-read Windows model only. Restricted read-only policies continue to fail closed there instead of running with weaker read enforcement.

New [permissions] / split filesystem policies remain supported on Windows only when they round-trip through the legacy SandboxPolicy model without changing semantics. Richer split-only carveouts still fail closed instead of running with weaker enforcement.

All Platforms

Expects the binary containing codex-core to simulate the virtual apply_patch CLI when arg1 is --codex-run-as-apply-patch. See the codex-arg0 crate for details.