Avoid persisting ShutdownComplete after thread shutdown (#19630)

## Why

Fixes #19475.

`codex exec` can finish successfully and then emit an `ERROR` on stderr:

```text
failed to record rollout items: thread <id> not found
```

That happens because shutdown closes the live thread writer before
emitting `ShutdownComplete`. The terminal event was still using the
normal `send_event_raw` path, so it tried to append rollout items
through a recorder that had already been removed. The answer is correct,
but wrappers that treat stderr as failure can retry completed exec runs.

This looks like a likely recent regression from
[#18882](https://github.com/openai/codex/pull/18882), which routed live
thread writes through `ThreadStore` and added the shutdown-time live
writer close. I have not bisected this, so the PR treats #18882 as the
likely source based on the affected shutdown code path rather than a
proven first-bad commit.

## What Changed

`ShutdownComplete` now bypasses rollout persistence after thread
shutdown and is delivered directly to clients. The shutdown path still
records the protocol event in the rollout trace before delivery,
preserving trace visibility without attempting a post-shutdown
thread-store append.

The change also adds a regression test with the in-memory thread store
to assert that shutdown creates and shuts down the live thread without
appending another item after shutdown.
This commit is contained in:
Eric Traut
2026-04-27 22:02:08 -07:00
committed by GitHub
Unverified
parent b7e5588d18
commit 5ba908d179
2 changed files with 39 additions and 1 deletions
+4 -1
View File
@@ -977,7 +977,10 @@ pub async fn shutdown(sess: &Arc<Session>, sub_id: String) -> bool {
id: sub_id,
msg: EventMsg::ShutdownComplete,
};
sess.send_event_raw(event).await;
sess.services
.rollout_thread_trace
.record_protocol_event(&event.msg);
sess.deliver_event_raw(event).await;
sess.services
.rollout_thread_trace
.record_ended(codex_rollout_trace::RolloutStatus::Completed);
+35
View File
@@ -4477,6 +4477,41 @@ async fn spawn_task_turn_span_inherits_dispatch_trace_context() {
);
}
#[cfg(debug_assertions)]
#[tokio::test]
async fn shutdown_complete_does_not_append_to_thread_store_after_shutdown() {
let (mut session, _turn_context) = make_session_and_context().await;
let store = Arc::new(codex_thread_store::InMemoryThreadStore::default());
let thread_store: Arc<dyn codex_thread_store::ThreadStore> = store.clone();
let live_thread = LiveThread::create(
Arc::clone(&thread_store),
CreateThreadParams {
thread_id: session.conversation_id,
forked_from_id: None,
source: SessionSource::Exec,
base_instructions: BaseInstructions::default(),
dynamic_tools: Vec::new(),
event_persistence_mode: ThreadEventPersistenceMode::Limited,
},
)
.await
.expect("create thread persistence");
session.services.thread_store = thread_store;
session.services.live_thread = Some(live_thread);
let session = Arc::new(session);
assert!(handlers::shutdown(&session, "sub-1".to_string()).await);
assert_eq!(
codex_thread_store::InMemoryThreadStoreCalls {
create_thread: 1,
shutdown_thread: 1,
..Default::default()
},
store.calls().await
);
}
#[tokio::test]
async fn shutdown_and_wait_allows_multiple_waiters() {
let (session, _turn_context) = make_session_and_context().await;