codex

mirror of https://github.com/pchuan98/codex.git synced 2026-07-01 00:31:56 +08:00

Test selected capabilities across unavailable resume (#30215)

## Why

The selected-capability integration test already covers initial
attachment and cold resume, but its resume step runs while the selected
executor is still reachable.

That leaves an important World State transition untested: a thread
remembers its selected capability root, resumes while that environment
is unavailable, and later sees the same stable environment return.

## What this tests

This extends the existing end-to-end scenario:

```text
selected executor available
        ↓
app-server stops and the executor goes away
        ↓
thread resumes with the executor unavailable
        ↓
skills, selected MCP tools, and connector attribution are absent
        ↓
the same environment ID is attached again
        ↓
skills, MCP tools, and connector attribution return
```

The test also checks that the unavailable snapshot explicitly tells the
model that no selected-environment skills are currently available. After
reattachment, it invokes the selected skill again and verifies that a
new executor-owned MCP process starts.
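The snapshot behavior checked above can be pictured with a small toy model. This is only an illustrative sketch, not the project's code: `Thread`, `Snapshot`, `attach`, `detach`, and the notice wording are all hypothetical names chosen for this example. It assumes, as the PR does, that an environment ID refers to stable capability contents.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical wording; the real notice text is not given in this PR.
UNAVAILABLE_NOTICE = "No selected-environment skills are currently available."


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for what the model is shown on one sample."""
    skills: Tuple[str, ...]
    mcp_tools: Tuple[str, ...]
    connector_attribution: Optional[str]
    notice: Optional[str]


@dataclass
class Thread:
    """Toy thread that durably remembers its selected environment ID."""
    selected_env_id: str
    # Only environments that are currently attached appear here.
    attached: Dict[str, dict] = field(default_factory=dict)

    def attach(self, env_id: str, caps: dict) -> None:
        self.attached[env_id] = caps

    def detach(self, env_id: str) -> None:
        self.attached.pop(env_id, None)

    def snapshot(self) -> Snapshot:
        caps = self.attached.get(self.selected_env_id)
        if caps is None:
            # Environment unavailable: nothing selected is exposed, and the
            # snapshot says so explicitly instead of staying silent.
            return Snapshot((), (), None, UNAVAILABLE_NOTICE)
        return Snapshot(
            tuple(caps["skills"]),
            tuple(caps["mcp_tools"]),
            caps["plugin"],
            None,
        )
```

In this model, detaching `E1` empties the snapshot and adds the notice, and attaching the same ID again yields a snapshot equal to the original one, because the selected ID itself was never forgotten.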

## Scope

This is test-only. It keeps the existing assumption that an environment
ID refers to stable capability contents. It does not add package-file
invalidation or live transport reconnect behavior.

jif · 2026-06-26 11:02:27 +01:00

3c03bb4f18

Test selected capabilities across availability and resume (#30157)

## Why

This stack crosses World State, executor skills, selected plugin
metadata, MCP processes, connectors, dynamic environments, and resume.
This PR adds two end-to-end scenarios that validate those pieces
together.

Both tests enable `deferred_executor`, so they exercise the real
delayed-environment path.

## Scenario 1: availability across turns and resume

```text
1. Start a thread with one selected plugin root bound to E1.
2. E1 is unavailable.
   - executor skill is absent
   - selected MCP is absent
   - connector has no selected-plugin attribution
3. Start E1 and register the same stable environment ID.
4. Start a new turn.
   - the executor skill appears through World State
   - its body beats a colliding host skill
   - the selected MCP tool is advertised and executes inside E1
   - the connector is attributed to the selected plugin
5. Start another turn without changing E1.
   - the MCP PID stays the same, proving runtime reuse
6. Restart app-server and resume the thread.
   - durable selected-root intent is restored
   - skills, MCP, and connector attribution are restored
   - a new MCP PID proves ephemeral process state was rebuilt
```
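Steps 5 and 6 rely on the MCP PID as evidence: an unchanged PID across turns shows runtime reuse, and a new PID after restart shows that ephemeral state was rebuilt while durable intent survived. A minimal sketch of that reasoning follows. `DurableStore`, `AppServer`, and the fake PID counter are hypothetical stand-ins for this example, not the real app-server types.

```python
import itertools

# Stand-in for OS-assigned PIDs; each "spawn" takes the next number.
_fake_pids = itertools.count(1000)


class DurableStore:
    """Toy persistent state that survives an app-server restart."""

    def __init__(self):
        self.selected_roots = {}  # thread_id -> env_id


class AppServer:
    """Toy server whose process table is lost when it is restarted."""

    def __init__(self, store):
        self.store = store
        self._mcp_pids = {}  # ephemeral: env_id -> pid

    def run_turn(self, thread_id):
        """Return the PID of the MCP process serving this thread's root."""
        env_id = self.store.selected_roots[thread_id]
        if env_id not in self._mcp_pids:
            # No live process for this environment yet: spawn one.
            self._mcp_pids[env_id] = next(_fake_pids)
        return self._mcp_pids[env_id]


store = DurableStore()
store.selected_roots["thread-1"] = "E1"

server = AppServer(store)
pid_turn_1 = server.run_turn("thread-1")
pid_turn_2 = server.run_turn("thread-1")  # same server: process is reused

server = AppServer(store)  # "restart": same store, empty process table
pid_after_resume = server.run_turn("thread-1")
```

Here `pid_turn_1 == pid_turn_2`, while `pid_after_resume` differs from both, which mirrors the two PID assertions in the scenario.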

## Scenario 2: availability changes inside one turn

```text
1. Start a turn while E1 is unavailable.
2. The first model sample sees no executor skill, MCP, or selected connector.
3. The turn pauses on request_user_input.
4. Start E1 and register it while that same turn is still active.
5. Continue the turn.
6. The very next model sample sees:
   - the executor skill catalog
   - the selected MCP tool
   - selected-plugin connector attribution
7. The model calls the MCP, and its output proves execution happened inside E1.
```

This second scenario specifically protects the aeon-style behavior:
capability state is captured again for every sampling step, not only at
the next user turn.
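The difference between capturing per sampling step and capturing once per turn can be shown with a toy event loop. This is a sketch under assumptions: `capture`, `run_turn`, and the event names are hypothetical and only model the ordering in Scenario 2.

```python
FULL_VIEW = ("executor_skill", "selected_mcp_tool", "connector_attribution")


def capture(env_available):
    """Return the capability view for the current environment state."""
    return FULL_VIEW if env_available else ()


def run_turn(events, per_sample=True):
    """Return the capability view seen by each model sample in one turn.

    events: sequence of "sample" or "attach_E1", in the order they occur.
    per_sample: if False, reuse the view captured at turn start instead.
    """
    env_available = False
    turn_start_view = capture(env_available)
    seen = []
    for event in events:
        if event == "attach_E1":
            env_available = True
        elif event == "sample":
            seen.append(capture(env_available) if per_sample else turn_start_view)
    return seen


# E1 is attached while the turn is paused between two samples.
events = ["sample", "attach_E1", "sample"]
views_per_sample = run_turn(events)
views_per_turn = run_turn(events, per_sample=False)
```

With per-sample capture, the second sample sees the full view; with a single capture at turn start, both samples see nothing. The second scenario guards against regressing to the latter.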

## Scope

These are integration tests only. They do not add a combinatorial matrix
for unsupported plugin-file mutation, environment generations, transport
disconnects, or delayed `required = true` executor MCPs.

jif · 2026-06-26 03:11:55 +01:00

25f50de6ed
