Add Python SDK public API and examples (#14446)

## TL;DR

WIP esp the examples

Thin the Python SDK public surface so the wrapper layer returns canonical app-server generated models directly.

- keeps `Codex` / `AsyncCodex` / `Thread` / `Turn` and input helpers, but removes alias-only type layers and custom result models
- `metadata` now returns `InitializeResponse` and `run()` returns the generated app-server `Turn`
- updates docs, examples, notebook, and tests to use canonical generated types and regenerates `v2_all.py` against current schema
- keeps the pinned runtime-package integration flow and real integration coverage

## Validation

- `PYTHONPATH=sdk/python/src python3 -m pytest sdk/python/tests`
- `GH_TOKEN="$(gh auth token)" RUN_REAL_CODEX_TESTS=1 PYTHONPATH=sdk/python/src python3 -m pytest sdk/python/tests -rs`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-07-01 00:31:56 +08:00 · 2026-03-17 16:05:56 -07:00
parent 0d1539e74c
commit fc75d07504
46 changed files with 5081 additions and 69 deletions
@@ -0,0 +1,190 @@
+# Codex App Server SDK — API Reference
+
+Public surface of `codex_app_server` for app-server v2.
+
+This SDK surface is experimental. The current implementation intentionally allows only one active `TurnHandle.stream()` or `TurnHandle.run()` consumer per client instance at a time.
+
+## Package Entry
+
+```python
+from codex_app_server import (
+    Codex,
+    AsyncCodex,
+    Thread,
+    AsyncThread,
+    TurnHandle,
+    AsyncTurnHandle,
+    InitializeResponse,
+    Input,
+    InputItem,
+    TextInput,
+    ImageInput,
+    LocalImageInput,
+    SkillInput,
+    MentionInput,
+    TurnStatus,
+)
+from codex_app_server.generated.v2_all import ThreadItem
+```
+
+- Version: `codex_app_server.__version__`
+- Requires Python >= 3.10
+- Canonical generated app-server models live in `codex_app_server.generated.v2_all`
+
+## Codex (sync)
+
+```python
+Codex(config: AppServerConfig | None = None)
+```
+
+Properties/methods:
+
+- `metadata -> InitializeResponse`
+- `close() -> None`
+- `thread_start(*, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, ephemeral=None, model=None, model_provider=None, personality=None, sandbox=None) -> Thread`
+- `thread_list(*, archived=None, cursor=None, cwd=None, limit=None, model_providers=None, sort_key=None, source_kinds=None) -> ThreadListResponse`
+- `thread_resume(thread_id: str, *, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, model=None, model_provider=None, personality=None, sandbox=None) -> Thread`
+- `thread_fork(thread_id: str, *, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, model=None, model_provider=None, sandbox=None) -> Thread`
+- `thread_archive(thread_id: str) -> ThreadArchiveResponse`
+- `thread_unarchive(thread_id: str) -> Thread`
+- `models(*, include_hidden: bool = False) -> ModelListResponse`
+
+Context manager:
+
+```python
+with Codex() as codex:
+    ...
+```
+
+## AsyncCodex (async parity)
+
+```python
+AsyncCodex(config: AppServerConfig | None = None)
+```
+
+Preferred usage:
+
+```python
+async with AsyncCodex() as codex:
+    ...
+```
+
+`AsyncCodex` initializes lazily. Context entry is the standard path because it
+ensures startup and shutdown are paired explicitly.
+
+Properties/methods:
+
+- `metadata -> InitializeResponse`
+- `close() -> Awaitable[None]`
+- `thread_start(*, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, ephemeral=None, model=None, model_provider=None, personality=None, sandbox=None) -> Awaitable[AsyncThread]`
+- `thread_list(*, archived=None, cursor=None, cwd=None, limit=None, model_providers=None, sort_key=None, source_kinds=None) -> Awaitable[ThreadListResponse]`
+- `thread_resume(thread_id: str, *, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, model=None, model_provider=None, personality=None, sandbox=None) -> Awaitable[AsyncThread]`
+- `thread_fork(thread_id: str, *, approval_policy=None, base_instructions=None, config=None, cwd=None, developer_instructions=None, ephemeral=None, model=None, model_provider=None, sandbox=None) -> Awaitable[AsyncThread]`
+- `thread_archive(thread_id: str) -> Awaitable[ThreadArchiveResponse]`
+- `thread_unarchive(thread_id: str) -> Awaitable[AsyncThread]`
+- `models(*, include_hidden: bool = False) -> Awaitable[ModelListResponse]`
+
+Async context manager:
+
+```python
+async with AsyncCodex() as codex:
+    ...
+```
+
+## Thread / AsyncThread
+
+`Thread` and `AsyncThread` share the same shape and intent.
+
+### Thread
+
+- `turn(input: Input, *, approval_policy=None, cwd=None, effort=None, model=None, output_schema=None, personality=None, sandbox_policy=None, summary=None) -> TurnHandle`
+- `read(*, include_turns: bool = False) -> ThreadReadResponse`
+- `set_name(name: str) -> ThreadSetNameResponse`
+- `compact() -> ThreadCompactStartResponse`
+
+### AsyncThread
+
+- `turn(input: Input, *, approval_policy=None, cwd=None, effort=None, model=None, output_schema=None, personality=None, sandbox_policy=None, summary=None) -> Awaitable[AsyncTurnHandle]`
+- `read(*, include_turns: bool = False) -> Awaitable[ThreadReadResponse]`
+- `set_name(name: str) -> Awaitable[ThreadSetNameResponse]`
+- `compact() -> Awaitable[ThreadCompactStartResponse]`
+
+## TurnHandle / AsyncTurnHandle
+
+### TurnHandle
+
+- `steer(input: Input) -> TurnSteerResponse`
+- `interrupt() -> TurnInterruptResponse`
+- `stream() -> Iterator[Notification]`
+- `run() -> codex_app_server.generated.v2_all.Turn`
+
+Behavior notes:
+
+- `stream()` and `run()` are exclusive per client instance in the current experimental build
+- starting a second turn consumer on the same `Codex` instance raises `RuntimeError`
+
+### AsyncTurnHandle
+
+- `steer(input: Input) -> Awaitable[TurnSteerResponse]`
+- `interrupt() -> Awaitable[TurnInterruptResponse]`
+- `stream() -> AsyncIterator[Notification]`
+- `run() -> Awaitable[codex_app_server.generated.v2_all.Turn]`
+
+Behavior notes:
+
+- `stream()` and `run()` are exclusive per client instance in the current experimental build
+- starting a second turn consumer on the same `AsyncCodex` instance raises `RuntimeError`
+
+## Inputs
+
+```python
+@dataclass class TextInput: text: str
+@dataclass class ImageInput: url: str
+@dataclass class LocalImageInput: path: str
+@dataclass class SkillInput: name: str; path: str
+@dataclass class MentionInput: name: str; path: str
+
+InputItem = TextInput | ImageInput | LocalImageInput | SkillInput | MentionInput
+Input = list[InputItem] | InputItem
+```
+
+## Generated Models
+
+The SDK wrappers return and accept canonical generated app-server models wherever possible:
+
+```python
+from codex_app_server.generated.v2_all import (
+    AskForApproval,
+    ThreadReadResponse,
+    Turn,
+    TurnStartParams,
+    TurnStatus,
+)
+```
+
+## Retry + errors
+
+```python
+from codex_app_server import (
+    retry_on_overload,
+    JsonRpcError,
+    MethodNotFoundError,
+    InvalidParamsError,
+    ServerBusyError,
+    is_retryable_error,
+)
+```
+
+- `retry_on_overload(...)` retries transient overload errors with exponential backoff + jitter.
+- `is_retryable_error(exc)` checks if an exception is transient/overload-like.
+
+## Example
+
+```python
+from codex_app_server import Codex, TextInput
+
+with Codex() as codex:
+    thread = codex.thread_start(model="gpt-5.4", config={"model_reasoning_effort": "high"})
+    completed_turn = thread.turn(TextInput("Say hello in one sentence.")).run()
+    print(completed_turn.id, completed_turn.status)
+```
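
The retry behaviour listed under "Retry + errors" (exponential backoff plus jitter on transient overload errors) can be pictured with the generic sketch below. It is not the SDK's implementation: `retry_with_backoff`, `TransientError`, and every parameter name here are hypothetical stand-ins, and the real `retry_on_overload(...)` signature may differ.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for an overload-style error such as ServerBusyError."""


def retry_with_backoff(
    op: Callable[[], T],
    *,
    max_attempts: int = 4,
    base_delay_s: float = 0.25,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op(), retrying TransientError with exponential backoff plus full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay_s.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Full jitter: wait a random duration in [0, cap].
            sleep(random.uniform(0.0, cap))
    raise AssertionError("unreachable when max_attempts >= 1")


# Usage: fail twice, then succeed. Delays are recorded instead of slept.
calls = {"n": 0}
delays: list[float] = []


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("server busy")
    return "ok"


result = retry_with_backoff(flaky, sleep=delays.append)
print(result, calls["n"], len(delays))
```

Only the transient error type is retried; anything else propagates immediately, which matches the FAQ advice not to retry every error blindly.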
@@ -8,24 +8,45 @@

 ## `run()` vs `stream()`

- `Turn.run()` is the easiest path. It consumes events until completion and returns `TurnResult`.
- `Turn.stream()` yields raw notifications (`Notification`) so you can react event-by-event.
+- `TurnHandle.run()` / `AsyncTurnHandle.run()` is the easiest path. It consumes events until completion and returns the canonical generated app-server `Turn` model.
+- `TurnHandle.stream()` / `AsyncTurnHandle.stream()` yields raw notifications (`Notification`) so you can react event-by-event.

 Choose `run()` for most apps. Choose `stream()` for progress UIs, custom timeout logic, or custom parsing.

 ## Sync vs async clients

- `Codex` is the minimal sync SDK and best default.
- `AsyncAppServerClient` wraps the sync transport with `asyncio.to_thread(...)` for async-friendly call sites.
+- `Codex` is the sync public API.
+- `AsyncCodex` is an async replica of the same public API shape.
+- Prefer `async with AsyncCodex()` for async code. It is the standard path for
+  explicit startup/shutdown, and `AsyncCodex` initializes lazily on context
+  entry or first awaited API use.

 If your app is not already async, stay with `Codex`.

-## `thread(...)` vs `thread_resume(...)`
+## Public kwargs are snake_case

- `codex.thread(thread_id)` only binds a local helper to an existing thread ID.
- `codex.thread_resume(thread_id, ...)` performs a `thread/resume` RPC and can apply overrides (model, instructions, sandbox, etc.).
+Public API keyword names are snake_case. The SDK still maps them to wire camelCase under the hood.

-Use `thread(...)` for simple continuation. Use `thread_resume(...)` when you need explicit resume semantics or override fields.
+If you are migrating older code, update these names:
+
+- `approvalPolicy` -> `approval_policy`
+- `baseInstructions` -> `base_instructions`
+- `developerInstructions` -> `developer_instructions`
+- `modelProvider` -> `model_provider`
+- `modelProviders` -> `model_providers`
+- `sortKey` -> `sort_key`
+- `sourceKinds` -> `source_kinds`
+- `outputSchema` -> `output_schema`
+- `sandboxPolicy` -> `sandbox_policy`
+
+## Why only `thread_start(...)` and `thread_resume(...)`?
+
+The public API keeps only explicit lifecycle calls:
+
+- `thread_start(...)` to create new threads
+- `thread_resume(thread_id, ...)` to continue existing threads
+
+This avoids duplicate ways to do the same operation and keeps behavior explicit.

 ## Why does the constructor fail?

@@ -61,7 +82,7 @@ python scripts/update_sdk_artifacts.py \
 A turn is complete only when `turn/completed` arrives for that turn ID.

 - `run()` waits for this automatically.
- With `stream()`, make sure you keep consuming notifications until completion.
+- With `stream()`, keep consuming notifications until completion.

 ## How do I retry safely?

@@ -72,6 +93,6 @@ Do not blindly retry all errors. For `InvalidParamsError` or `MethodNotFoundErro
 ## Common pitfalls

 - Starting a new thread for every prompt when you wanted continuity.
- Forgetting to `close()` (or not using `with Codex() as codex:`).
- Ignoring `TurnResult.status` and `TurnResult.error`.
- Mixing SDK input classes with raw dicts incorrectly in minimal API paths.
+- Forgetting to `close()` (or not using context managers).
+- Assuming `run()` returns extra SDK-only fields instead of the generated `Turn` model.
+- Mixing SDK input classes with raw dicts incorrectly.
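
The snake_case-to-camelCase mapping described under "Public kwargs are snake_case" can be pictured as a simple key rewrite. The sketch below only illustrates the general idea and is not the SDK's code: `snake_to_camel` and `to_wire_params` are hypothetical helper names, the argument values are placeholders, and dropping `None` values is an illustrative choice.

```python
from typing import Any


def snake_to_camel(name: str) -> str:
    """Convert a snake_case keyword to camelCase, e.g. approval_policy -> approvalPolicy."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_wire_params(**kwargs: Any) -> dict[str, Any]:
    """Build a camelCase params dict, omitting keywords left at None (illustrative choice)."""
    return {snake_to_camel(key): value for key, value in kwargs.items() if value is not None}


# Placeholder values only; they are not real policy or schema settings.
params = to_wire_params(
    approval_policy="example",
    model_provider=None,
    output_schema={"type": "object"},
)
print(params)
```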
@@ -1,6 +1,8 @@
 # Getting Started

-This is the fastest path from install to a multi-turn thread using the minimal SDK surface.
+This is the fastest path from install to a multi-turn thread using the public SDK surface.
+
+The SDK is experimental. Treat the API, bundled runtime strategy, and packaging details as unstable until the first public release.

 ## 1) Install

@@ -15,30 +17,32 @@ Requirements:

 - Python `>=3.10`
 - installed `codex-cli-bin` runtime package, or an explicit `codex_bin` override
- Local Codex auth/session configured
+- local Codex auth/session configured

-## 2) Run your first turn
+## 2) Run your first turn (sync)

 ```python
 from codex_app_server import Codex, TextInput

 with Codex() as codex:
-    print("Server:", codex.metadata.server_name, codex.metadata.server_version)
+    server = codex.metadata.serverInfo
+    print("Server:", None if server is None else server.name, None if server is None else server.version)

-    thread = codex.thread_start(model="gpt-5")
-    result = thread.turn(TextInput("Say hello in one sentence.")).run()
+    thread = codex.thread_start(model="gpt-5.4", config={"model_reasoning_effort": "high"})
+    completed_turn = thread.turn(TextInput("Say hello in one sentence.")).run()

-    print("Thread:", result.thread_id)
-    print("Turn:", result.turn_id)
-    print("Status:", result.status)
-    print("Text:", result.text)
+    print("Thread:", thread.id)
+    print("Turn:", completed_turn.id)
+    print("Status:", completed_turn.status)
+    print("Items:", len(completed_turn.items or []))
 ```

 What happened:

 - `Codex()` started and initialized `codex app-server`.
 - `thread_start(...)` created a thread.
- `turn(...).run()` consumed events until `turn/completed` and returned a `TurnResult`.
+- `turn(...).run()` consumed events until `turn/completed` and returned the canonical generated app-server `Turn` model.
+- In the current experimental build, one client can have only one active `TurnHandle.stream()` / `TurnHandle.run()` consumer at a time.

 ## 3) Continue the same thread (multi-turn)

@@ -46,16 +50,37 @@ What happened:
 from codex_app_server import Codex, TextInput

 with Codex() as codex:
-    thread = codex.thread_start(model="gpt-5")
+    thread = codex.thread_start(model="gpt-5.4", config={"model_reasoning_effort": "high"})

     first = thread.turn(TextInput("Summarize Rust ownership in 2 bullets.")).run()
     second = thread.turn(TextInput("Now explain it to a Python developer.")).run()

-    print("first:", first.text)
-    print("second:", second.text)
+    print("first:", first.id, first.status)
+    print("second:", second.id, second.status)
 ```

-## 4) Resume an existing thread
+## 4) Async parity
+
+Use `async with AsyncCodex()` as the normal async entrypoint. `AsyncCodex`
+initializes lazily, and context entry makes startup/shutdown explicit.
+
+```python
+import asyncio
+from codex_app_server import AsyncCodex, TextInput
+
+
+async def main() -> None:
+    async with AsyncCodex() as codex:
+        thread = await codex.thread_start(model="gpt-5.4", config={"model_reasoning_effort": "high"})
+        turn = await thread.turn(TextInput("Continue where we left off."))
+        completed_turn = await turn.run()
+        print(completed_turn.id, completed_turn.status)
+
+
+asyncio.run(main())
+```
+
+## 5) Resume an existing thread

 ```python
 from codex_app_server import Codex, TextInput
@@ -63,12 +88,20 @@ from codex_app_server import Codex, TextInput
 THREAD_ID = "thr_123"  # replace with a real id

 with Codex() as codex:
-    thread = codex.thread(THREAD_ID)
-    result = thread.turn(TextInput("Continue where we left off.")).run()
-    print(result.text)
+    thread = codex.thread_resume(THREAD_ID)
+    completed_turn = thread.turn(TextInput("Continue where we left off.")).run()
+    print(completed_turn.id, completed_turn.status)
 ```

-## 5) Next stops
+## 6) Generated models
+
+The convenience wrappers live at the package root, but the canonical app-server models live under:
+
+```python
+from codex_app_server.generated.v2_all import Turn, TurnStatus, ThreadReadResponse
+```
+
+## 7) Next stops

 - API surface and signatures: `docs/api-reference.md`
 - Common decisions/pitfalls: `docs/faq.md`
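
The "consume events until `turn/completed`" behaviour noted for `run()` can be sketched without a live server. Everything below is simulated: the notification shape (plain dicts with `method` and `params`), the `turnId` key, the `example/progress` method name, and both helper functions are assumptions for illustration, not the SDK's `Notification` type.

```python
from typing import Any, Iterable, Iterator

Notification = dict[str, Any]  # hypothetical shape: {"method": str, "params": {...}}


def fake_stream() -> Iterator[Notification]:
    """Simulated notifications for two turns (illustrative data only)."""
    yield {"method": "example/progress", "params": {"turnId": "turn_a", "text": "hel"}}
    yield {"method": "turn/completed", "params": {"turnId": "turn_b"}}
    yield {"method": "example/progress", "params": {"turnId": "turn_a", "text": "lo"}}
    yield {"method": "turn/completed", "params": {"turnId": "turn_a"}}
    yield {"method": "example/progress", "params": {"turnId": "turn_a", "text": "late"}}


def consume_until_completed(events: Iterable[Notification], turn_id: str) -> list[Notification]:
    """Collect this turn's notifications, stopping at its own turn/completed."""
    seen: list[Notification] = []
    for event in events:
        if event["params"].get("turnId") != turn_id:
            continue  # belongs to another turn; ignore it
        seen.append(event)
        if event["method"] == "turn/completed":
            break  # only now is the turn finished
    return seen


events = consume_until_completed(fake_stream(), "turn_a")
print([event["method"] for event in events])
```

The loop stops only on the completion event for the matching turn ID, so a completion for a different turn does not end it early.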