Commit Graph

2 Commits

  • Simplify Python hosting core (#6492)
    Remove linking, multicast, durable delivery, and host push machinery from the v1 hosting core. Keep those scenarios in a proposed follow-up ADR and update channel packages, samples, docs, tests, and workspace metadata around the smaller host/channel contract.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: add agent-framework-hosting core package (#5638)
    * feat(hosting): add agent-framework-hosting core package
    
    New ``agent-framework-hosting`` package implementing ADR 0026 / SPEC-002:
    the channel-neutral host that lets a single ``Agent`` (or ``Workflow``)
    fan out across multiple wire protocols ("channels") behind one Starlette
    ASGI app.
    
    Surface (re-exported from ``agent_framework_hosting``):
    
    - ``AgentFrameworkHost`` — wraps a hostable target, mounts channels onto
      an ASGI app, owns per-isolation-key ``AgentSession`` reuse, threads
      request context (``response_id`` / ``previous_response_id``) into
      context providers via an ``ExitStack`` of ``bind_request_context``
      calls, and exposes an opt-in Hypercorn ``serve()`` helper (extra
      ``[serve]``).
    - ``Channel`` protocol + ``ChannelContribution`` — the surface a channel
      package implements (routes, lifespans, identity hooks, …).
    - ``ChannelRequest`` / ``ChannelSession`` / ``ChannelIdentity`` /
      ``ChannelPush`` / ``ChannelCommand[Context]`` / ``ChannelRunHook`` /
      ``ChannelStreamTransformHook`` / ``DeliveryReport`` /
      ``HostedRunResult`` / ``ResponseTarget`` / ``ResponseTargetKind`` /
      ``apply_run_hook`` — channel-side dataclasses + helpers.
    - ``IsolationKeys`` + ``ISOLATION_HEADER_USER`` / ``..._CHAT`` +
      ``get/set/reset_current_isolation_keys`` — the host's ASGI middleware
      reads the ``x-agent-{user,chat}-isolation-key`` headers off each
      inbound request and exposes them to the agent stack via a
      ``ContextVar`` so storage-side providers (e.g.
      ``FoundryHostedAgentHistoryProvider``) can apply per-tenant
      partitioning without channels having to forward anything.
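
The ContextVar hand-off described in the last bullet can be sketched roughly as follows. This is an illustration, not the package's code: the helper and constant names come from the commit message, while the `IsolationKeys` field names and the `keys_from_headers` helper are assumptions made for the example.

```python
from __future__ import annotations

from contextvars import ContextVar, Token
from dataclasses import dataclass

# Header names as spelled out in the commit message.
ISOLATION_HEADER_USER = "x-agent-user-isolation-key"
ISOLATION_HEADER_CHAT = "x-agent-chat-isolation-key"


@dataclass(frozen=True)
class IsolationKeys:
    # Field names are hypothetical; the real dataclass may differ.
    user_key: str | None = None
    chat_key: str | None = None


_current: ContextVar[IsolationKeys | None] = ContextVar("isolation_keys", default=None)


def get_current_isolation_keys() -> IsolationKeys | None:
    return _current.get()


def set_current_isolation_keys(keys: IsolationKeys) -> Token:
    # The returned token lets middleware restore the previous value afterwards.
    return _current.set(keys)


def reset_current_isolation_keys(token: Token) -> None:
    _current.reset(token)


def keys_from_headers(headers: dict[str, str]) -> IsolationKeys:
    """Hypothetical helper: lift the two inbound headers into one value."""
    return IsolationKeys(
        user_key=headers.get(ISOLATION_HEADER_USER),
        chat_key=headers.get(ISOLATION_HEADER_CHAT),
    )
```

A middleware would typically call `set_current_isolation_keys` before dispatching the request and `reset_current_isolation_keys` in a `finally` block, so providers further down the stack only ever call the getter.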
    
    Includes 45 unit tests covering the host, channel contributions,
    isolation contextvar, and shared types. Registers the package in
    ``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching
    pyright ``executionEnvironments`` entry for tests.
    
    Hypercorn is an optional dependency (``[serve]`` extra); the soft import
    in ``serve()`` is annotated for pyright since it isn't on the default
    install.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(hosting): address PR-2 review comments
    
    Source-code changes
    - _suppress_already_consumed: narrow the contract. RuntimeError now logs
      at WARNING with exc_info; any other exception is still logged via
      exception(). Docstring clarifies that any non-clean teardown is observable.
    - _BoundResponseStream: add aclose() and route __await__ through
      get_final_response() so the binding is always released — fixes
      contextvar leak when channels abandon the stream or use the
      await-the-stream convenience.
    - Lifespan: aggregate startup/shutdown callback errors; every callback
      runs, all failures are logged with their qualname, and the first
      error is re-raised so Starlette still aborts boot.
    - _build_run_kwargs: switch session-cache write to dict.setdefault so
      concurrent racers cannot orphan a session if create_session ever
      yields.
    - _deliver_response: introduce DeliveryReport.failed for push outages
      vs explicit "no link" drops; an outage no longer triggers an
      originating fallback so the channel can decide degraded behaviour.
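
The lifespan aggregation rule above (every callback runs, each failure is logged with its qualname, the first error is re-raised) might look like the sketch below. The function name `run_all_callbacks` and the logger name are assumptions for illustration only.

```python
import logging
from typing import Awaitable, Callable, Optional, Sequence

logger = logging.getLogger("hosting.sketch")


async def run_all_callbacks(callbacks: Sequence[Callable[[], Awaitable[None]]]) -> None:
    """Run every callback, log each failure, then re-raise the first error."""
    first_error: Optional[Exception] = None
    for callback in callbacks:
        try:
            await callback()
        except Exception as exc:
            # Log with the callback's qualified name so operators can find it.
            name = getattr(callback, "__qualname__", repr(callback))
            logger.exception("lifespan callback %s failed", name)
            if first_error is None:
                first_error = exc
    if first_error is not None:
        # Re-raising keeps the "abort boot on failure" behaviour.
        raise first_error
```

Collecting errors instead of stopping at the first one means a failing startup hook cannot prevent later hooks from running, which is the property the bullet calls out.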
    
    Test additions
    - tests/test_isolation.py (new): full coverage of IsolationKeys, the
      contextvar helpers, header constants, and end-to-end ASGI
      middleware lift / reset / passthrough.
    - tests/test_host.py: TestBindRequestContext, TestBoundResponseStream
      (aclose / __await__ / __getattr__ forwarding / double-close
      idempotency), TestWrapInputListMessages (list[Message] LAST
      precedence), TestLifespanAggregation (startup + shutdown).
    - tests/test_types.py: TestApplyRunHook (sync/async/None), and
      TestDeliveryReport (new failed field).
    - Updated test_push_exception_marks_skipped ->
      test_push_exception_lands_in_failed_no_fallback to match the new
      delivery contract.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(hosting): address PR-2 round-2 review comments
    
    - Refactor workflow checkpoint restoration into shared helpers
      (_restore_workflow_checkpoint for blocking; the streaming sibling
      drains the rehydration stream) so the blocking and streaming paths
      rehydrate identically — clarifies the previously inline _maybe_restore
      by hoisting the pattern next to the blocking call site.
    - Document that blocking workflow output is text-only by design;
      richer modalities ride the streaming AgentResponseUpdate channel,
      which preserves all content parts.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * review: address PR-4 _host.py round 2 feedback
    
    These review comments were filed on PR-4 (#5640) but target lines that
    live in the hosting-core package (PR-2 / #5638), so the fixes land here
    and PR-4's stack will pick them up on rebase.
    
    - _suppress_already_consumed: narrow the RuntimeError catch to the two
      documented benign messages (`Inner stream not available`, `Event loop
      is closed`); any other RuntimeError now logs at ERROR with a full
      traceback so executor bugs / runner-context state errors / checkpoint
      RuntimeErrors during the post-run flush no longer masquerade as
      benign cleanup noise. Still no propagation (we're in an
      async-generator finally during teardown) — see the docstring.
    - _restore_workflow_checkpoint{,_streaming}: log a WARNING when a
      non-None latest checkpoint drains to zero events, so a stale or
      partially-written checkpoint_id surfaces as an operator signal
      instead of a silent state-loss.
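
As a rough illustration of the narrowed catch in the first bullet: the two message strings below are the ones quoted in the commit message, while the substring match, the helper names, and the WARNING level for the benign case are assumptions.

```python
import logging

logger = logging.getLogger("hosting.sketch")

# The two benign messages named in the commit message.
BENIGN_RUNTIME_ERRORS = ("Inner stream not available", "Event loop is closed")


def is_benign_teardown_error(exc: BaseException) -> bool:
    """True only for a RuntimeError carrying one of the documented messages."""
    return isinstance(exc, RuntimeError) and any(
        marker in str(exc) for marker in BENIGN_RUNTIME_ERRORS
    )


def log_teardown_error(exc: BaseException) -> int:
    """Log without propagating; return the level used (hypothetical helper)."""
    if is_benign_teardown_error(exc):
        logger.warning("benign teardown error: %s", exc)
        return logging.WARNING
    # Anything else is logged at ERROR with a full traceback.
    logger.error("unexpected error during stream teardown", exc_info=exc)
    return logging.ERROR
```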
    
    (The `deliver_response` "no destinations resolvable" vs "every
    destination errored" concern raised in 3198268038 is already addressed
    by the existing `failed` vs `skipped` distinction surfaced through
    `DeliveryReport.failed` — see lines 1080-1102 and the
    `DeliveryReport` docstring.)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(hosting): reject path-traversal patterns in checkpoint isolation_key
    
    The host's `_resolve_checkpoint_storage` joined `request.session.isolation_key`
    directly into the configured `checkpoint_location`. The key is caller-
    controlled — sourced from inbound headers (`x-agent-{user,chat}-isolation-key`
    injected by the Foundry runtime), from channel-supplied derivations such as
    `telegram:<chat_id>` / `entra:<oid>`, or from values set by a channel
    `run_hook`. A value like `../../../etc/foo` or an absolute path would let
    the resulting checkpoint directory escape the configured root (CWE-22).
    This matches the path-traversal class fixed upstream in #5851 for the
    foundry_hosting checkpoint storage.
    
    New `_checkpoint_path_for_isolation_key(root, isolation_key)` helper:
    
    - Uses a denylist (not allowlist) so legitimate namespaced keys
      (`telegram:42`, `entra:abc-def`) continue to pass through unmodified.
    - Rejects path separators (`/`, `\`), NUL, all-dot reductions (`.`, `..`,
      `...`, ...), absolute paths (`os.path.isabs`), and drive-letter prefixes
      (`os.path.splitdrive` plus an explicit `^[A-Za-z]:` check so payloads
      crafted on a POSIX host still fail closed if the resulting directory
      ever round-trips to Windows storage).
    - After joining, resolves both sides and verifies
      `target.is_relative_to(root)` as defence-in-depth.
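
The three checks above can be sketched as one function. This is a reconstruction from the bullet list, not the helper that shipped: raising `ValueError`, rejecting the empty string, and the exact error messages are assumptions.

```python
import os
import re
from pathlib import Path

_DRIVE_PREFIX = re.compile(r"^[A-Za-z]:")


def checkpoint_path_for_isolation_key(root: Path, isolation_key: object) -> Path:
    """Return root/isolation_key, or raise ValueError for an unsafe key."""
    if not isinstance(isolation_key, str) or not isolation_key:
        raise ValueError("isolation_key must be a non-empty string")
    # Denylist: separators and NUL are never valid inside a single path segment.
    if any(ch in isolation_key for ch in ("/", "\\", "\x00")):
        raise ValueError("isolation_key contains a path separator or NUL")
    # ".", "..", "...", and so on.
    if set(isolation_key) == {"."}:
        raise ValueError("isolation_key reduces to dots only")
    # Absolute paths and drive prefixes; the regex also fires on POSIX hosts.
    if (
        os.path.isabs(isolation_key)
        or os.path.splitdrive(isolation_key)[0]
        or _DRIVE_PREFIX.match(isolation_key)
    ):
        raise ValueError("isolation_key looks like an absolute or drive path")
    # Defence-in-depth: resolve both sides and confirm containment.
    resolved_root = root.resolve()
    target = (resolved_root / isolation_key).resolve()
    if not target.is_relative_to(resolved_root):  # requires Python 3.9+
        raise ValueError("isolation_key escapes the checkpoint root")
    return target
```

Namespaced keys such as `telegram:42` pass because the drive-letter pattern only matches a single letter followed by a colon at the start of the key.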
    
    `_resolve_checkpoint_storage` now logs a WARNING and returns `None` for
    invalid keys rather than crashing the request — checkpointing is best-
    effort and we prefer dropping it to letting one malformed key abort an
    otherwise valid agent run.
    
    Tests:
    
    - `TestCheckpointPathForIsolationKey` exercises the helper directly with
      legitimate keys (alphanumeric, `:`-namespaced, dotted, 200-char), all
      rejected traversal patterns from #5851's MSRC repro list, and
      non-string input.
    - `TestHostWorkflowCheckpointingPathTraversal` verifies the end-to-end
      request path: a traversal key (`../escape`) and an in-key separator
      (`evil/sub`) both produce a successful agent response with no files
      written under `checkpoint_location`, and the traversal case logs a
      WARNING citing `isolation_key`.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(hosting): address PR-2 round-3 review feedback + add response hooks
    
    Round-3 review comment fixes:
    
    - _types.py: drop the _EMPTY_MAPPING sentinel; ChannelIdentity.attributes
      uses plain dict() as the default — simpler, no extra symbol to track.
    - _host.py: drop the local `import asyncio` + `from typing import cast as
      _cast` inside `serve()`; rely on the module-level imports.
    - _host.py: switch `_log_incoming` to structured `extra={...}` payloads
      for both INFO and DEBUG so log aggregators get queryable fields.
    - _host.py: delete `_flat_context_providers` and stop descending into a
      `.providers` attribute. Aggregator providers (AggregateContextProvider /
      ContextProviderBase) are responsible for forwarding `response_context`
      to their children themselves; the host treats whatever
      `agent.context_providers` exposes as the final, flat list.
    - _host.py: stop collapsing agent / workflow output to text. `_invoke`
      forwards `AgentResponse.messages` (and `raw_response`) on the
      `HostedRunResult`. `_invoke_workflow` builds a per-event message list
      via a new `_workflow_output_to_messages` helper that preserves
      AgentResponse / AgentResponseUpdate / Message / Content branches and
      falls back to text only for arbitrary objects.
    - _host.py: `_workflow_event_to_update` carries Content payloads through
      unchanged so multi-modal workflow outputs (images, function-call
      metadata, ...) survive into channels.
    
    New features (per design discussion in the PR thread):
    
    - HostedRunResult: rebuilt around `messages: list[Message]` with
      `.text` / `.contents` as projections, a `raw_response` slot for the
      underlying AgentResponse, and a `replace(messages=..., raw_response=...)`
      clone helper used by the delivery layer for per-destination isolation.
      The `HostedRunResult(text="...")` ctor is preserved as a back-compat
      shim that synthesises a single assistant text message.
    - ResponseTarget: gains `echo_input: bool = False` (also exposed on
      `.channel(name, *, echo_input=...)` / `.channels([...], *, echo_input=...)`).
      When set, the host pushes the originating user message to each
      non-originating destination before the agent reply. Channels can
      filter or transform echoes via their response_hook.
    - DeliveryReport: add `echoed` / `echo_failed` tuples to surface
      per-destination outcomes of the new echo phase. Echo failures do not
      abort the corresponding response push on the same destination.
    - ChannelResponseHook + ChannelResponseContext + apply_response_hook:
      duck-typed `response_hook` attribute on channels for per-destination
      post-processing. Receives a clone of the HostedRunResult and a
      context carrying the request, channel name, destination identity,
      originating flag, and `is_echo` phase flag. Channels stay
      modality-aware (text-only wires flatten via the hook; card-capable
      channels render structured contents directly).
    - _deliver_response: clone-before-hook fan-out so a hook mutating one
      channel's payload cannot leak into another destination's view.
    
    Tests:
    
    - Update _FakeAgentResponse to expose `.messages` (single assistant text
      message synthesised from `text`) so existing tests pass unchanged on
      the new multi-modal _invoke path.
    - Replace the obsolete `test_bind_descends_one_level_into_providers_attribute`
      with a regression guard asserting the host does NOT descend into
      `.providers` (matches new contract).
    - New tests for HostedRunResult multi-modal preservation, echo_input
      fan-out with success + failure, response_hook applied per destination,
      per-destination mutation isolation, and is_echo phase observability.
    
    Docs:
    
    - spec 002: rewrite Canonical flow with the new input → run_hook → host
      → target → wrap → per-destination clone → response_hook → push
      pipeline; document multi-modality contract and per-destination
      cloning; add `echo_input` row to ResponseTarget table; rewrite
      HostedRunResult/HostedStreamResult row; add ChannelResponseHook /
      ChannelResponseContext / apply_response_hook table; log decisions
      Q28 (no host-side text collapse), Q29 (duck-typed response_hook),
      Q30 (opt-in `echo_input` on ResponseTarget).
    - ADR 0026: add ChannelResponseHook + multi-modality bullets;
      surface `echo_input` on the ResponseTarget bullet.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(hosting): drop HostedRunResult(text=...) back-compat shim; use from_text()
    
    Pre-release cleanup — no released callers to break, so consolidate on one
    canonical entry point plus a classmethod for the ergonomic
    single-text-message case:
    
    - HostedRunResult.__init__ takes ``messages`` positionally (required); no
      more ``text=`` kwarg overload, no more "synthesise an empty message
      when no args" path.
    - New HostedRunResult.from_text(text, *, role="assistant", raw_response=None)
      classmethod for the common "wrap a single text content as one message"
      case (tests, channels emitting plain strings, the echo-input phase
      wrapping a user's text turn).
    - ``_build_echo_payload`` uses ``HostedRunResult.from_text(raw, role="user")``
      for the ``str`` and fallback branches; the other branches use the plain
      ctor with explicit ``Message`` lists.
    - Tests rewritten to use ``from_text("reply")`` everywhere
      ``HostedRunResult(text="reply")`` appeared. Added an explicit
      ``test_from_text_role_kwarg_overrides_default`` regression guard.
    - spec 002: HostedRunResult row updated to describe the
      ``from_text(text, *, role="assistant")`` classmethod instead of the
      removed back-compat shim.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor(hosting-core): reshape HostedRunResult into generic typed envelope
    
    Replace the flattened multi-modal HostedRunResult (carrying
    messages/raw_response/.text projections) with a typed generic
    envelope around the target's full-fidelity output:
    
      class HostedRunResult(Generic[TResult]):
          result: TResult
          session: AgentSession | None
    
    - Agent targets produce HostedRunResult[AgentResponse]; channels
      read result.messages, result.text, result.value, result.response_id,
      result.usage_details directly off the underlying response.
    - Workflow targets produce HostedRunResult[WorkflowRunResult];
      channels iterate result.get_outputs() and inspect
      result.get_final_state() themselves (the host no longer collapses
      workflow outputs onto a synthesised message list).
    - The echo-input phase synthesises a HostedRunResult[AgentResponse]
      wrapping the user's turn so the same per-destination delivery
      machinery applies.
    - replace() is now {result, session} only; the host's clone is
      shallow — channels that need to mutate result itself are
      responsible for their own deep copy.
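
A minimal sketch of the envelope and its shallow `replace()`, assuming a plain dataclass is used; the real class may be built differently, and `session` is typed loosely here because `AgentSession` lives in the framework.

```python
from dataclasses import dataclass, replace as dc_replace
from typing import Any, Generic, Optional, TypeVar

TResult = TypeVar("TResult")


@dataclass
class HostedRunResult(Generic[TResult]):
    result: TResult
    session: Optional[Any] = None  # AgentSession in the real package

    def replace(self, **changes: Any) -> "HostedRunResult[TResult]":
        """Shallow clone: the copy shares `result` unless it is overridden."""
        return dc_replace(self, **changes)
```

Because the clone is shallow, a channel that mutates `result` in place would also affect other destinations holding the same object, which is why the commit asks such channels to deep-copy first.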
    
    Rationale: the earlier shape pre-shaped target output (collapsing
    workflows onto a Message list, losing per-executor outputs, final
    state, and structured value affordances). Carrying the target output
    unchanged keeps the host modality-agnostic, gives channel authors
    static typing where they want it, and removes 30+ lines of
    host-side projection helpers.
    
    Also updates ADR 0026 + spec 002 (Q3, Q28, Q29 amended; new Q31
    captures the generic-envelope decision and rationale).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(hosting-core): document echo vs response distinction for push channels
    
    The host already encodes the echo-vs-response phase via the
    underlying Message.role on the pushed HostedRunResult:
    
    - echo phase: payload.result.messages[*].role == "user"
    - response phase: payload.result.messages[*].role == "assistant"
    
    Both pushes go through the same ChannelPush.push(identity, payload)
    entry point. Channels distinguish either by inspecting role (which
    works for any push-capable channel) or — when a response_hook is
    wired — by branching on ChannelResponseContext.is_echo directly.
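
For a push-capable channel without a response_hook, the role-based check could be sketched like this. `Message` here is a stand-in dataclass and `render_for_wire` is a hypothetical helper; only the role values come from the commit message.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Message:
    # Stand-in for the framework's Message type.
    role: str
    text: str


def is_echo_payload(messages: Sequence[Message]) -> bool:
    """True when every pushed message carries the user role (echo phase)."""
    return bool(messages) and all(m.role == "user" for m in messages)


def render_for_wire(messages: Sequence[Message]) -> str:
    """Render echoes as quoted lines, replies as plain text."""
    prefix = "> " if is_echo_payload(messages) else ""
    return "\n".join(prefix + m.text for m in messages)
```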
    
    Expand the ChannelPush Protocol docstring to make this discoverable
    for channel implementers (esp. chat bots that cannot impersonate
    the user on their wire and need to render echoes as quoted /
    prefixed blocks rather than as bot replies).
    
    Mirror the explanation into the spec's echo_input section.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(hosting-core): fix quickstart to use current Agent API
    
    ChatAgent was renamed to Agent and the preferred construction pattern
    is client.as_agent(...). Also drop the sibling channel import so the
    snippet imports only modules declared as dependencies of this package;
    point readers at the sibling packages instead.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * test(hosting-core): drop redundant @pytest.mark.asyncio decorators
    
    asyncio_mode = "auto" is configured in pyproject.toml, so individual
    @pytest.mark.asyncio decorators are unnecessary.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(hosting): add authorization profiles + IdentityAllowlist seam to ADR/spec
    
    Composes `require_link` + `allowlist` into three named profiles (open,
    forced-link, allowlist) with the allowlist itself keyed on either the
    channel-native id (pre-link) or a verified IdP claim (post-link), plus
    `AnyOf`/`AllOf` combinators for mixed setups. Lifts the design into
    an explicit host seam (`host.authorize(...)` → `AuthorizationOutcome`
    of `Allowed` / `LinkRequired` / `Denied`) instead of leaving each
    channel to roll its own.
    
    Key contract bits:
    - Tri-state `AllowlistDecision` (ALLOW / DENY / ABSTAIN) so claim-based
      lists can ABSTAIN until claims are available without composition
      silently flipping that into DENY.
    - `AuthorizationContext` carries explicit `phase` + `claim_source`
      so allowlists can tell pre-link from post-link without overloading
      `verified_claims is None`.
    - Channel-side `allowlist: ... | Literal["inherit"] | None` with an
      explicit inheritance sentinel, so the host-level `default_allowlist`
      is opt-out, not opt-in.
    - Construction-time validator rejects silent-deny configurations
      (`LinkedClaimAllowlist` without a claim source) with a typed
      `ChannelConfigurationError`.
    - Group-chat denial mirrors the existing `LinkChallenge` DM-redirect
      pattern; only the redacted `user_message` reaches the wire,
      structured `log_details` stay in telemetry.
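
One plausible reading of the tri-state composition is sketched below. The commit message only states that ABSTAIN must not silently become DENY; the precedence rules and the function names `any_of` / `all_of` are assumptions, not the documented semantics of `AnyOf` / `AllOf`.

```python
from enum import Enum
from typing import Iterable


class AllowlistDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ABSTAIN = "abstain"


def any_of(decisions: Iterable[AllowlistDecision]) -> AllowlistDecision:
    """Assumed rule: one ALLOW wins, then DENY, otherwise stay undecided."""
    seen = set(decisions)
    if AllowlistDecision.ALLOW in seen:
        return AllowlistDecision.ALLOW
    if AllowlistDecision.DENY in seen:
        return AllowlistDecision.DENY
    return AllowlistDecision.ABSTAIN


def all_of(decisions: Iterable[AllowlistDecision]) -> AllowlistDecision:
    """Assumed rule: one DENY wins; any ABSTAIN keeps the result undecided."""
    seen = set(decisions)
    if AllowlistDecision.DENY in seen:
        return AllowlistDecision.DENY
    if AllowlistDecision.ABSTAIN in seen or not seen:
        return AllowlistDecision.ABSTAIN
    return AllowlistDecision.ALLOW
```

Keeping ABSTAIN as a distinct result lets the caller decide later whether it maps to a link challenge or a denial, rather than baking that choice into the combinator.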
    
    Ships in two waves: the Protocol + `NativeIdAllowlist` + config
    validator land with the next core PR ahead of the linker; the full
    pipeline + `LinkedClaimAllowlist` enforcement land with the
    `IdentityLinker` core PR.
    
    Updates: ADR 0026 (summary bullet + conceptual-API table row + resolved
    Q16), spec 002 (new req #22, renumbered v1 fast-follow #23..#29 and
    stretch #30..#31, new "Authorization profiles and the IdentityAllowlist
    seam" subsection, inbound-ownership row, resolved Q32, follow-up entry).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): add DurableTaskRunner seam + runtime_mode auto-detect
    
    Introduces the explicit long-running vs ephemeral runtime distinction
    and a generic DurableTaskRunner Protocol that owns non-originating
    push dispatch — collapsing the previous deliveries[] per-destination
    state machine, SupportsDeliveryTracking provider capability, and
    Foundry update_item service ask down to a single immutable
    intended_targets[] write on the message.
    
    Spec / ADR:
    - New §"Runtime modes" with auto-detect markers + defaults matrix.
    - Rewrites §"Delivery tracking" → §"Intended targets + durable
      delivery": intent-only on the message, operational state lives in
      the runner.
    - New §"Durable task runner" defining DurableTaskRunner / RetryPolicy
      / TaskHandle / TaskStatus.
    - Drops §SupportsDeliveryTracking and §Foundry update_item gap.
    - Resolved Qs: 12, 18, 21, 26 revised; new 17/18/19 (ADR) and
      33/34/35 (spec).
    
    Code:
    - New _runner.py with InProcessTaskRunner (asyncio + bounded retry,
      bounded terminal-status cache, register-after-start guard,
      shutdown drain).
    - _host.py: runtime_mode + durable_task_runner ctor params;
      auto-detect via FOUNDRY_HOSTING_ENVIRONMENT /
      AZURE_FUNCTIONS_ENVIRONMENT / AWS_LAMBDA_FUNCTION_NAME;
      HOSTING_PUSH_TASK_NAME handler registered eagerly so
      _deliver_response can be called outside the lifespan;
      _handle_push_task does echo-then-response inline per destination;
      _deliver_response now schedules one task per destination via the
      runner (DeliveryReport.pushed = scheduled; .failed = schedule-time
      outage only).
    - _types.py: new DurableTaskRunner Protocol + RetryPolicy /
      TaskHandle / TaskStatus; DeliveryReport drops echoed /
      echo_failed (echo outcome owned by the runner).
    - __init__.py exports the new public surface.
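
The "asyncio + bounded retry" behaviour of the in-process runner might be approximated as follows. The `RetryPolicy` field names, the exponential backoff, and the `run_with_retry` function are assumptions; the commit message only names the types.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class RetryPolicy:
    # Field names are hypothetical.
    max_attempts: int = 3
    base_delay_seconds: float = 0.0


async def run_with_retry(
    handler: Callable[[], Awaitable[None]], policy: RetryPolicy
) -> int:
    """Return the number of attempts used; re-raise once the budget is spent."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            await handler()
            return attempt
        except Exception:
            if attempt == policy.max_attempts:
                raise
            # Exponential backoff between attempts (an assumed strategy).
            await asyncio.sleep(policy.base_delay_seconds * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```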
    
    Tests: 132 passing, 90% coverage. New test_runner.py covers
    InProcessTaskRunner success/retry/terminal-failure/cancellation/
    register-after-start, runtime-mode auto-detect with synthetic env,
    and the warning-on-ephemeral-without-runner path. test_host.py
    delivery tests use a sync runner fake for deterministic assertions
    and validate the new "schedule succeeded vs runner backend
    unreachable" semantics.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): rubber-duck round-5 — strict ephemeral, codec seam, allowlist Wave-1, drop DeliveryReport
    
    Adopts the rubber-duck-approved package of changes from the round-5
    review of PR #5638 (modulo DeliveryReport.failed — the value type is
    removed entirely now that durable delivery covers the failure
    surface, per user direction).
    
    Code:
    - Drop DeliveryReport value type; host-internal _deliver_response
      returns bool. Failure observability is now logs (in-process) /
      runner backend (durable adapters).
    - Strict ephemeral default: ephemeral runtime_mode with the default
      in-process runner raises RuntimeError; opt-in via
      allow_in_process_runner=True (warns).
    - ChannelPushCodec Protocol + DurableTaskPayloadMode enum +
      _validate_runner_codec_pairing so JSON-mode runners can be safely
      paired with channels via codecs; _handle_push_task accepts both
      object- and JSON-envelope shapes.
    - ResponseTarget.identity(...) / .identities([...]) builders +
      IDENTITIES kind for explicit caller-supplied recipients; field
      rename identities → _target_identities (private) with a
      target_identities property to resolve the classmethod collision.
    - Intent-only audit: _annotate_intended_targets writes
      hosting.intended_targets / skipped_targets / includes_originating /
      originating_channel onto assistant messages — single immutable
      write per the runner-owned operational-state model.
    - InProcessTaskRunner: 2-phase drain on shutdown
      (shutdown_grace_seconds, default 5.0) so a clean shutdown does not
      abandon work mid-retry; payload_mode = OBJECT class-level.
    - Echo idempotency: _handle_push_task tracks an echo_done cursor on
      runner-owned task state so a retry that fires after the echo
      phase succeeded does not double-echo.
    
    Wave-1 authorization seam (full landing):
    - New _authorization.py with AllowlistDecision tri-state,
      AuthorizationContext, IdentityAllowlist Protocol, AllowAll /
      NativeIdAllowlist (with async loader cache + channel-scope ABSTAIN) /
      LinkedClaimAllowlist (raise-until-Wave-2) / AnyOfAllowlists /
      AllOfAllowlists / CallableAllowlist built-ins, Allowed /
      LinkRequired / Denied outcomes, ChannelConfigurationError.
    - Host(default_allowlist=..., identity_linker=...) + per-channel
      allowlist parameter with 'inherit' / None semantics.
    - _validate_channel_authorization enforces all three rules at
      construction: claim-source requirement, linker presence for
      require_link=True (elevated from no-op — must not ship
      unenforced), and NativeIdAllowlist(channel=...) typo detection.
      Combinator-walking via _flatten_allowlists catches nested
      misconfigs.
    - host.authorize(...) for the native-id pipeline: open path returns
      Allowed with auto-issued <channel>:<native_id> isolation key (or
      the existing key when the identity has been seen); ABSTAIN on a
      claim-required allowlist maps to
      Denied(reason_code='allowlist_requires_link') until Wave 2 wires
      the linker to convert it to LinkRequired.
    
    Spec / ADR:
    - docs/specs/002-python-hosting-channels.md: Wave-1 status updated
      to reflect the linker-presence rule elevation and the
      host.authorize landing; new sub-sections (codec contract, drain,
      echo cursor); Qs 18 / 21 DeliveryReport references purged; new
      resolved Qs 36–40 covering the strict-ephemeral default, codec
      contract, DeliveryReport removal, echo cursor, and drain.
    - docs/decisions/0026-hosting-channels.md: Q12 DeliveryReport
      reference purged; Q16 updated to reflect Wave-1 landing; new
      resolved Qs 20 (codec contract) + 21 (strict ephemeral / drain /
      echo cursor).
    
    Tests:
    - New tests/test_authorization.py (35 cases) covering every Wave-1
      built-in, the three validator rules, combinator decision
      semantics, and host.authorize across open / allow / deny /
      abstain-with-claim-dep / abstain-without-claim-dep paths plus
      existing-key reuse and verified-claims propagation.
    - tests/test_host.py: TestDeliverResponse rewritten for the bool
      return + runner.scheduled-count assertions; new tests for
      IDENTITIES variant + echo idempotency.
    - tests/test_runner.py: strict-ephemeral now expects RuntimeError;
      allow_in_process_runner opt-in tests; shutdown drain test;
      payload_mode default test.
    - tests/test_types.py: TestDeliveryReport removed; new
      TestDurableTaskPayloadMode + TestResponseTargetIdentities.
    
    Validation: 178 tests pass, 91% coverage, fmt + lint + pyright +
    mypy clean.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(hosting): add mermaid flow diagrams to ADR, spec, README
    
    Insert the 10 hosting flow diagrams reviewed in
    python/.user/hosting-diagrams.md into the public docs:
    
    - README: runtime topology (1a) + cross-link to the spec for the
      richer set.
    - ADR: runtime topology, channel contribution shape, and authorization
      decision (1a, 1b, 3) at the end of 'Conceptual API shape'.
    - Spec: all 10 diagrams — 1a/1b at the top of API Surface, 2 in
      Canonical flow, 3 in Authorization profiles, 4-7 in Scenarios 6-8,
      8 in Codec contract, 9 in Echo idempotency, 10 in Scenario 9.
    
    Doc-only; no API or behaviour change.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): add opt-in disk persistence via state_dir
    
    Long-running hosts (always-on container, single-VM bot, local dev) lose
    state on every restart today. Add an opt-in disk persistence layer under
    a new `state_dir` constructor parameter on `AgentFrameworkHost` that
    survives process restarts without taking on a heavyweight database
    dependency.
    
    Backed by `diskcache` (installed via the new `[disk]` optional extra).
    An OS-level advisory file lock guarantees single-owner semantics so two
    hosts pointed at the same directory cannot double-execute scheduled
    pushes.
    
    What persists when `state_dir` is set:
    
    - Pending durable-task records — scheduled-but-not-yet-completed pushes
      replay on the next host startup via `InProcessTaskRunner.resume()`.
      Records that crashed mid-attempt resume with the already-consumed
      retry budget (no full-budget re-grant).
    - `_session_aliases` — per-isolation-key session-id rewrites.
    - `_active` — most-recently-active channel per isolation key.
    - `_identities` — `ChannelIdentity` rows for fan-out targeting,
      including nested mutations of the form
      `self._identities[ik][channel] = identity`.
    
    The `state_dir` parameter accepts any of:
    
    - `None` — today's purely in-memory behaviour.
    - `str` / `PathLike` — single root; host auto-creates `runner/` and
      `sessions/` subfolders.
    - `HostStatePaths` TypedDict / plain mapping — per-component overrides
      routed to different roots. Unknown keys raise `ValueError` to surface
      typos early.
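
The three accepted shapes could be normalised along these lines. This sketch covers only the `runner` and `sessions` keys named in this commit, does not create directories, and uses a hypothetical function name.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Union

_KNOWN_KEYS = ("runner", "sessions")

StateDir = Union[None, str, "os.PathLike[str]", Mapping[str, str]]


def normalise_state_dir(state_dir: StateDir) -> Dict[str, Path]:
    """Map each component name to its directory; empty dict means in-memory."""
    if state_dir is None:
        return {}
    if isinstance(state_dir, (str, os.PathLike)):
        # Single root: derive one subfolder per component.
        root = Path(state_dir)
        return {key: root / key for key in _KNOWN_KEYS}
    # Mapping form: reject unknown keys so typos surface early.
    unknown = sorted(set(state_dir) - set(_KNOWN_KEYS))
    if unknown:
        raise ValueError(f"unknown state_dir keys: {unknown}")
    return {key: Path(value) for key, value in state_dir.items()}
```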
    
    Unpicklable push payloads raise `PushPayloadNotPicklable` eagerly from
    `schedule()` so issues surface at the call site rather than on the
    next restart. Corrupt on-disk records are quarantined-and-logged; the
    runner never crashes on resume.
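
The eager check can be illustrated with a small guard that serialises the payload at schedule time. The exception name comes from the commit message; its base class, the `ensure_picklable` helper, and the use of `pickle.dumps` directly are assumptions.

```python
import pickle
from typing import Any


class PushPayloadNotPicklable(TypeError):
    """Base class chosen for this sketch; the real hierarchy may differ."""


def ensure_picklable(payload: Any) -> bytes:
    """Serialise now so a bad payload fails at the call site, not on restart."""
    try:
        return pickle.dumps(payload)
    except Exception as exc:
        raise PushPayloadNotPicklable(
            f"push payload of type {type(payload).__name__} cannot be pickled"
        ) from exc
```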
    
    Live `AgentSession` objects stay in memory and are rehydrated lazily
    by the history provider on the next turn.
    
    - New modules: `_persistence.py` (lock + normalisation),
      `_state_store.py` (session-bookkeeping store).
    - Runner rewrite: 4-state model (`pending` / `succeeded` / `failed`
      / `cancelled`); the transient `running` state was a bug that caused
      resume to skip records that crashed mid-handler.
    - New tests: `test_runner_disk.py` (8 tests), `test_host_disk.py` (8
      tests). 194 passed total. pyright + mypy + ruff clean.
    - README: new "Optional disk persistence" section with code samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): add checkpoints to state_dir + fix host docstring
    
    Three related polish changes on top of the disk-persistence landing:
    
    1. Extend `state_dir` to cover workflow checkpoints. Adds
       `checkpoints` as a third `HostStatePaths` key. Single-path form
       (`state_dir="/foo"`) now also auto-derives `/foo/checkpoints/`
       for workflow targets (equivalent to passing
       `checkpoint_location="/foo/checkpoints"`). The mapping form lets
       workflow callers opt out by omitting the key, or route checkpoints
       to a different volume.
    
       Conflict / precedence rules:
       * Explicit `checkpoint_location` always wins over the
         `state_dir`-derived path; a warning surfaces the double-config.
       * Single-path `state_dir` + non-Workflow target → checkpoints path
         silently ignored (no eager directory creation either).
       * Mapping form with `checkpoints` + non-Workflow target → warn
         (almost certainly dead config).
       * Derived path with a workflow that already has its own
         `checkpoint_storage` → same `RuntimeError` as the explicit
         parameter triggers, so ownership stays unambiguous.
    
       Checkpoint persistence uses `FileCheckpointStorage` from the
       framework core — no extra dependency. Only `runner` and
       `sessions` require the `[disk]` extra.
    
    2. Move `AgentFrameworkHost.__init__` parameter docs from `Args:` to
       `Keyword Args:` for every parameter after the `*`. Only `target`
       remains under `Args:`. Brings the docstring in line with the
       actual signature (the params have always been keyword-only).
    
    3. `HostStatePaths` already existed as a TypedDict but did not cover
       `checkpoints`; updated to document the new key with the same
       per-attribute docstring style as `runner` / `sessions` so editors
       can surface help on the keys.
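    
    The checkpoint conflict / precedence rules from item 1 can be
    sketched as a single decision function. All names here
    (`resolve_checkpoint_location` and its keyword arguments) are
    hypothetical. The case of an explicit `checkpoint_location` with a
    non-Workflow target is not covered by the rules above, so the sketch
    passes it through unchanged as an assumption.

```python
import warnings


def resolve_checkpoint_location(
    *,
    explicit=None,
    derived=None,
    derived_from_mapping=False,
    target_is_workflow=True,
    workflow_has_own_storage=False,
):
    """Pick the effective checkpoint path, or None when there is none."""
    if not target_is_workflow:
        if derived is not None and derived_from_mapping:
            # Mapping form with `checkpoints` on a non-workflow target.
            warnings.warn("state_dir 'checkpoints' set for a non-workflow target")
        # Single-path form: the derived path is dropped without a warning.
        return explicit  # assumption: not specified in the commit text
    if explicit is not None and derived is not None:
        warnings.warn("checkpoint_location overrides the state_dir-derived path")
    chosen = explicit if explicit is not None else derived
    if chosen is not None and workflow_has_own_storage:
        # Same error for explicit and derived paths keeps ownership unambiguous.
        raise RuntimeError("workflow already has its own checkpoint_storage")
    return chosen
```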
    
    Validation: 201 tests pass (was 194; +7 checkpoint integration tests
    in test_host_disk.py). pyright + mypy + ruff + bandit clean.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): add core IdentityLinker authorization seam
    
    Fold the core IdentityLinker pieces into the hosting-core PR so the
    authorization surface no longer has a deferred Wave-2 placeholder.
    Provider-specific linkers (for example Entra OAuth helpers) can now plug
    into core without core depending on an IdP SDK.
    
    Core additions:
    - Add LinkChallenge, LinkedIdentity, LinkResolution, and IdentityLinker.
      IdentityLinker.resolve(identity) is a single-call decision that returns
      either a linked identity with verified claims or a challenge the channel
      can render.
    - Enable LinkedClaimAllowlist end-to-end. It now abstains pre-link and
      allows/denies post-link against verified claims, including multi-valued
      claims such as groups.
    - Add AuthPolicy factories for common allowlist shapes.
    - Extend Allowed with verified_claims and claim_source for audit/telemetry
      without requiring callers to re-derive how the decision was made.
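    
    The pre-link / post-link behaviour of a claim allowlist can be
    illustrated with a small standalone function. This is not the
    LinkedClaimAllowlist implementation: `check_claim_allowlist` and its
    string results are hypothetical, "pre-link" is modelled as
    `verified_claims is None`, and a multi-valued claim is assumed to
    match when any of its values is allowed.

```python
def check_claim_allowlist(verified_claims, claim, allowed):
    """Return "abstain", "allow", or "deny" for one claim-based rule."""
    if verified_claims is None:
        # No verified claims yet (identity not linked): make no decision.
        return "abstain"
    value = verified_claims.get(claim)
    if value is None:
        return "deny"
    # Treat a single string and a collection (e.g. groups) uniformly.
    values = {value} if isinstance(value, str) else set(value)
    return "allow" if values & set(allowed) else "deny"
```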
    
    Host behavior:
    - identity_linker is now typed as IdentityLinker | None.
    - authorize() supports open, native-id, forced-link, and linked-claim
      profiles end-to-end.
    - require_link=True resolves via the linker and returns LinkRequired when
      the identity is not linked.
    - claim-based allowlists use channel-emitted verified_claims when present,
      or linker-resolved claims otherwise.
    - authorize() remains decision-only and does not mutate _identities/_active;
      identity registry writes remain on the actual request execution path.
    
    Docs/tests:
    - Remove Wave-1/Wave-2 language from core/spec/ADR surfaces touched here.
    - Update the spec/ADR to describe the core linker seam and provider-specific
      linker packages.
    - Add authorization tests for linker challenges, linked identities, linked
      claim allowlists, channel-emitted claims, AuthPolicy factories, and the
      no-mutation contract.
    
    Validation: 214 tests pass, pyright/mypy/ruff clean.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(hosting): add link-store path to state_dir
    
    Identity linking introduces host-adjacent state that needs the same
    state_dir treatment as runner, session, and checkpoint state. Add a
    links component to the host state paths so applications and linker
    packages have a typed, discoverable persistence location.
    
    Changes:
    - Extend HostStatePaths with links and include it in state_dir
      normalization (state_dir/links/ for the single-path form).
    - Add SupportsLinkStorePath, an optional protocol for identity linkers
      that accept a host-provided link-store path.
    - AgentFrameworkHost now offers state_dir links to compatible linkers.
      It warns when an explicit links path is supplied without a linker,
      and when the configured linker manages persistence directly instead
      of implementing SupportsLinkStorePath.
    - Update README and spec text to document the link-store component and
      clarify that concrete linkers still own the storage format.
    - Add disk-state tests for compatible, missing, and non-configurable
      linkers.
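    
    The "optional protocol" pattern behind SupportsLinkStorePath can be
    sketched with a runtime-checkable `typing.Protocol`. The method name
    `set_link_store_path` and the helper `offer_links_path` are
    hypothetical; the real protocol's members are not listed in this
    message. The sketch also assumes the second warning fires for any
    links path, explicit or derived.

```python
import warnings
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsLinkStorePath(Protocol):
    # Hypothetical member; the real protocol may expose a different name.
    def set_link_store_path(self, path: Path) -> None: ...


def offer_links_path(linker, links_path, *, explicit=False):
    """Hand the links path to a compatible linker; return True if accepted."""
    if links_path is None:
        return False
    if linker is None:
        if explicit:
            warnings.warn("links path supplied but no identity linker configured")
        return False
    if isinstance(linker, SupportsLinkStorePath):
        linker.set_link_store_path(Path(links_path))
        return True
    warnings.warn("identity linker manages its own persistence; links path unused")
    return False
```

    A linker opts in by defining the method; no inheritance from the
    protocol class is needed for the `isinstance` check to pass.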
    
    Validation: 217 tests pass, pyright/mypy/ruff clean.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>