agent-framework/docs/decisions/0026-hosted-session-identity-context.md
Roger Barreto ad95f2f2fa .NET: Add Hosted-MemoryAgent sample with isolation key plumbing (#5692) (#5702)
* .NET: Add Hosted-MemoryAgent sample with isolation key plumbing (#5692)

Adds HostedSessionContext + HostedSessionIsolationKeyProvider in Microsoft.Agents.AI.Foundry.Hosting so AIContextProviders (notably FoundryMemoryProvider) can scope per user via the platform's x-agent-user-isolation-key / x-agent-chat-isolation-key headers.

- New types: HostedSessionContext (sealed), HostedSessionContextExtensions (public Get, internal Set), abstract HostedSessionIsolationKeyProvider (async), internal PlatformHostedSessionIsolationKeyProvider mapping ResponseContext.Isolation.

- AgentFrameworkResponseHandler now resolves the provider, tags fresh sessions, and validates resumed sessions against the live request (strict 403 'Hosted session identity context mismatch' on any mismatch; 500 on null keys).

- New shared sample project Hosted_Shared_Contributor_Setup hosts DevTemporaryTokenCredential and DevTemporaryLocalSessionIsolationKeyProvider plus AddDevTemporaryLocalContributorSetup. All 9 existing responses samples migrated to consume it so local runs keep working under the strict isolation contract.

- New Hosted-MemoryAgent sample: travel assistant wired through FoundryMemoryProvider with stateInitializer reading session.GetHostedContext().UserId. Includes Dockerfile, smoke.ps1, agent.yaml/manifest.

- New IT scenario 'memory' in Foundry.Hosting.IntegrationTests + MemoryHostedAgentFixture + MemoryHostedAgentTests. Verified end to end against the tao Foundry project.

- ADR 0026 captures the design tree.

* Address PR review feedback

- Dockerfile: add header noting it targets NuGet builds; contributors must use Dockerfile.contributor for ProjectReference source builds.

- PlatformHostedSessionIsolationKeyProvider: doc said 'returns context with empty values'; corrected to 'returns null' which the handler treats as 500.

- FakeHostedSessionIsolationKeyProvider: doc clarifies that null configurations are allowed for testing the handler error path.

- HostedSessionContextExtensions.SetHostedContext: enforce write-once with InvalidOperationException; doc + xml exception updated.

- AgentFrameworkResponseHandler: cache PlatformHostedSessionIsolationKeyProvider as static readonly to avoid per-request allocation.

- MemoryHostedAgentTests: tighten waits from 20s to 5s (FoundryMemoryProvider defaults UpdateDelay=0; ingestion ~3s).

- Sample Program.cs imports reordered to satisfy IDE0005.

* Add HostedFoundryMemoryProviderScopes built-in helpers (#5692)

Addresses review feedback from @lokitoth on Hosted-MemoryAgent/Program.cs:54.

- New HostedFoundryMemoryProviderScopes static class with PerUser, PerChat, PerUserAndChat factories returning Func<AgentSession?, FoundryMemoryProvider.State>.

- All helpers throw InvalidOperationException when GetHostedContext() is null, with a message pointing at writing a custom stateInitializer for non-hosted scenarios.

- New HostedFoundryMemoryScope enum and AddHostedFoundryMemoryProvider DI extension (two overloads: explicit AIProjectClient and DI-resolved). Singleton lifetime. Default scope = PerUser.

- Hosted-MemoryAgent sample and the memory IT scenario container both swap their inline lambdas for HostedFoundryMemoryProviderScopes.PerUser().

- 14 new unit tests (241/241 hosting unit tests pass).

* Replace HostedFoundryMemoryScope enum with Func<...> parameter (#5692)

Address PR review feedback from @westey-m: enums are a breaking-change hazard when extended, and the enum was redundant with the existing HostedFoundryMemoryProviderScopes static class.

- Delete HostedFoundryMemoryScope.cs.

- AddHostedFoundryMemoryProvider DI extensions now take Func<AgentSession?, FoundryMemoryProvider.State>? stateInitializer = null. When null, default to HostedFoundryMemoryProviderScopes.PerUser().

- Callers pick a built-in helper (PerUser/PerChat/PerUserAndChat) or pass a custom delegate. New built-ins are a single static method addition with zero impact on existing callers.

- Tests updated; 244/244 hosting unit tests pass.

* Fix isolation context resume for externally-created conversations (#5692)

Branch on the session's existing hosted-context (not on conversation_id presence) so a conversation provisioned externally (e.g. via conversations.CreateProjectConversationAsync) is treated as fresh on first hosted-agent request and stamped, rather than rejected with 403 hosted_session_identity_mismatch. Strict equality is preserved on real resume of an already-stamped session.

Also tighten dotnet/global.json to version 10.0.204 + rollForward latestPatch so local builds match the CI Docker image SDK and avoid 10.0.300 dotnet format stripping required usings.

* Revert global.json SDK pin to upstream (#5692)

The 10.0.204 + latestPatch pin from the previous commit broke the dotnet-format CI job (hostfxr_resolve_sdk2 could not find a compatible SDK in the mcr.microsoft.com/dotnet/sdk:10.0 image). Restore upstream 10.0.200 + minor; local Release builds with SDK 10.0.300 should set GITHUB_ACTIONS=true to bypass the auto-format-on-build target.
2026-05-15 05:42:12 +00:00


| status | contact | date | deciders | consulted | informed |
|---|---|---|---|---|---|
| accepted | rogerbarreto | 2026-05-07 | rogerbarreto | | |

Hosted session identity context for Foundry Hosting

Context and Problem Statement

Server-hosted Foundry agents need a way to scope per-user state (most notably FoundryMemoryProvider memories) by the end user that initiated the request. The Foundry platform already injects x-agent-user-isolation-key and x-agent-chat-isolation-key headers on every Responses request, but the agent-framework hosting layer did not surface those values to AIContextProvider instances. The provider's stateInitializer only received an AgentSession? with no identity attached, so per-user scoping was impossible without out-of-band plumbing.

Decision Drivers

  • Memory and any future user-private context must be partitioned per end user without per-sample boilerplate.
  • The identity must be read-only from the perspective of AIContextProviders, so a buggy or hostile provider cannot escalate or leak across users.
  • The persisted session must validate against the live request on every resume to defend against session-id leak and in-process tampering.
  • The change must work for every existing hosted-agent type (ChatClientAgent, FoundryAgent, future ones) without per-type refactoring of cast-heavy code paths in Microsoft.Agents.AI.
  • Local Docker debugging must remain possible when the platform headers are absent.

Considered Options

  1. HostedSessionContext stored in AgentSessionStateBag, exposed via a public read accessor and an internal setter. Hosting writes once on session creation and validates on every resume.
  2. Specialised HostedAgentSession : AgentSession wrapper that carries UserId/ChatId properties, with GetService<ChatClientAgentSession>() as the unwrap escape hatch.
  3. New property on AgentSession base class (HostedSessionContext? HostedContext { get; internal set; }).
  4. AsyncLocal middleware that reads the headers and stuffs them into a per-request AsyncLocal<HostedSessionContext> consumed by the provider.

For the source of identity:

  • A. The platform-injected IsolationContext exposed by ResponseContext.Isolation (typed UserIsolationKey/ChatIsolationKey).
  • B. The OpenAI Responses spec's top-level request.User field.
  • C. A custom HTTP header x-client-user.

Decision Outcome

Option 1 was chosen for the storage shape, sourced from Option A (ResponseContext.Isolation).

Rationale:

  • Wrapper rejected (Option 2). ChatClientAgentSession is sealed, and ChatClientAgent rejects any other session type via direct `is not ChatClientAgentSession` checks at multiple call sites. Wrapping would force non-trivial refactors across Microsoft.Agents.AI and a corresponding repeat for every other agent type.
  • Base-class property rejected (Option 3). Leaks "hosted" semantics into the universal AgentSession abstraction used by Durable, A2A, and CopilotStudio agents that have no notion of a hosted user.
  • AsyncLocal rejected (Option 4). Surfaces the concept only locally, requires every consumer to re-implement the bridge, and cannot be enforced as read-only.
  • request.User rejected (Option B). Set by the caller, not the platform. Forging it client-side trivially defeats per-user partitioning.
  • x-client-user rejected (Option C). Non-standard, requires custom HTTP plumbing, and duplicates the platform-provided isolation contract.

Implementation summary in Microsoft.Agents.AI.Foundry.Hosting:

| Type | Visibility | Purpose |
|---|---|---|
| HostedSessionContext | public sealed | Captures UserId and ChatId (both required, non-whitespace). |
| HostedSessionContextExtensions.GetHostedContext | public | Read accessor for AIContextProviders. |
| HostedSessionContextExtensions.SetHostedContext | internal | Writer reserved for the hosting assembly. Backed by AgentSessionStateBag under a well-known key for serialisation. |
| HostedSessionIsolationKeyProvider (abstract) | public | DI-resolvable factory. Async signature: `ValueTask<HostedSessionContext?> GetKeysAsync(ResponseContext, CreateResponse, CancellationToken)`. |
| PlatformHostedSessionIsolationKeyProvider | internal sealed | Default implementation. Maps context.Isolation.UserIsolationKey and context.Isolation.ChatIsolationKey. Returns null when either is absent. |
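
The storage shape in the table can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the shipped code: SketchSession, its Dictionary-backed state bag, and the key string are hypothetical stand-ins for AgentSession and AgentSessionStateBag, and the use of ArgumentException for invalid keys is an assumption (the ADR only says the constructor rejects them). The write-once InvalidOperationException follows the PR review note in the commit message.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Stand-in for the real type: both keys are required and non-whitespace.
public sealed class HostedSessionContext
{
    public HostedSessionContext(string userId, string chatId)
    {
        // Exception type is an assumption; the ADR only says the constructor rejects these.
        if (string.IsNullOrWhiteSpace(userId))
            throw new ArgumentException("UserId is required.", nameof(userId));
        if (string.IsNullOrWhiteSpace(chatId))
            throw new ArgumentException("ChatId is required.", nameof(chatId));
        UserId = userId;
        ChatId = chatId;
    }

    public string UserId { get; }
    public string ChatId { get; }
}

// Hypothetical stand-in for AgentSession and its state bag.
public sealed class SketchSession
{
    public Dictionary<string, object> StateBag { get; } = new Dictionary<string, object>();
}

public static class SketchHostedContextExtensions
{
    // Hypothetical key; the real well-known key is not given in the ADR.
    private const string Key = "sketch.hostedSessionContext";

    // Public read accessor: providers can read the identity but not change it.
    public static HostedSessionContext? GetHostedContext(this SketchSession session)
        => session.StateBag.TryGetValue(Key, out var value) ? value as HostedSessionContext : null;

    // Internal writer: only the hosting assembly may stamp, and only once.
    internal static void SetHostedContext(this SketchSession session, HostedSessionContext context)
    {
        if (session.StateBag.ContainsKey(Key))
            throw new InvalidOperationException("Hosted session context is already set.");
        session.StateBag[Key] = context;
    }
}
```

Splitting the accessor visibility this way is what keeps the identity read-only for AIContextProviders while still letting the hosting layer stamp a fresh session.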

Behaviour added to AgentFrameworkResponseHandler.CreateAsync:

  1. Resolve HostedSessionIsolationKeyProvider from DI; fall back to PlatformHostedSessionIsolationKeyProvider.
  2. Call GetKeysAsync(context, request, cancellationToken). If the result is null, the handler throws InvalidOperationException, which surfaces as a 500. A null or whitespace UserId or ChatId is rejected earlier, by HostedSessionContext's constructor.
  3. Branch on the session's existing context, not on whether a conversation_id was supplied:
    • No session (session is null): nothing to stamp; skip.
    • Session present but un-stamped (GetHostedContext() is null): treat as fresh. This covers both newly-created sessions and pre-existing sessions whose conversation_id was provisioned externally (e.g. via conversations.CreateProjectConversationAsync()) before the first hosted-agent request. Stamp the resolved identity now.
    • Session present with stamped context: strict resume. The persisted UserId and ChatId must equal the resolved values exactly. Mismatch throws ResponsesApiException with status 403 and body Hosted session identity context mismatch.
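
The three-way branch above can be sketched as a single method. This is an illustrative sketch, not the handler's actual code: SketchSession and SketchApiException are hypothetical stand-ins for AgentSession and ResponsesApiException (whose real constructor is not reproduced here), and ordinal string comparison is an assumption about what "equal exactly" means.

```csharp
#nullable enable
using System;

public sealed class HostedSessionContext
{
    public HostedSessionContext(string userId, string chatId) { UserId = userId; ChatId = chatId; }
    public string UserId { get; }
    public string ChatId { get; }
}

// Hypothetical stand-in for AgentSession; the real context lives in the state bag.
public sealed class SketchSession
{
    public HostedSessionContext? HostedContext { get; set; }
}

// Hypothetical stand-in for ResponsesApiException.
public sealed class SketchApiException : Exception
{
    public SketchApiException(int statusCode, string message) : base(message) => StatusCode = statusCode;
    public int StatusCode { get; }
}

public static class SketchHandler
{
    public static void ApplyIdentity(SketchSession? session, HostedSessionContext? resolved)
    {
        // Step 2: a null result from the key provider is a server-side error (500).
        if (resolved is null)
            throw new InvalidOperationException("No hosted session isolation keys were resolved.");

        // Step 3a: no session, nothing to stamp.
        if (session is null)
            return;

        // Step 3b: un-stamped session (new or externally provisioned) is treated as fresh.
        var persisted = session.HostedContext;
        if (persisted is null)
        {
            session.HostedContext = resolved;
            return;
        }

        // Step 3c: strict resume; both keys must match the live request.
        bool same = string.Equals(persisted.UserId, resolved.UserId, StringComparison.Ordinal)
                 && string.Equals(persisted.ChatId, resolved.ChatId, StringComparison.Ordinal);
        if (!same)
            throw new SketchApiException(403, "Hosted session identity context mismatch");
    }
}
```

Note that the branch keys off the session's stored context rather than the presence of a conversation_id, which is what lets an externally provisioned conversation be stamped on its first hosted-agent request.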

Consequences

Positive:

  • Per-user memory partitioning works out of the box for any agent that consumes a Microsoft.Agents.AI.Foundry.FoundryMemoryProvider configured to read session.GetHostedContext().UserId.
  • Cross-user session-id leak and in-process tampering of the persisted identity both surface as a 403 with a deliberately uninformative body.
  • The identity is opaque to the framework, matching the platform's semantics. The framework never inspects user identity; the IsolationContext keys are pre-partitioned per agent.
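
As an illustration of the first point, a per-user stateInitializer might look like the sketch below. It is deliberately detached from the real API: SketchSession stands in for AgentSession, SketchMemoryState stands in for FoundryMemoryProvider.State (whose actual shape is not described in this ADR), and using the UserId directly as the scope string is an assumption. Throwing InvalidOperationException when no hosted context is present follows the behaviour described for the HostedFoundryMemoryProviderScopes helpers in the commit message.

```csharp
#nullable enable
using System;

public sealed class HostedSessionContext
{
    public HostedSessionContext(string userId, string chatId) { UserId = userId; ChatId = chatId; }
    public string UserId { get; }
    public string ChatId { get; }
}

// Hypothetical stand-in for AgentSession with its hosted context already stamped.
public sealed class SketchSession
{
    public HostedSessionContext? HostedContext { get; set; }
}

// Hypothetical stand-in for FoundryMemoryProvider.State.
public sealed class SketchMemoryState
{
    public SketchMemoryState(string scope) => Scope = scope;
    public string Scope { get; }
}

public static class SketchMemoryScopes
{
    // Returns a state initializer that partitions memories by the hosted user.
    public static Func<SketchSession?, SketchMemoryState> PerUser() =>
        session =>
        {
            var context = session?.HostedContext
                ?? throw new InvalidOperationException(
                    "No hosted session context is available; write a custom stateInitializer for non-hosted scenarios.");
            return new SketchMemoryState(context.UserId);
        };
}
```

Because the initializer only reads the stamped context, two sessions that belong to different users resolve to different scopes without any per-sample plumbing.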

Negative:

  • Every existing hosted sample fails locally without a HostedSessionIsolationKeyProvider registered, because the platform headers are absent outside the platform. Mitigated by shipping Hosted_Shared_Contributor_Setup with DevTemporaryLocalSessionIsolationKeyProvider and AddDevTemporaryLocalContributorSetup, and migrating all 9 existing responses samples.
  • An un-stamped session that an attacker plants under a victim's conversation_id before the victim's first hosted-agent request would be stamped with the attacker's identity on that first request. This is not a regression compared with behaviour without this contract, and it is mitigated in practice because the conversation_id namespace is allocated by the platform per project. Once a session is stamped, the strict equality check fully defends the resume path.
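
To make the local-development mitigation concrete, the sketch below contrasts a platform-style provider with a fixed-identity local one. All types here are hypothetical stand-ins: SketchIsolation models ResponseContext.Isolation, the abstract base takes it directly instead of the real (ResponseContext, CreateResponse, CancellationToken) parameters, and SketchFixedLocalProvider is only a guess at the kind of thing DevTemporaryLocalSessionIsolationKeyProvider does, not a description of it. Treating whitespace-only keys as absent is also an assumption.

```csharp
#nullable enable
using System.Threading;
using System.Threading.Tasks;

public sealed class HostedSessionContext
{
    public HostedSessionContext(string userId, string chatId) { UserId = userId; ChatId = chatId; }
    public string UserId { get; }
    public string ChatId { get; }
}

// Hypothetical stand-in for ResponseContext.Isolation.
public sealed class SketchIsolation
{
    public string? UserIsolationKey { get; set; }
    public string? ChatIsolationKey { get; set; }
}

// Simplified stand-in for the abstract HostedSessionIsolationKeyProvider.
public abstract class SketchIsolationKeyProvider
{
    public abstract ValueTask<HostedSessionContext?> GetKeysAsync(
        SketchIsolation? isolation, CancellationToken cancellationToken);
}

// Platform-style mapping: returns null when either key is absent.
public sealed class SketchPlatformProvider : SketchIsolationKeyProvider
{
    public override ValueTask<HostedSessionContext?> GetKeysAsync(
        SketchIsolation? isolation, CancellationToken cancellationToken)
    {
        var user = isolation?.UserIsolationKey;
        var chat = isolation?.ChatIsolationKey;
        HostedSessionContext? result =
            string.IsNullOrWhiteSpace(user) || string.IsNullOrWhiteSpace(chat)
                ? null
                : new HostedSessionContext(user!, chat!);
        return new ValueTask<HostedSessionContext?>(result);
    }
}

// Local-only provider: ignores the request and returns a fixed identity.
public sealed class SketchFixedLocalProvider : SketchIsolationKeyProvider
{
    private readonly HostedSessionContext _context;

    public SketchFixedLocalProvider(string userId, string chatId)
        => _context = new HostedSessionContext(userId, chatId);

    public override ValueTask<HostedSessionContext?> GetKeysAsync(
        SketchIsolation? isolation, CancellationToken cancellationToken)
        => new ValueTask<HostedSessionContext?>(_context);
}
```

Outside the platform the first provider always yields null, which the handler turns into a 500; registering something like the second provider is what keeps local Docker runs working.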

Out of scope

  • Per-request User field on CreateResponse is intentionally not consumed; only the platform IsolationContext headers carry trustworthy identity.
  • Generic (non-Foundry) hosting layers can re-define an equivalent type if needed; nothing in this ADR is moved into Microsoft.Agents.AI.Hosting because Microsoft.Agents.AI.Foundry.Hosting does not depend on it.
  • HMAC tamper signatures over the persisted context are not implemented; comparison against ResponseContext.Isolation on every request is sufficient because the platform sets those headers at the trust boundary.