Purview: Parallelize PSPC cold-cache scope refresh (#5832)

* Parallelize Purview PSPC cold cache path

* Cache Purview payment-required state for scope refresh

* Align Purview policy action dedupe and 402 caching

 Deduplicate combined policy actions by action and restriction action so restriction-only actions are preserved
without duplicating identical entries. Cache tenant-level payment-required state from background scope refresh so
subsequent calls short-circuit consistently.

* .NET: Implement best-effort caching for background job scope retrieval and add unit tests for cache write failures

* Purview - feat: Enhance ScopedContentProcessor to queue ContentActivityJob when no applicable scopes are found and update related tests

* docs: Update purview package README and AGENTS documentation to reflect caching optimizations and policy enforcement scenarios

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Taisir Hassan
2026-06-09 11:01:21 -07:00
committed by GitHub
parent 2a345e5d3b
commit 383d551b86
16 changed files with 917 additions and 228 deletions
@@ -3,7 +3,7 @@
This getting-started sample shows how to attach Microsoft Purview policy evaluation to an Agent Framework `Agent` using the **middleware** approach.
**What this sample demonstrates:**
1. Configure an Azure OpenAI chat client
1. Configure a Foundry chat client
2. Add Purview policy enforcement middleware (`PurviewPolicyMiddleware`)
3. Add Purview policy enforcement at the chat client level (`PurviewChatPolicyMiddleware`)
4. Implement a custom cache provider for advanced caching scenarios
@@ -17,8 +17,8 @@ This getting-started sample shows how to attach Microsoft Purview policy evaluat
| Variable | Required | Purpose |
|----------|----------|---------|
| `AZURE_OPENAI_ENDPOINT` | Yes | Azure OpenAI endpoint (https://<name>.openai.azure.com) |
| `AZURE_OPENAI_MODEL` | Optional | Model deployment name (defaults inside SDK if omitted) |
| `FOUNDRY_PROJECT_ENDPOINT` | Yes | Azure AI Foundry project endpoint, for example `https://<resource>.services.ai.azure.com/api/projects/<project>` |
| `FOUNDRY_MODEL` | Optional | Model deployment name (defaults to `gpt-4o-mini`) |
| `PURVIEW_CLIENT_APP_ID` | Yes* | Client (application) ID used for Purview authentication |
| `PURVIEW_USE_CERT_AUTH` | Optional (`true`/`false`) | Switch between certificate and interactive auth |
| `PURVIEW_TENANT_ID` | Yes (when cert auth on) | Tenant ID for certificate authentication |
@@ -31,7 +31,8 @@ This getting-started sample shows how to attach Microsoft Purview policy evaluat
Opens a browser on first run to sign in.
```powershell
$env:AZURE_OPENAI_ENDPOINT = "https://your-openai-instance.openai.azure.com"
$env:FOUNDRY_PROJECT_ENDPOINT = "https://<resource>.services.ai.azure.com/api/projects/<project>"
$env:FOUNDRY_MODEL = "gpt-4o-mini"
$env:PURVIEW_CLIENT_APP_ID = "00000000-0000-0000-0000-000000000000"
```
@@ -64,22 +65,27 @@ If interactive auth is used, a browser window will appear the first time.
## 4. How It Works
The sample demonstrates three different scenarios:
The sample demonstrates four integration scenarios. Each scenario runs the same three-message sequence via `run_policy_flow(...)`:
1. **good (cold cache)** - a benign prompt that exercises the cold-cache parallel ProtectionScopes warmup + foreground ProcessContent path.
2. **expected block** - a sensitive prompt containing the Visa test credit card number `4111 1111 1111 1111`. If the tenant has a DLP policy for `Microsoft 365 Copilot and AI apps` targeting the Credit Card sensitive info type with a Block action, this prompt returns the configured `blocked_prompt_message` (default: `Prompt blocked by policy`). If no DLP policy applies, the prompt is allowed (the LLM may still decline on its own, but that is a model-level response, not a Purview block).
3. **good (warm cache)** - a second benign prompt that exercises the warm-cache path. The custom cache provider scenario prints `Cache HIT` for the same protection-scopes key, confirming the cache and middleware state survive a prior block.
### A. Agent Middleware (`run_with_agent_middleware`)
1. Builds an Azure OpenAI chat client (using the environment endpoint / deployment)
1. Builds a Foundry chat client (using the environment project endpoint / deployment)
2. Chooses credential mode (certificate vs interactive)
3. Creates `PurviewPolicyMiddleware` with `PurviewSettings`
4. Injects middleware into the agent at construction
5. Sends two user messages sequentially
6. Prints results (or policy block messages)
5. Runs the three-message `good -> block -> good` orchestration
6. Prints `ALLOWED` or `BLOCKED` per message, plus the model response
7. Uses default caching automatically
### B. Chat Client Middleware (`run_with_chat_middleware`)
1. Creates a chat client with `PurviewChatPolicyMiddleware` attached directly
2. Policy evaluation happens at the chat client level rather than agent level
3. Demonstrates an alternative integration point for Purview policies
4. Uses default caching automatically
4. Runs the same `good -> block -> good` orchestration
5. Uses default caching automatically
### C. Custom Cache Provider (`run_with_custom_cache_provider`)
1. Implements the `CacheProvider` protocol with a custom class (`SimpleDictCacheProvider`)
@@ -88,9 +94,27 @@ The sample demonstrates three different scenarios:
- `async def get(self, key: str) -> Any | None`
- `async def set(self, key: str, value: Any, ttl_seconds: int | None = None) -> None`
- `async def remove(self, key: str) -> None`
4. Runs the `good -> block -> good` orchestration and prints `Cache MISS`/`Cache HIT` traces alongside policy outcomes, showing the cold-cache warmup populating the cache and warm-cache requests skipping ProtectionScopes.
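A cache provider with the three methods listed above might look like the sketch below. The class name `TracingDictCacheProvider`, the `trace` list, the key `scopes:user-1`, and the expiry handling are assumptions made for illustration; the sample's own `SimpleDictCacheProvider` may differ.

```python
from __future__ import annotations

import asyncio
import time
from typing import Any


class TracingDictCacheProvider:
    """Hypothetical dict-backed cache exposing async get/set/remove."""

    def __init__(self) -> None:
        # Maps key -> (value, absolute expiry on the monotonic clock, or None).
        self._entries: dict[str, tuple[Any, float | None]] = {}
        self.trace: list[str] = []

    async def get(self, key: str) -> Any | None:
        entry = self._entries.get(key)
        if entry is not None:
            value, expires_at = entry
            if expires_at is None or time.monotonic() < expires_at:
                self.trace.append(f"Cache HIT: {key}")
                return value
            # Entry has expired: drop it and fall through to a miss.
            del self._entries[key]
        self.trace.append(f"Cache MISS: {key}")
        return None

    async def set(self, key: str, value: Any, ttl_seconds: int | None = None) -> None:
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._entries[key] = (value, expires_at)

    async def remove(self, key: str) -> None:
        self._entries.pop(key, None)


async def demo() -> list[str]:
    """Exercise a miss, a hit after set, and a miss after remove."""
    cache = TracingDictCacheProvider()
    await cache.get("scopes:user-1")
    await cache.set("scopes:user-1", {"scopes": []}, ttl_seconds=60)
    await cache.get("scopes:user-1")
    await cache.remove("scopes:user-1")
    await cache.get("scopes:user-1")
    return cache.trace
```

Recording the trace in a list rather than printing directly keeps the sketch easy to inspect; a real provider would more likely log.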
### D. Default Cache (`run_with_default_cache`)
1. Same as the agent middleware path but with explicit cache TTL and size limits in `PurviewSettings`
2. Uses the default in-memory `CacheProvider`
3. Runs the `good -> block -> good` orchestration
**Policy Behavior:**
Prompt blocks set a system-level message: `Prompt blocked by policy` and terminate the run early. Response blocks rewrite the output to `Response blocked by policy`.
Prompt blocks substitute the configured `blocked_prompt_message` (default `Prompt blocked by policy`) and terminate the agent run early. Response blocks substitute `blocked_response_message`. The LLM is never called for a blocked prompt.
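The early-termination behavior for blocked prompts can be illustrated with a small sketch. Everything here is hypothetical (`run_with_prompt_check`, `fake_policy`, `fake_model`, and the `model_calls` list); it only shows the shape of the short-circuit, not how `PurviewPolicyMiddleware` is written.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable

DEFAULT_BLOCKED_PROMPT = "Prompt blocked by policy"

# Records every prompt that actually reaches the (fake) model.
model_calls: list[str] = []


async def fake_policy(prompt: str) -> bool:
    """Hypothetical policy check: block prompts containing the Visa test number."""
    return "4111 1111 1111 1111" in prompt


async def fake_model(prompt: str) -> str:
    """Hypothetical model call that echoes the prompt."""
    model_calls.append(prompt)
    return f"echo: {prompt}"


async def run_with_prompt_check(
    prompt: str,
    is_blocked: Callable[[str], Awaitable[bool]],
    call_model: Callable[[str], Awaitable[str]],
    blocked_prompt_message: str | None = None,
) -> str:
    """Return the block message without calling the model when the prompt is blocked."""
    if await is_blocked(prompt):
        return blocked_prompt_message or DEFAULT_BLOCKED_PROMPT
    return await call_model(prompt)
```

With these stand-ins, a prompt containing the test card number returns the default block message and leaves `model_calls` empty, while a benign prompt is passed through to `fake_model`.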
**Seeing a real `BLOCKED` outcome:**
The middle prompt only returns `BLOCKED` if the tenant actually has a Purview DLP policy that matches the request. Specifically, all of the following must be true:
1. The Entra app id used by `PURVIEW_CLIENT_APP_ID` (the same id Agent Framework sends as `policyLocationApplication.value`) is registered as an integrated AI app in Purview (Settings -> AI app and agent locations).
2. A DLP policy in the tenant targets the location `Microsoft 365 Copilot and AI apps`, scoped to that app id (or `All apps`).
3. The policy has a rule with the condition `Content contains -> Sensitive info types -> Credit Card Number` and an action of `Restrict access to Microsoft 365 Copilot and AI apps -> Block`.
4. The policy is `On` (not `Test mode without notifications`).
5. The signed-in user is in the policy's user scope.
6. Required Graph delegated permissions are admin-consented: `ProtectionScopes.Compute.All`, `Content.Process.All`, `ContentActivity.Write`.
If any of those are missing, the credit card prompt is allowed at the Purview layer. The model itself may still decline on its own; that response is a model-level refusal, not a Purview block. The cold/warm cache orchestration is still demonstrated either way - the `Cache MISS -> Cache HIT` trace from the custom cache scenario does not depend on a block firing.
---
@@ -11,8 +11,8 @@ Shows:
Note: Caching is automatic and enabled by default.
Environment variables:
- AZURE_OPENAI_ENDPOINT (required)
- AZURE_OPENAI_MODEL (optional, defaults to gpt-4o-mini)
- FOUNDRY_PROJECT_ENDPOINT (required) - Azure AI Foundry project endpoint URL
- FOUNDRY_MODEL (optional, defaults to gpt-4o-mini)
- PURVIEW_CLIENT_APP_ID (required)
- PURVIEW_USE_CERT_AUTH (optional, set to "true" for certificate auth)
- PURVIEW_TENANT_ID (required if certificate auth)
@@ -45,6 +45,37 @@ load_dotenv()
JOKER_NAME = "Joker"
JOKER_INSTRUCTIONS = "You are good at telling jokes. Keep responses concise."
# Sequential prompts to demonstrate good -> block -> good orchestration.
# The sensitive prompt contains a Visa test credit card number that matches Purview's
# built-in Credit Card sensitive information type. If the tenant has a DLP policy that
# blocks credit card content for Microsoft 365 Copilot and AI apps, the second message
# will be blocked and the third will verify that subsequent calls still flow normally
# after a block.
GOOD_PROMPT_PRIMARY = "Tell me a joke about a pirate."
SENSITIVE_PROMPT = "My corporate credit card is 4111 1111 1111 1111. Please confirm receipt."
GOOD_PROMPT_FOLLOWUP = "Another light joke please."
async def run_policy_flow(
    label: str,
    agent: Agent,
    user_id: str | None,
    blocked_text: str,
) -> None:
    """Run a good -> block candidate -> good sequence and report each outcome."""
    blocked_marker = blocked_text.lower()
    prompts = [
        ("good (cold cache)", GOOD_PROMPT_PRIMARY),
        ("expected block", SENSITIVE_PROMPT),
        ("good (warm cache)", GOOD_PROMPT_FOLLOWUP),
    ]
    for tag, text in prompts:
        response: AgentResponse = await agent.run(
            Message("user", [text], additional_properties={"user_id": user_id})
        )
        outcome = "BLOCKED" if blocked_marker in str(response).lower() else "ALLOWED"
        print(f"[{label}] {tag}: {outcome}\n{response}\n")
# Custom Cache Provider Implementation
class SimpleDictCacheProvider:
@@ -138,21 +169,17 @@ def build_credential() -> Any:
async def run_with_agent_middleware() -> None:
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
endpoint = os.environ.get("FOUNDRY_PROJECT_ENDPOINT")
if not endpoint:
print("Skipping run: AZURE_OPENAI_ENDPOINT not set")
print("Skipping run: FOUNDRY_PROJECT_ENDPOINT not set")
return
deployment = os.environ.get("AZURE_OPENAI_MODEL", "gpt-4o-mini")
deployment = os.environ.get("FOUNDRY_MODEL", "gpt-4o-mini")
user_id = os.environ.get("PURVIEW_DEFAULT_USER_ID")
client = FoundryChatClient(model=deployment, endpoint=endpoint, credential=AzureCliCredential())
client = FoundryChatClient(model=deployment, project_endpoint=endpoint, credential=AzureCliCredential())
purview_agent_middleware = PurviewPolicyMiddleware(
build_credential(),
PurviewSettings(
app_name="Agent Framework Sample App",
),
)
settings = PurviewSettings(app_name="Agent Framework Sample App")
purview_agent_middleware = PurviewPolicyMiddleware(build_credential(), settings)
agent = Agent(
client=client,
@@ -162,39 +189,26 @@ async def run_with_agent_middleware() -> None:
)
print("-- Agent MiddlewareTypes Path --")
first: AgentResponse = await agent.run(
Message("user", ["Tell me a joke about a pirate."], additional_properties={"user_id": user_id})
)
print("First response (agent middleware):\n", first)
second: AgentResponse = await agent.run(
Message(
role="user", contents=["That was funny. Tell me another one."], additional_properties={"user_id": user_id}
)
)
print("Second response (agent middleware):\n", second)
blocked_text = settings.get("blocked_prompt_message") or "Prompt blocked by policy"
await run_policy_flow("agent middleware", agent, user_id, blocked_text)
async def run_with_chat_middleware() -> None:
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
endpoint = os.environ.get("FOUNDRY_PROJECT_ENDPOINT")
if not endpoint:
print("Skipping chat middleware run: AZURE_OPENAI_ENDPOINT not set")
print("Skipping chat middleware run: FOUNDRY_PROJECT_ENDPOINT not set")
return
deployment = os.environ.get("AZURE_OPENAI_MODEL", default="gpt-4o-mini")
deployment = os.environ.get("FOUNDRY_MODEL", default="gpt-4o-mini")
user_id = os.environ.get("PURVIEW_DEFAULT_USER_ID")
settings = PurviewSettings(app_name="Agent Framework Sample App (Chat)")
client = FoundryChatClient(
model=deployment,
endpoint=endpoint,
project_endpoint=endpoint,
credential=AzureCliCredential(),
middleware=[
PurviewChatPolicyMiddleware(
build_credential(),
PurviewSettings(
app_name="Agent Framework Sample App (Chat)",
),
)
PurviewChatPolicyMiddleware(build_credential(), settings)
],
)
@@ -205,43 +219,27 @@ async def run_with_chat_middleware() -> None:
)
print("-- Chat MiddlewareTypes Path --")
first: AgentResponse = await agent.run(
Message(
role="user",
contents=["Give me a short clean joke."],
additional_properties={"user_id": user_id},
)
)
print("First response (chat middleware):\n", first)
second: AgentResponse = await agent.run(
Message(
role="user",
contents=["One more please."],
additional_properties={"user_id": user_id},
)
)
print("Second response (chat middleware):\n", second)
blocked_text = settings.get("blocked_prompt_message") or "Prompt blocked by policy"
await run_policy_flow("chat middleware", agent, user_id, blocked_text)
async def run_with_custom_cache_provider() -> None:
"""Demonstrate implementing and using a custom cache provider."""
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
endpoint = os.environ.get("FOUNDRY_PROJECT_ENDPOINT")
if not endpoint:
print("Skipping custom cache provider run: AZURE_OPENAI_ENDPOINT not set")
print("Skipping custom cache provider run: FOUNDRY_PROJECT_ENDPOINT not set")
return
deployment = os.environ.get("AZURE_OPENAI_MODEL", "gpt-4o-mini")
deployment = os.environ.get("FOUNDRY_MODEL", "gpt-4o-mini")
user_id = os.environ.get("PURVIEW_DEFAULT_USER_ID")
client = FoundryChatClient(model=deployment, endpoint=endpoint, credential=AzureCliCredential())
client = FoundryChatClient(model=deployment, project_endpoint=endpoint, credential=AzureCliCredential())
custom_cache = SimpleDictCacheProvider()
settings = PurviewSettings(app_name="Agent Framework Sample App (Custom Provider)")
purview_agent_middleware = PurviewPolicyMiddleware(
build_credential(),
PurviewSettings(
app_name="Agent Framework Sample App (Custom Provider)",
),
settings,
cache_provider=custom_cache,
)
@@ -254,38 +252,28 @@ async def run_with_custom_cache_provider() -> None:
print("-- Custom Cache Provider Path --")
print("Using SimpleDictCacheProvider")
blocked_text = settings.get("blocked_prompt_message") or "Prompt blocked by policy"
await run_policy_flow("custom cache", agent, user_id, blocked_text)
first: AgentResponse = await agent.run(
Message(
role="user", contents=["Tell me a joke about a programmer."], additional_properties={"user_id": user_id}
)
)
print("First response (custom provider):\n", first)
second: AgentResponse = await agent.run(
Message("user", ["That's hilarious! One more?"], additional_properties={"user_id": user_id})
)
print("Second response (custom provider):\n", second)
async def run_with_default_cache() -> None:
"""Demonstrate using the default built-in cache."""
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
endpoint = os.environ.get("FOUNDRY_PROJECT_ENDPOINT")
if not endpoint:
print("Skipping default cache run: AZURE_OPENAI_ENDPOINT not set")
print("Skipping default cache run: FOUNDRY_PROJECT_ENDPOINT not set")
return
deployment = os.environ.get("AZURE_OPENAI_MODEL", "gpt-4o-mini")
deployment = os.environ.get("FOUNDRY_MODEL", "gpt-4o-mini")
user_id = os.environ.get("PURVIEW_DEFAULT_USER_ID")
client = FoundryChatClient(model=deployment, endpoint=endpoint, credential=AzureCliCredential())
client = FoundryChatClient(model=deployment, project_endpoint=endpoint, credential=AzureCliCredential())
# No cache_provider specified - uses default InMemoryCacheProvider
purview_agent_middleware = PurviewPolicyMiddleware(
build_credential(),
PurviewSettings(
app_name="Agent Framework Sample App (Default Cache)",
cache_ttl_seconds=3600,
max_cache_size_bytes=100 * 1024 * 1024, # 100MB
),
settings = PurviewSettings(
app_name="Agent Framework Sample App (Default Cache)",
cache_ttl_seconds=3600,
max_cache_size_bytes=100 * 1024 * 1024, # 100MB
)
purview_agent_middleware = PurviewPolicyMiddleware(build_credential(), settings)
agent = Agent(
client=client,
@@ -296,16 +284,8 @@ async def run_with_custom_cache_provider() -> None:
print("-- Default Cache Path --")
print("Using default InMemoryCacheProvider with settings-based configuration")
first: AgentResponse = await agent.run(
Message("user", ["Tell me a joke about AI."], additional_properties={"user_id": user_id})
)
print("First response (default cache):\n", first)
second: AgentResponse = await agent.run(
Message("user", ["Nice! Another AI joke please."], additional_properties={"user_id": user_id})
)
print("Second response (default cache):\n", second)
blocked_text = settings.get("blocked_prompt_message") or "Prompt blocked by policy"
await run_policy_flow("default cache", agent, user_id, blocked_text)
async def main() -> None:
@@ -326,6 +306,11 @@ async def main() -> None:
except Exception as ex: # pragma: no cover - demo resilience
print(f"Custom cache provider path failed: {ex}")
try:
await run_with_default_cache()
except Exception as ex: # pragma: no cover - demo resilience
print(f"Default cache path failed: {ex}")
if __name__ == "__main__":
asyncio.run(main())