Commit Graph

31 Commits

  • feat(ai): switch xiaomi default to api billing, add per-region token plan providers (#4112)
    Built-in `xiaomi` provider now targets the API billing endpoint (https://api.xiaomimimo.com/anthropic) — a single stable URL for keys issued at platform.xiaomimimo.com. The Token Plan endpoints are exposed as three sibling providers, each with its own env var:
    
    - xiaomi-token-plan-cn: XIAOMI_TOKEN_PLAN_CN_API_KEY
    - xiaomi-token-plan-ams: XIAOMI_TOKEN_PLAN_AMS_API_KEY
    - xiaomi-token-plan-sgp: XIAOMI_TOKEN_PLAN_SGP_API_KEY
    
    BREAKING CHANGE: users who previously set XIAOMI_API_KEY against the Token Plan AMS endpoint must move to xiaomi-token-plan-ams and set XIAOMI_TOKEN_PLAN_AMS_API_KEY. This also resolves the 401 reported on #4005, where a platform.xiaomimimo.com key fails against the Token Plan endpoint.
    
    closes #4082
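A minimal sketch of the provider-to-variable mapping described in the commit above. The provider ids and variable names are taken from the message; the table shape, the `Env` type, and the `getXiaomiApiKey` helper are assumptions for illustration, not the actual env-api-keys code. That the built-in `xiaomi` provider keeps reading XIAOMI_API_KEY is inferred from the breaking-change note.

```typescript
// Hypothetical lookup table: ids and variable names come from the commit
// message, the structure is an assumption.
const XIAOMI_ENV_VARS: Record<string, string> = {
  "xiaomi": "XIAOMI_API_KEY", // API billing endpoint (inferred default)
  "xiaomi-token-plan-cn": "XIAOMI_TOKEN_PLAN_CN_API_KEY",
  "xiaomi-token-plan-ams": "XIAOMI_TOKEN_PLAN_AMS_API_KEY",
  "xiaomi-token-plan-sgp": "XIAOMI_TOKEN_PLAN_SGP_API_KEY",
};

// Environment passed in explicitly so the sketch has no runtime dependency.
type Env = Record<string, string | undefined>;

// Returns the key for a provider, or undefined when the provider is unknown
// or its variable is not set. Keys are never shared between siblings.
function getXiaomiApiKey(provider: string, env: Env): string | undefined {
  const name = XIAOMI_ENV_VARS[provider];
  return name ? env[name] : undefined;
}
```

With only XIAOMI_API_KEY set, a lookup for `xiaomi-token-plan-ams` returns nothing in this sketch, which is the migration case the breaking-change note calls out.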
  • feat(ai): add Xiaomi MiMo provider (#4005)
    * fix(ai): include minimax-cn in cross-provider-handoff matrix
    
    * feat(ai): add Xiaomi MiMo provider
    
    Adds Xiaomi MiMo as an openai-completions-compatible provider.
    
    - packages/ai: register provider in types/KnownProvider, env-api-keys (XIAOMI_API_KEY), generate-models, models.generated.ts, overflow util, README, CHANGELOG
    - packages/ai/test: extend stream, tokens, abort, empty, context-overflow, overflow, image-tool-result, tool-call-without-result, total-tokens, unicode-surrogate, cross-provider-handoff matrices with Xiaomi
    - packages/coding-agent: default model (mimo-v2.5-pro), display name (Xiaomi MiMo), CLI env var docs, README, docs/providers.md
    
    closes #3912
    
    ---------
    
    Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
  • feat(ai): add Cloudflare AI Gateway as a provider (#3856)
    * feat(ai): add Cloudflare AI Gateway as a provider
    
    Routes through Cloudflare's Unified API (`/compat`) for Workers AI and
    Anthropic models, and through the provider-specific `/openai` subpath
    for OpenAI models so reasoning models (gpt-5.x, o-series) can hit
    `/v1/responses` natively. Once `/compat` adds Responses-API support,
    the OpenAI subpath can be folded back in.
    
    Catalog layout:
      workers-ai/@cf/...  -> openai-completions, gateway/.../compat
      anthropic/...       -> openai-completions, gateway/.../compat
      <native-id>         -> openai-responses,   gateway/.../openai
                             (gpt-5.x and o-series only; prefix stripped
                              because the OpenAI SDK posts native ids)
    
    Touches:
      packages/ai/src/types.ts                       add cloudflare-ai-gateway to KnownProvider
      packages/ai/src/env-api-keys.ts                map to CLOUDFLARE_API_KEY
      packages/ai/src/providers/cloudflare.ts        add CLOUDFLARE_AI_GATEWAY_COMPAT_BASE_URL
                                                     and CLOUDFLARE_AI_GATEWAY_OPENAI_BASE_URL
      packages/ai/src/providers/openai-responses.ts  one-line dispatch through resolveCloudflareBaseUrl
                                                     (matches what openai-completions.ts already does)
      packages/ai/scripts/generate-models.ts         branch openai/* vs workers-ai/anthropic/*
      packages/ai/src/models.generated.ts            spliced 34 entries
      packages/ai/test/stream.test.ts                3 e2e blocks (one per upstream)
      packages/coding-agent/*                        defaultModelPerProvider, login, env docs,
                                                     README, providers.md
    
    Verified end-to-end against a real Cloudflare account with unified
    billing: 9/9 e2e tests pass across all three upstreams (Workers AI
    Kimi K2.6, OpenAI gpt-5.1 reasoning, Anthropic claude-sonnet-4-5).
    
    * refactor(ai): move AI Gateway User-Agent and per-route session-affinity flag to catalog
    
    Mirrors the same per-model metadata refactor done for Workers AI in the
    parent branch. All cloudflare-ai-gateway entries get the User-Agent
    header. Only workers-ai/* gateway entries set
    `compat.sendSessionAffinityHeaders: true` because the gateway
    forwards that header to the underlying Workers AI runtime; anthropic/*
    upstream and openai/* (openai-responses) don't use it.
    
      packages/ai/scripts/generate-models.ts: emit headers (always) and
      per-upstream compat (workers-ai only) on each cloudflare-ai-gateway
      entry.
      packages/ai/src/models.generated.ts: re-spliced 35 entries with
      headers + conditional compat.
    
    Behavior unchanged - 9/9 e2e tests pass across all three upstream
    families.
    
    * fix(ai): align AI Gateway with telemetry-aware UA helper
    
    Adapts to badlogic/pi-mono#3851's follow-up fix ("honor telemetry for
    Cloudflare attribution headers", fbb5eed) which moved the
    'User-Agent: pi-coding-agent' header out of per-model catalog metadata
    and into a centralized telemetry-honoring helper
    (coding-agent/src/core/sdk.ts:getAttributionHeaders).
    
    - packages/coding-agent/src/core/sdk.ts: extend the cloudflare branch of
      getAttributionHeaders to also match cloudflare-ai-gateway and
      gateway.ai.cloudflare.com.
    
    - packages/ai/scripts/generate-models.ts and src/models.generated.ts:
      drop 'headers' from the 35 cloudflare-ai-gateway entries (constant
      CLOUDFLARE_STATIC_HEADERS no longer exists). Per-route
      compat.sendSessionAffinityHeaders is unchanged.
    
    End-to-end behavior unchanged: 9/9 tests still pass across all three
    upstream families (Workers AI, Anthropic, OpenAI Responses).
    
    ---------
    
    Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
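The catalog layout and the per-route session-affinity flag from the AI Gateway commit above could be sketched roughly as follows. The `GatewayEntry` type and `toCatalogEntry` function are hypothetical names; only the prefixes, API protocols, subpaths, and the `compat.sendSessionAffinityHeaders` flag come from the commit message.

```typescript
// Hypothetical shape of one cloudflare-ai-gateway catalog entry.
interface GatewayEntry {
  id: string;
  api: "openai-completions" | "openai-responses";
  subpath: "compat" | "openai";
  sendSessionAffinityHeaders: boolean;
}

// Maps an upstream model id to a catalog entry, following the layout in the
// commit message. This is an illustrative sketch, not generate-models.ts.
function toCatalogEntry(upstreamId: string): GatewayEntry {
  const openaiPrefix = "openai/";
  if (upstreamId.startsWith(openaiPrefix)) {
    // OpenAI models use the provider-specific subpath and the native id.
    return {
      id: upstreamId.slice(openaiPrefix.length),
      api: "openai-responses",
      subpath: "openai",
      sendSessionAffinityHeaders: false,
    };
  }
  if (upstreamId.startsWith("workers-ai/") || upstreamId.startsWith("anthropic/")) {
    return {
      id: upstreamId,
      api: "openai-completions",
      subpath: "compat",
      // Only Workers AI upstreams use the session-affinity header.
      sendSessionAffinityHeaders: upstreamId.startsWith("workers-ai/"),
    };
  }
  throw new Error(`unsupported upstream: ${upstreamId}`);
}
```

Keeping the branch on the upstream prefix in one place means the OpenAI case could be folded back into the `/compat` branch later, as the commit message anticipates.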
  • feat(ai): add Cloudflare Workers AI as a provider (#3851)
    * feat(ai): add Cloudflare Workers AI as a provider
    
    Cloudflare Workers AI hosts open-weight LLMs (Kimi K2.6, GPT-OSS,
    GLM-4.7, Llama 4, Gemma 4, Nemotron 3) on Cloudflare's GPU network with
    an OpenAI-compatible endpoint. Reuses the openai-completions API
    protocol; the per-account URL contains a {CLOUDFLARE_ACCOUNT_ID}
    placeholder resolved at request time by a small helper.
    
    Pi automatically sets x-session-affinity for prefix caching:
    https://developers.cloudflare.com/workers-ai/features/prompt-caching/
    
    Auth: CLOUDFLARE_API_KEY (matches pi's *_API_KEY convention) +
    CLOUDFLARE_ACCOUNT_ID. The User-Agent identifies traffic as
    'pi-coding-agent' in Cloudflare analytics.
    
    Verified end-to-end against a real Cloudflare account: 17 e2e tests
    pass across stream/empty/tokens/unicode/tool-call-without-result/
    total-tokens against @cf/moonshotai/kimi-k2.6.
    
    Cloudflare AI Gateway is a separate, larger change (it requires routing
    through provider-specific subpaths with the matching API protocol per
    upstream) and will land in a follow-up PR.
    
    * refactor(ai): move Cloudflare User-Agent and session-affinity flag to per-model metadata
    
    Instead of conditionally setting them in openai-completions.ts based on
    provider detection, declare them as model-level fields in the catalog
    (headers + compat). This is consistent with how the github-copilot and
    kimi-coding entries already declare their static headers.
    
      packages/ai/scripts/generate-models.ts: emit headers and compat fields
      on each cloudflare-workers-ai entry (CLOUDFLARE_STATIC_HEADERS).
      packages/ai/src/providers/openai-completions.ts: drop the
      isCloudflareProvider conditional that injected User-Agent and the
      isCloudflareWorkersAI override of sendSessionAffinityHeaders.
      packages/ai/src/models.generated.ts: re-spliced 8 cloudflare-workers-ai
      entries with headers + compat.
    
    Behavior is unchanged - verified via fetch interceptor that User-Agent
    and x-session-affinity / session_id / x-client-request-id are still sent
    on outbound requests. 5/5 e2e tests pass.
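A small sketch of how the {CLOUDFLARE_ACCOUNT_ID} placeholder mentioned in the Workers AI commit above might be resolved at request time. The `resolveAccountUrl` name, the `Env` type, and the error message are assumptions; the commit only states that a small helper substitutes the placeholder in the per-account URL.

```typescript
// Environment passed in explicitly so the sketch has no runtime dependency.
type Env = Record<string, string | undefined>;

const ACCOUNT_PLACEHOLDER = "{CLOUDFLARE_ACCOUNT_ID}";

// Substitutes the account id into a base URL template. URLs without the
// placeholder are returned unchanged, so other providers are unaffected.
function resolveAccountUrl(baseUrl: string, env: Env): string {
  if (!baseUrl.includes(ACCOUNT_PLACEHOLDER)) {
    return baseUrl;
  }
  const accountId = env["CLOUDFLARE_ACCOUNT_ID"];
  if (!accountId) {
    // Assumed behavior: fail loudly rather than send a malformed URL.
    throw new Error("CLOUDFLARE_ACCOUNT_ID is not set");
  }
  return baseUrl.split(ACCOUNT_PLACEHOLDER).join(accountId);
}
```

Resolving at request time rather than at catalog generation keeps the generated model entries account-independent.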
  • fix(ai): support prompt caching for Bedrock application inference profiles (#2346)
    Add AWS_BEDROCK_FORCE_CACHE=1 environment variable support. When the
    model ID doesn't contain a recognizable Claude model name, users can
    set this variable to force cache point injection.
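The override described in the Bedrock commit above might look like the following sketch. The `shouldInjectCachePoints` name and the simple substring test for a Claude model name are assumptions; the actual detection logic is not shown in the commit message.

```typescript
// Environment passed in explicitly so the sketch has no runtime dependency.
type Env = Record<string, string | undefined>;

// Decides whether to inject cache points for a Bedrock model id.
// The "claude" substring test is a stand-in for the real name detection.
function shouldInjectCachePoints(modelId: string, env: Env): boolean {
  // Explicit opt-in for ids that carry no model name, such as
  // application inference profile ARNs.
  if (env["AWS_BEDROCK_FORCE_CACHE"] === "1") {
    return true;
  }
  return modelId.toLowerCase().includes("claude");
}
```

An application inference profile id is opaque, so without the variable the sketch returns false for it even when the profile fronts a Claude model.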
  • docs: add auth.json keys to table and reference to envMap (#1412)
    * docs: add auth.json keys to table and reference to envMap
    
    * docs: adjust envMap reference links
  • fix(ai): move AWS_BEDROCK_SKIP_AUTH inside Node.js environment check
    The process.env access was outside the typeof process check, which
    would throw in browser environments. Moved inside the Node.js/Bun
    block for consistency with other env var access.
    
    Also added changelog entry for #1320 and improved docs clarity.
  • feat(coding-agent): support shell commands and env vars in auth.json API keys
    API keys in auth.json now support the same resolution as models.json:
    - Shell command: "!command" executes and uses stdout (cached)
    - Environment variable: uses the value of the named variable
    - Literal value: used directly
    
    Extracted shared resolveConfigValue() to new resolve-config-value.ts module.
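The three resolution modes listed in the auth.json commit above could be sketched as below. This is not the real resolveConfigValue(): the `makeResolver` factory, the injected `run` function (used here in place of actually spawning a shell), trimming of stdout, and the order in which the environment and literal cases are tried are all assumptions.

```typescript
type Env = Record<string, string | undefined>;
// Injected command runner; a real implementation would execute a shell.
type RunCommand = (command: string) => string;

function makeResolver(env: Env, run: RunCommand): (raw: string) => string {
  // Cache command output so each command runs at most once.
  const cache = new Map<string, string>();
  return function resolveValue(raw: string): string {
    if (raw.startsWith("!")) {
      const command = raw.slice(1);
      let output = cache.get(command);
      if (output === undefined) {
        output = run(command).trim();
        cache.set(command, output);
      }
      return output;
    }
    // Treat the value as a variable name if one is set, else as a literal.
    const fromEnv = env[raw];
    return fromEnv !== undefined ? fromEnv : raw;
  };
}
```

Injecting the runner keeps the sketch testable without a shell and makes the caching behavior easy to observe.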
  • Update providers.md to include opencode provider (#1094)
    add 'opencode' provider configuration to the documentation.
  • feat(ai): add Kimi For Coding provider support
    - Add kimi-coding provider using Anthropic Messages API
    - API endpoint: https://api.kimi.com/coding/v1
    - Environment variable: KIMI_API_KEY
    - Models: kimi-k2-thinking (text), k2p5 (text + image)
    - Add context overflow detection pattern for Kimi errors
    - Add tests for all standard test suites
  • feat(ai): add Hugging Face provider support
    - Add huggingface to KnownProvider type
    - Add HF_TOKEN env var mapping
    - Process huggingface models from models.dev (14 models)
    - Use openai-completions API with compat settings
    - Add tests for all provider test suites
    - Update documentation
    
    fixes #994