- Add the Claude Desktop provider guide in English, Chinese, and Japanese.
- Add localized screenshots for import, provider setup, model mapping, and local routing.
- Link the guide from the v3.15.0 release notes and user manual indexes.
- Add Claude Desktop Official to the Claude Desktop preset list.
- Treat selected official presets as official mode in the form.
- Cover the official preset with a preset-order regression test.
- Mark `should_force_identity_encoding` as test-only.
- Keep runtime forwarding behavior unchanged.
- Verified with local CI checks and no-bundle Tauri build.
Remove the category-based grouping logic from ProviderPresetSelector,
letting the array position in each preset config file be the single
source of truth for display order. Move partner presets (PatewayAI,
火山Agentplan, BytePlus, DouBaoSeed) right after Shengsuanyun across
all 6 config files so they appear earlier in the UI.
Add BytePlus (international Volcengine) to Claude, Claude Desktop,
Hermes, OpenCode, and OpenClaw with byteplus icon, 256K context window,
and trilingual promotion text.
Switch Anthropic-format base URL from /api/coding to /api/compatible,
update website/apiKey URLs to Volcengine console with tracking params,
and promote DouBaoSeed to partner with trilingual promotion text.
Add RelaxyCode as a new third-party provider with support for:
- Claude Code preset (Anthropic native protocol)
- Codex preset (gpt-5.5 model)
- Claude Desktop preset (direct mode with passthrough routes)
RelaxyCode is an enterprise-grade AI programming platform providing
unified access to Claude Code, Codex, and Gemini CLI models.
Add RunAPI as a new partner provider with support for:
- Claude Code preset (Anthropic native protocol)
- Codex preset (gpt-5.5 model)
- Claude Desktop preset (direct mode with passthrough routes)
- OpenCode preset (@ai-sdk/anthropic)
- OpenClaw preset (anthropic-messages protocol)
- Hermes preset (anthropic_messages mode)
- Icon configuration (runapi.jpg)
- i18n support (zh/en/ja) with ¥14 free credit promotion
RunAPI is a high-performance AI model API gateway supporting 150+
mainstream models (OpenAI, Claude, Gemini, DeepSeek, Grok) with
prices as low as 10% of official rates.
Add ClaudeCN as a new partner provider with support for:
- Claude Code preset (Anthropic native protocol)
- Codex preset (gpt-5.5 model)
- Claude Desktop preset (direct mode with passthrough routes)
- OpenCode preset (@ai-sdk/anthropic)
- OpenClaw preset (anthropic-messages protocol)
- Hermes preset (anthropic_messages mode)
- Icon configuration (claudecn.png)
- i18n support (zh/en/ja) with enterprise service promotion
ClaudeCN is an enterprise-grade AI gateway operated by a registered
company, supporting enterprise procurement processes with corporate
payments, contracts, and compliance guarantees.
Add ClaudeAPI as a new partner provider with support for:
- Claude Code preset (using ANTHROPIC_AUTH_TOKEN field)
- Claude Desktop preset (direct mode with passthrough routes)
- Icon configuration (ClaudeApi.png)
- i18n support (zh/en/ja) with test credit promotion
ClaudeAPI provides official Anthropic API keys and AWS Bedrock
routing with support for Tool Use and 1M context.
- Change mode from "proxy" to "direct" for 20 third-party/aggregator providers
- Simplify PipeLLM from mappedRoutes to passthroughRoutes for consistency
- Reduces unnecessary proxy layer overhead for providers that support direct API calls
Affected providers: ShengSuanYun, AIHubMix, DMXAPI, PackyAPI, PatewayAI,
Cubence, AIGoCode, RightCodes, AICodeMirror, AICoding, CrazyRouter,
SSSAICode, ModelVerse, CompShare, MicuAPI, CTOK, E-FlowCode,
VibeCodingAPI, LemonData, PipeLLM
Add PatewayAI as a new partner provider with support for:
- Claude Code preset (using ANTHROPIC_API_KEY field)
- Codex preset (gpt-5.5 model)
- Claude Desktop preset (proxy mode with passthrough routes)
- Icon configuration (pateway.jpg)
- i18n support (zh/en/ja) with $3 registration bonus promotion
PatewayAI provides reliable API routing services for Claude Code,
Codex, and Gemini models.
- Forwarder buffers non-streaming bodies and primes streaming first
chunk before signaling success, so body timeouts and SSE first-chunk
failures route through the circuit breaker instead of being recorded
as success on response-header arrival
- Atomic enable-failover: switch to P1 before persisting the flag, and
roll back auto-added queue entries when the switch is rejected
(e.g. official providers)
- Hot-reload circuit breaker config on per-app proxy config change
instead of waiting for a proxy restart
- FailoverToggle / FailoverQueueManager / AutoFailoverConfigPanel
require proxy takeover for the active app; the backend command also
rejects enabling when takeover is off
- ProviderHealthBadge consumes the backend is_healthy flag instead of
hardcoding the 5-failure threshold
Cleanup:
- impl From<&AppProxyConfig> for CircuitBreakerConfig and use it from
the command layer
- Collapse three identical TabsContent blocks into a single map
Add visual indicators for routing capabilities on provider cards:
- Claude Code: "Needs Routing" badge for non-official providers with non-anthropic API formats
- Claude Code: "No Routing Support" badge for official providers
- Codex: "No Routing Support" badge for official providers
The badges help users understand which providers support format conversion through routing.
Replace all openclaudecode.cn URLs with micuapi.ai across all
provider presets (Claude, Codex, Hermes, OpenClaw, OpenCode,
Claude Desktop). This includes website URLs, API key URLs, and
base URLs.
Update all CrazyRouter baseURL configurations from crazyrouter.com
to cn.crazyrouter.com across all supported applications (Claude,
Codex, Gemini, Hermes, OpenClaw, OpenCode, Claude Desktop).
Website and registration URLs remain unchanged.
Most third-party Claude Code providers now reject requests from
non-official clients, so the model test button would just produce
noisy failures (or worse, trigger risk controls on the provider
side). Treat third-party Claude providers the same way as official /
Copilot / Codex OAuth: pass onTest=undefined so ProviderActions
renders the test button in its existing disabled visual state.
- Add `get_codex_oauth_models` Tauri command reusing the managed OAuth
access token to hit `chatgpt.com/backend-api/codex/models`; HTTP and
multi-shape JSON parsing live in `services::codex_oauth_models` so the
command stays thin.
- Unify the Claude form's "fetch models" button across normal / Copilot /
Codex OAuth presets, drop the auto-load effect for Copilot in favor of
explicit clicks, and guard against stale responses with a requestId ref.
- Add Vitest coverage for both Copilot and Codex OAuth paths asserting no
request on mount and the correct account id on click; add Rust unit
tests for the four model-list payload shapes.
When proxy takeover is active, write per-role *_MODEL aliases for routing
and *_MODEL_NAME with the upstream provider's real model name so the
Claude Code model menu reflects the active provider instead of stale
display names from a previous switch. Preserves the [1M] capability marker
for Sonnet/Opus, and strips it from implicit display names.
* model pricing routing: extend prefix-match families (gpt-/o1-o5/
gemini-/deepseek-/qwen-/glm-/kimi-/minimax-) with per-family dash
thresholds so short base IDs like gpt-5 no longer mis-match
gpt-5-mini; strip ISO and 8-digit date suffixes via UTF-8-safe
byte matching so claude-haiku-4-5-20251001 falls back to
claude-haiku-4-5 pricing
* SSE collector: SseUsageFinishGuard (RAII) guarantees finish() on
early return or panic; AtomicBool fast path lets push() skip the
Mutex once first-event time is recorded
* validation: shared validate_cost_multiplier / validate_pricing_source
helpers across DAO and service layers; PRICING_SOURCE_RESPONSE /
PRICING_SOURCE_REQUEST constants replace string literals; price
fields in update_model_pricing now reject empty / non-decimal /
negative input before INSERT
* backfill: add backfill_missing_usage_costs_for_model so a single
price edit only scans matching rows instead of the full log table;
startup backfill remains full-scan
* session_usage{,_codex,_gemini}: share find_model_pricing helper from
usage_stats; metadata_modified_nanos centralizes mtime precision
* frontend: NON_NEGATIVE_DECIMAL_REGEX + isNonNegativeDecimalString
replace three copies of the same multiplier regex; isUnpricedUsage
surfaces zero-cost rows that have usage tokens (cached per row to
avoid double evaluation); invalidate usageKeys.all on pricing mutate
so backfilled rows refresh
* stream_check: thread Result from get_auth_headers via map_err so
the workspace builds again
* forwarder: scope rectifier / budget-rectifier flags per-provider so
failover can still apply rectification on the next attempt
* forwarder: categorize before record_result; route NonRetryable and
ClientAbort through release_permit_neutral so client-side failures
don't pollute circuit breaker or DB health
* handler_context: parse Gemini model from uri.path() and strip both
?query and :action verb defensively in extract_gemini_model_from_path
* forwarder + response_processor + handlers: introduce
ActiveConnectionGuard (RAII) so active_connections decrement covers
the full streaming body lifetime, not just response headers
* claude_desktop_config: use sort_by_key to clear the clippy gate
The signature (RECT-003) and budget (RECT-012) rectifier branches each
carried ~50 lines of identical "provider error -> record + continue /
client error -> release permit + return" handling. The only piece that
varied between them was a log label ("整流" vs "budget 整流").
Move the shared logic into RequestForwarder::handle_rectifier_retry_failure
that returns Option<ForwardError> — None means "continue to the next
provider", Some(err) means "terminal failure, return to the client".
Each call site shrinks from ~50 lines to ~17, drops one level of
indentation, and the two branches now provably cannot drift apart.
forwarder.rs nets ~40 lines smaller.
claude.rs and gemini.rs each defined an identical `hv` closure that wrapped
`HeaderValue::from_str` into a ProxyError::AuthError result, and codex.rs
spelled the same conversion out inline. /simplify reviewers flagged this
as drift-prone copy-paste.
Move the conversion into a single `pub fn auth_header_value` in
providers/adapter.rs and have the three adapters import it locally. Same
error wording everywhere, one place to update if HeaderValue semantics
ever change.
The backend already understands `::` -> `::1` and wraps IPv6 literals
in brackets (services/proxy.rs), but the panel's save-time validator
only accepted localhost, 0.0.0.0, and IPv4 dotted-quads. Users who
wanted to listen on an IPv6 loopback had to bypass the UI and edit
config directly.
Add an isValidIpv6 helper that requires at least one ':' and round-trips
through `new URL('http://[<addr>]/')` so the platform's built-in IPv6
parser does the heavy lifting (covers compressed `::`, full 8-group
form, zone IDs). Update the invalidAddress copy in zh / en / ja so the
error message reflects the new accepted set.
The forwarder used to call client.post(&url) / http::Method::POST in
both the reqwest and hyper paths, and the Gemini route table only
registered POST /v1beta/*. As a result anything the Gemini SDK / CLI
sent as GET (models list, models/<id> info) hit a 404 at the router
and bypassed the local proxy's stats, rectifiers, and failover.
Thread the request method end-to-end:
- ProviderAdapter forwarder API now takes the http::Method by reference
per attempt and dispatches client.request(method, &url) for reqwest
and method.clone() for the hyper raw path.
- All five callers in handlers.rs (handle_messages_for_app for Claude /
Claude Desktop, handle_chat_completions, handle_responses,
handle_responses_compact, handle_gemini) pull the method out of the
incoming axum::extract::Request and pass it on.
- handle_gemini tolerates an empty body (GET endpoints have none) and
the forwarder skips serializing / sending a body for GET / HEAD —
attaching JSON to a GET makes Gemini reject the request.
- server.rs swaps the Gemini routes to any(handle_gemini) so the same
handler handles GET / POST / PUT / DELETE, and adds /gemini/v1/*
for the GA path version.
Three statistics-shape issues fixed together so the dashboard reflects
client requests, not provider attempts:
1. active_connections never moved off zero — the field had no caller in
the entire crate. Wrap forward_with_retry into a thin entry point
that saturating_add(1) on enter and saturating_sub(1) on exit; every
inner return path is covered automatically.
2. total_requests counted attempts, not requests. A single client call
that failed over P1 -> P2 -> success was recorded as
total=2 / success=1 -> 50% success rate. Move the increment and the
last_request_at refresh into the wrapper so they fire once per
client request regardless of how many providers were tried.
3. current_provider / current_provider_id stay inside the inner loop
because they are intentionally per-attempt ("what am I trying right
now?") — moving them would break the live-failover indicator.
Refactor: split forward_with_retry into a public wrapper + private
forward_with_retry_inner. Every existing `return Err(...)` inside inner
remains correct because the wrapper always runs the decrement on its
return.
The UI has exposed "请求失败时的重试次数 (0-10, default 3)" since the
auto-failover panel was added, but the value was silently dropped —
RequestForwarder never received it and the per-provider loop walked the
whole list regardless. From the user's perspective the setting was
inert.
Thread AppProxyConfig.max_retries through create_forwarder into
RequestForwarder, derive max_attempts = max_retries + 1 (so max_retries=0
matches the UI copy "0 retries" = single attempt), and break the loop
once attempts hit the cap. The check is placed before the circuit
breaker allow-permit so an over-cap iteration does not waste a HalfOpen
probe slot.
When auto-failover is disabled we also force max_retries to 0, mirroring
how timeouts already bypass in that mode — "no failover" should mean
"one provider, one try", not "limited retries against the same list".
The Chat-Completions transformer used to forward tool_choice verbatim,
but the two APIs disagree on shape:
Anthropic "any" | {"type":"tool","name":"X"}
OpenAI Chat "required" | {"type":"function","function":{"name":"X"}}
Pass-through made the upstream return 400 for any tool-forcing client
(Claude Code, Copilot, etc.). The Responses-API transformer already had
the equivalent map_tool_choice_to_responses helper; this commit adds a
sibling map_tool_choice_to_chat with the chat-specific *nested* function
selector and five regression tests covering string / object × any /
auto / none / tool.
The two helpers are intentionally not merged: the difference between
flat and nested function selectors is exactly what the original bug
was, so keeping them as separate self-documenting functions reduces the
chance of the same regression returning.
Two related changes to make per-provider failover behave correctly.
1. Bucket UpstreamError by status code in categorize_proxy_error.
The old "every UpstreamError is Retryable" rule meant a malformed
client request (400 / 422) would be replayed against every provider
in the queue: errors amplified N-fold, the circuit breaker accrued
unwarranted failure counts, and quota was burned. Now
400 / 405 / 406 / 413 / 414 / 415 / 422 / 501 are NonRetryable since
the request itself is wrong and no provider will accept it.
401 / 403 / 404 / 408 / 409 / 429 / 451 and all 5xx remain Retryable
because the next provider may carry a different key, quota, region,
or model mapping.
2. Make the rectifier-retry path participate in failover.
Both the signature (RECT-003) and budget (RECT-012) rectifier branches
used to "return Err(...)" after the retry failed, short-circuiting the
per-provider loop. A provider-side failure (5xx / Timeout /
ForwardFailed) now records the circuit breaker, accumulates into
last_error / last_provider, and "continue"s to the next provider —
matching the normal Retryable arm. Client-side failures still return
immediately since a different provider cannot fix a malformed payload.
Two related drift bugs in the takeover state machine:
1. The "already taken over?" guard used has_backup OR live_taken_over, so
either condition alone would short-circuit. After a user or anomalous
flow restores Live manually the backup row still made set_takeover
return success, leaving the UI claiming takeover while requests bypass
the local proxy. Tighten to AND so the rebuild branch repairs the two
"split brain" states (backup-only and placeholder-only).
2. Disabling takeover called the bare restore_live_config_for_app, which
silently Ok()s when the backup is missing. If the backup was lost while
Live still held proxy placeholders (PROXY_MANAGED token / local proxy
URL), the client config was left broken with no error surfaced. Route
the disable path through the already-existing
restore_live_config_for_app_with_fallback (backup → SSOT → cleanup).
The line 354 takeover-failure rollback intentionally keeps the bare
variant since that path must preserve the backup for retry.
split('/') strips the slashes, so find(|s| s.starts_with("models/")) never
matched any segment and request_model fell through to "unknown" for every
Gemini call, poisoning usage records, per-request billing, and logs.
Match the literal "models" segment and take the next one, stripping any
:action suffix and query string. The extraction is now a pub(crate) free
function so it can be unit-tested directly; seven regression tests cover
action suffixes, dotted versions, the /gemini/ proxy prefix, query
strings, the bare list endpoint, and missing-segment paths.
User-pasted API keys can contain control chars or CR/LF that make
HeaderValue::from_str return Err; the previous unwrap inside every
adapter turned such input into a process-wide panic instead of a request
error. The trait now returns Result<_, ProxyError>; Claude/Codex/Gemini
impls propagate ProxyError::AuthError so the client sees a 401 with the
underlying parse error instead of a crash. Adds a regression test that
pastes a CRLF-containing key and asserts AuthError.
- Split CostCalculator into per-app cache semantics: Anthropic's
input_tokens is already fresh input, while Codex/Gemini include
cached tokens in their prompt count. The old shared formula
double-subtracted cache_read for Claude, under-billing input cost.
- Backfill now reads cost_multiplier from the per-log snapshot column
instead of re-querying providers.meta, so historical rows are no
longer rewritten with the current multiplier.
- Move the "pricing not found" warn out of find_model_pricing_row;
emit it only when a brand new log is written, and skip placeholder
models (unknown / empty / null / none) entirely.
- Broaden model id normalization: strip namespace prefixes
(anthropic./openai./global./bedrock.), bedrock-style -vN suffixes,
reasoning effort suffixes (-low/-medium/-high/-xhigh/-minimal),
Claude Desktop's claude-<non-anthropic> wrapper, dot-to-dash for
Claude, and try a LIKE prefix match for Claude short route ids
(e.g. claude-haiku-4-5 -> claude-haiku-4-5-20251001).
- Fall back to request_model when the stored model is missing, so
early Codex session rows with model=unknown can still be priced.
- Replace the four flat env inputs with a Sonnet/Opus/Haiku role table.
Each row exposes ANTHROPIC_DEFAULT_*_MODEL plus a new display name
field ANTHROPIC_DEFAULT_*_MODEL_NAME, and Sonnet/Opus gain a
"Declare 1M" checkbox that toggles the [1M] suffix.
- Strip the [1M] context-capability marker before forwarding non-Copilot
requests upstream. Copilot keeps its existing [1m]->-1m normalization.
- Claude Desktop import now consumes ANTHROPIC_DEFAULT_*_MODEL_NAME as
label_override, closing the Claude Code -> Claude Desktop displayName
pipeline; add_route's merge logic is shared between hashmap branches.
- Unify the [1M] marker as ONE_M_CONTEXT_MARKER across
claude_desktop_config and proxy::model_mapper; rename the strip
helper to strip_one_m_suffix_for_upstream.
- Collapse useModelState's seven duplicated useState initializers and
the useEffect parse block into a single parseModelsFromConfig call.
- Add tests/hooks/useModelState.test.tsx and a Claude Desktop import
test covering Kimi K2 -> label_override. i18n (en/ja/zh) updated.
Adapt to Claude Desktop 1.6259.1+ fail-all validation which only
accepts claude-(sonnet|opus|haiku)-* route IDs. Branded model names
(DeepSeek, Kimi, GLM, etc.) now live in a new labelOverride field
instead of being embedded in route IDs.
- Backend auto-repairs legacy unsafe routes to the next free
sonnet/opus/haiku slot instead of erroring
- Frontend swaps the free-form route input for a role dropdown plus
menu display name field
- Add CLAUDE_DESKTOP_ROLE_ROUTE_IDS as the single source of truth
for role-to-route mapping; presets and form both consume it
- Drop the dead displayName alias on ClaudeDesktopModelRoute and the
ineffective /v1/models display_name injection (UI ignores it)
- Update i18n (en/ja/zh) and form focus test for the new fields
- Normalize OpenAI/Gemini input_tokens semantics in SQL via the new
fresh_input_sql helper (cache_read subtracted at query time, no data
migration). Recovers correct cache hit rates for Codex/Gemini.
- Add get_usage_summary_by_app endpoint for per-app split (single
UNION ALL + GROUP BY, avoids N+1).
- Replace UsageSummaryCards + AppBreakdownRail with a single
filter-driven UsageHero card; clicking a filter button now truly
changes the displayed numbers and the title accent color.
- Tighten KNOWN_APP_TYPES to the 3 app_types whose token data is
reliably collected (claude/codex/gemini); hide claude-desktop,
hermes, opencode, openclaw filter buttons and i18n keys.
- Flag cache_creation as N/A for OpenAI-style protocols (Codex,
Gemini); show a "partial" tooltip when the All view mixes both
protocol families.
- Derive route keys from the upstream model name (pass-through style)
instead of fixed Claude aliases, and translate the legacy [1M] suffix
into the supports1m field at the import boundary. Three Claude aliases
mapped to the same upstream now collapse to a single route (e.g.
MiniMax-M2 across SONNET/OPUS/HAIKU env produces one
claude-MiniMax-M2 -> MiniMax-M2 row), with [1M] OR-aggregated.
- Add an import-time safety net that rebuilds claude-desktop-official
when missing, so users who deleted it can recover via the normal
import button without losing customizations on other providers.
- Hide API key and endpoint URL inputs in the official provider edit
form to mirror Claude Code's behavior and prevent user confusion.
- Reword the empty-state import button label for clarity.
inferenceModels entries now emit {name, supports1m: true} objects when
1M is enabled (plain strings otherwise), instead of appending a " [1M]"
suffix to model IDs. Route IDs and upstream model IDs are stored
verbatim; the suffix is rejected on input rather than silently stripped,
and proxy request mapping now requires an exact route_id match.
The Monitor glyph's visual weight skews upward (screen rect dominates
while the stand is two thin lines), making it appear off-center inside
the 11px Claude Desktop badge. Add a per-badge offsetY config and
apply translateY(0.5px) to compensate.