cc-switch

feat(providers): add preset search and sorting (#3975)

Peng Steam · 2026-06-11 17:14:53 +08:00

c1aa6c3917

feat(presets): add Unity2.ai partner provider across seven apps

Add Unity2.ai, a high-performance AI API relay partner, as a preset for
Claude, Codex, Gemini, OpenCode, OpenClaw, Claude Desktop, and Hermes.
Each preset carries the referral signup link as apiKeyUrl.

- Register the unity2 icon via iconUrls (PNG URL import) + metadata
- Add partnerPromotion copy in zh/en/ja/zh-TW; backfill the missing
  zh-TW ccsub entry
- List Unity2.ai in the sponsor section of all README locales
- Codex uses the bare base URL (gateway exposes /responses at root);
  OpenCode/OpenClaw/Hermes use the /v1 chat-completions endpoint with
  gpt-5.5 as the only preset model
- Trim CCSub OpenCode/OpenClaw/Hermes model lists to gpt-5.5 to match
- Normalize unity2/ccsub banners to the standard 2.41 aspect ratio

Jason · 2026-06-11 11:05:32 +08:00

daa5595f36

chore: ignore AGENTS.md alongside CLAUDE.md

Suggested in #4015.

Jason · 2026-06-11 08:43:33 +08:00

819c2e5dfe

fix(proxy): aggregate mislabeled SSE bodies in transform fallback (#2234)

The Claude/Codex format-transform non-stream branch returned an opaque 422
"Failed to parse upstream response" whenever a 2xx upstream body was not
valid JSON. The common case: MaaS gateways force-stream a stream:false
request and return an SSE body with a non-SSE Content-Type, defeating the
header-only is_sse() check.

On serde failure, sniff for SSE and aggregate the chunks into a single
JSON, then run the existing converter so clients still receive a valid
non-stream response.

- chat_sse_to_response_value: aggregate chat.completion.chunk SSE
  (content / reasoning / refusal / tool_calls / legacy function_call),
  tool_calls index-keyed via BTreeMap to avoid unbounded densification,
  first-wins finish_reason, message-snapshot override, completeness and
  error-event guards; synthesize an id when the upstream omits one
- responses_sse_to_response_value: process the residual trailing block,
  tolerating truncation and skipping it once a completed event was seen
- enrich remaining parse failures with content-type / content-encoding /
  body-snippet diagnostics
- deflate: try zlib (RFC 9110) before raw; keep the content-encoding
  header for unsupported encodings
- gate zero-usage rows on the Claude transform path
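
The sniffing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `looks_like_sse` and `sse_data_payloads` are hypothetical names, and the sketch stops at extracting the `data:` payloads (the real aggregator then merges the chunk JSON, which is omitted here).

```rust
/// Heuristic body sniff: treat the body as SSE when its first non-empty line
/// is an SSE field or comment. Hypothetical helper; the real check may differ.
fn looks_like_sse(body: &str) -> bool {
    body.lines()
        .map(str::trim_start)
        .find(|line| !line.is_empty())
        .map(|line| {
            line.starts_with("data:") || line.starts_with("event:") || line.starts_with(':')
        })
        .unwrap_or(false)
}

/// Collect the payload of every `data:` line, stopping at the `[DONE]` sentinel.
/// Each payload would then be parsed as one chat.completion.chunk.
fn sse_data_payloads(body: &str) -> Vec<&str> {
    let mut payloads = Vec::new();
    for line in body.lines() {
        if let Some(rest) = line.trim_start().strip_prefix("data:") {
            let payload = rest.trim();
            if payload == "[DONE]" {
                break;
            }
            if !payload.is_empty() {
                payloads.push(payload);
            }
        }
    }
    payloads
}
```

Sniffing the body rather than the header is what lets the fallback work when the gateway sends an SSE body under a JSON Content-Type.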

Jason · 2026-06-11 08:29:29 +08:00

a6d718d0fc

feat(provider-form): consolidate codex form into advanced options section

- Fold local routing toggle, model mapping, reasoning overrides and custom
  User-Agent into a single collapsible advanced section, mirroring the
  Claude form (auto-expands when UA is set or local routing is enabled)
- Custom User-Agent becomes configurable for native Responses providers;
  it was previously reachable only when openai_chat routing was on
- Collapsed hint names local routing as the entry point for Chat
  Completions / non-GPT providers
- Backfill all missing codexConfig keys in zh-TW locale

Jason · 2026-06-11 08:29:29 +08:00

e776160912

feat(provider-form): custom User-Agent presets dropdown in advanced settings

Polish the provider-level User-Agent override UI on the Claude and Codex forms.

- Add a shared CustomUserAgentField (label + input + preset dropdown + live
  validation) so both forms stay in sync.
- Provide curated UA presets (Claude Code / Kilo Code families that pass
  coding-plan UA whitelists per #3671); the first is Claude Code's real
  `claude-cli/x (external, cli)` format. Whitelists gate on the name prefix,
  not the version, so static values stay valid across upgrades.
- Expose presets via a dropdown to the right of the input (z-[200] so it
  renders above the dialog layers) instead of inline chips.
- Move the field into the existing advanced/reasoning collapsibles.
- userAgent.ts mirrors the backend byte rule (reject only control chars;
  non-ASCII is allowed) for a non-blocking inline hint.
- i18n for all four locales (zh/en/ja/zh-TW).

Jason · 2026-06-11 08:29:29 +08:00

596019505f

feat(proxy): honor custom User-Agent across stream check and model fetch

Extract a shared `parse_custom_user_agent` helper in provider.rs returning
`Result<Option<HeaderValue>>`, and reuse it in the forwarder, stream check,
and model fetch paths so detection, forwarding, and model listing all apply
the same provider-level User-Agent. Previously only the forwarder honored it,
so stream check could fail (or model listing 403) on UA-gated upstreams that
the proxy itself handled fine.

- stream_check injects the provider's custom UA on the claude/codex paths and
  still skips the GitHub Copilot fingerprint UA.
- model_fetch service + command and the model-fetch.ts wrapper thread an
  optional UA through to GET /v1/models.
- runtime callers silently ignore invalid values via `.ok().flatten()`
  (no save-time block, so deeplink imports stay lenient).
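
A rough sketch of the lenient runtime pattern described above. To stay dependency-free it uses `String` where the commit uses `HeaderValue`, and the error type and `runtime_user_agent` wrapper are assumptions made for illustration. The validation rule (reject only control characters) follows the rule stated for userAgent.ts.

```rust
/// Stand-in for the shared helper. `String` replaces `HeaderValue` so the
/// sketch needs no external crate. Blank input means "no override".
fn parse_custom_user_agent(raw: Option<&str>) -> Result<Option<String>, String> {
    let Some(value) = raw.map(str::trim).filter(|v| !v.is_empty()) else {
        return Ok(None);
    };
    // Reject only control characters; non-ASCII is allowed.
    if value.chars().any(char::is_control) {
        return Err(format!("invalid User-Agent: {value:?}"));
    }
    Ok(Some(value.to_string()))
}

/// Hypothetical runtime caller: an invalid value is dropped instead of
/// failing the request, which is what `.ok().flatten()` achieves.
fn runtime_user_agent(raw: Option<&str>) -> Option<String> {
    parse_custom_user_agent(raw).ok().flatten()
}
```

`.ok()` turns the `Result<Option<_>, _>` into `Option<Option<_>>`, and `.flatten()` collapses both "invalid" and "not set" into `None`.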

Jason · 2026-06-11 08:29:29 +08:00

8b925c2f2f

fix: omit customUserAgent when provider category is official

Stale custom UA values from non-official presets were persisted even
after switching to an official preset, silently altering request headers.

RoromoriYuzu · 2026-06-11 08:29:29 +08:00

25983f3420

feat: add provider user agent override

RoromoriYuzu · 2026-06-11 08:29:29 +08:00

ff706e9e96

feat(usage): claude-desktop filter and pricing-model audit display

- add claude-desktop to AppType/KNOWN_APP_TYPES and the dashboard app
  filter; it was hidden because its rows looked like pure failure
  noise, which was the app_type attribution bug fixed on the backend
- request detail panel now shows the requested model and the pricing
  model when they differ from the response model, making route-takeover
  bills auditable from the UI
- locale keys added for zh/en/ja/zh-TW

Jason · 2026-06-11 08:29:29 +08:00

e8b07cb2a5

fix(proxy): bill route-takeover traffic by the real upstream model

The model mapped for takeover (env mapping, Claude Desktop routes,
Copilot normalization, Codex chat override) was discarded inside the
forwarder, so usage attribution depended entirely on the upstream
echoing it back. When the upstream omitted the model or mirrored the
client alias, kimi/glm tokens were recorded and priced as claude-*
(roughly 5-25x overstatement).

- capture the final outbound model in forward(), return it via
  ForwardResult, and store it on the request context
- attribution fallback order is now: upstream echo (empty string
  treated as missing) -> outbound model -> client-requested model
- 'request' pricing mode anchors to the outbound model instead of the
  pre-mapping client alias; unchanged when no mapping applies
- persist the resolved pricing_model on every usage row
- Claude Desktop rows now log app_type "claude-desktop" on streaming
  and transform paths too (was hardcoded "claude", silently dropping
  desktop provider pricing overrides and splitting the cost basis by
  the stream flag); its global pricing defaults inherit the claude
  config since proxy_config only allows claude/codex/gemini rows
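
The attribution fallback order listed above can be sketched as a small pure function. The function and parameter names are hypothetical; only the order (upstream echo, then outbound model, then client-requested model, with an empty echo treated as missing) comes from the commit message.

```rust
/// Treat an empty string the same as a missing value.
fn non_empty(model: Option<&str>) -> Option<&str> {
    model.filter(|m| !m.is_empty())
}

/// Pick the model used for usage attribution:
/// upstream echo -> outbound (post-mapping) model -> client-requested model.
fn resolve_billing_model<'a>(
    upstream_echo: Option<&'a str>,
    outbound: Option<&'a str>,
    requested: &'a str,
) -> &'a str {
    non_empty(upstream_echo)
        .or(non_empty(outbound))
        .unwrap_or(requested)
}
```

With this order, an upstream that mirrors nothing back no longer causes the row to be priced under the client alias.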

Jason · 2026-06-11 08:29:29 +08:00

feea81e5bb

feat(usage): persist pricing basis and takeover dimensions in storage (schema v11)

- proxy_request_logs: add pricing_model column recording the basis actually
  used at write time (NULL = pre-v11 rows, '' = unpriced error rows)
- cost backfill recomputes strictly by the persisted basis; the
  request_model fallback now only applies to placeholder models, so
  real-but-unpriced takeover rows stay at zero cost until pricing is
  added instead of being permanently frozen at the alias's price
- backfill_missing_usage_costs_for_model can locate rows by pricing_model
- usage_daily_rollups: rebuild with request_model + pricing_model in the
  primary key so the alias-to-real-model mapping and the pricing basis
  survive the 30-day prune; legacy rows migrate with ''
- rollup_and_prune backfills costs before pruning: prune is irreversible
  and used to run before the startup backfill, permanently booking
  then-unpriced rows as zero
- get_model_stats groups by the effective pricing model
  (COALESCE(NULLIF(pricing_model,''), model)) so costs aggregate under
  the model whose prices produced them; response-mode behavior unchanged
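
For readers less used to the SQL idiom, `COALESCE(NULLIF(pricing_model,''), model)` behaves like the following sketch. The Rust function is purely illustrative (hypothetical name); the grouping itself happens in SQL.

```rust
/// Rust equivalent of COALESCE(NULLIF(pricing_model, ''), model):
/// NULL (pre-v11 rows) and '' (unpriced rows) both fall back to `model`.
fn effective_pricing_model<'a>(pricing_model: Option<&'a str>, model: &'a str) -> &'a str {
    match pricing_model {
        Some(basis) if !basis.is_empty() => basis,
        _ => model,
    }
}
```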

Jason · 2026-06-11 08:29:29 +08:00

4282856683

fix(coding-plan): classify Zhipu quota windows by unit field instead of reset-time order (#3036)

The Zhipu quota API returns two TOKENS_LIMIT entries whose identity was
inferred by sorting nextResetTime ascending (nearest = five_hour). In the
last hours of each weekly cycle the weekly window resets sooner than the
current 5-hour session window, so the two buckets were swapped exactly
when users check their weekly quota most.

Classify by the explicit unit field instead (3 = hour window -> five_hour,
6 = week window -> weekly_limit; same shape on bigmodel.cn and api.z.ai).
The weekly window has been observed with number values of both 7 and 1, so
only unit is matched. Fall back to the old reset-time heuristic when the
field is missing.
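
A minimal sketch of the classification rule, assuming a simplified entry type. `TokensLimit`, `QuotaWindow`, and `classify_windows` are hypothetical names; the unit codes (3 and 6) and the fallback to reset-time order come from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuotaWindow {
    FiveHour,
    Weekly,
}

/// Simplified stand-in for one TOKENS_LIMIT entry of the quota response.
struct TokensLimit {
    unit: Option<u8>,
    next_reset_time: u64,
}

fn classify_windows(entries: &[TokensLimit]) -> Vec<QuotaWindow> {
    // Preferred path: every entry carries a recognized unit code.
    let by_unit: Option<Vec<QuotaWindow>> = entries
        .iter()
        .map(|entry| match entry.unit {
            Some(3) => Some(QuotaWindow::FiveHour),
            Some(6) => Some(QuotaWindow::Weekly),
            _ => None,
        })
        .collect();
    if let Some(windows) = by_unit {
        return windows;
    }
    // Fallback (old heuristic): the nearest reset is taken as the 5-hour window.
    let nearest = entries.iter().map(|entry| entry.next_reset_time).min();
    entries
        .iter()
        .map(|entry| {
            if Some(entry.next_reset_time) == nearest {
                QuotaWindow::FiveHour
            } else {
                QuotaWindow::Weekly
            }
        })
        .collect()
}
```

The end-of-week case is the one the heuristic gets wrong: the weekly window resets first, yet the unit field still identifies it correctly.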

Jason · 2026-06-11 08:29:29 +08:00

65d6929993

feat(usage): refresh model pricing seed — add Fable 5 + 8 models, fix 28 prices

Full audit of seed_model_pricing against current official vendor pricing.

New models: claude-fable-5 (10/50), grok-4.3, step-3.7-flash,
mistral-medium-3.5, mistral-small-4, devstral-small-2-2512, magistral-small,
qwen3.7-max, qwen3.7-plus.

Price fixes (Chinese vendors standardized on official list price, CNY/~7.14):
- GLM 4.6/4.7 -> Z.ai official 0.6/2.2/0.11 (were reseller/OpenRouter rates)
- Grok 4.20 reasoning/non-reasoning -> 1.25/2.50 (xAI price cut)
- MiMo v2.5 / v2.5-pro / v2-pro -> post-2026-05-27 rates + cache
- Doubao Seed 2.0 lite corrected + cache-hit prices across the family
- Kimi k2.5 output 3.00, MiniMax m2.5 input 0.15, Mistral devstral-2 output 2
- Qwen 3.5/3.6-plus + coder-plus/flash cache_read (official 20%-of-input rule)

Each fix updates the seed value (fresh installs) and adds an old->new guard to
repair_current_model_pricing (existing DBs; won't clobber user-edited rows).

Jason · 2026-06-11 08:29:29 +08:00

1ca01bcd10

feat(usage): app-aware hero icon and neutral Codex theme

- Replace the fixed Zap glyph in the usage hero with the selected app's
  brand icon via a new AppGlyph component, reusing APP_ICON_MAP
  (cloneElement scales 14px -> 20px); falls back to Zap for the "all" view.
- Recolor the Codex title theme from emerald to neutral gray to match
  OpenAI's monochrome branding. neutral-500/10 stays visible in both
  light and dark modes, unlike a flat black tint.

Jason · 2026-06-11 08:29:29 +08:00

bc01f44514

fix(proxy): extend image rectifier to Codex /responses text-only path

Codex /responses requests routed to text-only OpenAI-chat upstreams
(e.g. DeepSeek deepseek-v4-flash) failed with HTTP 400 "unknown variant
image_url" when images were sent: the responses->chat conversion turns
input_image items into image_url blocks the model rejects. The media
rectifier previously covered only the Claude adapter, so neither the
proactive strip nor the reactive retry fired for Codex.

- media_retry_should_trigger: accept "Codex" adapter, not just "Claude"
- contains_image_blocks / replace_images: also scan responses `input`
  (input_image) in addition to chat `messages`
- is_image_block_type: match image | image_url | input_image
- is_unsupported_image_error: add "unknown variant" hint for the
  deserialize error
- forward(): proactively run apply_media_prevention for Codex after the
  responses->chat conversion

Proactively strips images for known text-only models (heuristic on by
default) and reactively retries with images replaced on upstream
image-unsupported errors. Adds tests for chat image_url, codex
input_image, the reactive trigger, and the deserialize error match.
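
The widened predicates can be sketched roughly as below. The function bodies are assumptions: the commit confirms the three block types, the "Codex" adapter, and the "unknown variant" hint, but not the exact matching logic, and `adapter_supports_media_retry` is a hypothetical reduction of media_retry_should_trigger.

```rust
/// Content-part types that count as images across chat and responses payloads.
fn is_image_block_type(block_type: &str) -> bool {
    matches!(block_type, "image" | "image_url" | "input_image")
}

/// Assumed shape of the adapter gate: the commit only states that "Codex"
/// is now accepted alongside "Claude".
fn adapter_supports_media_retry(adapter: &str) -> bool {
    matches!(adapter, "Claude" | "Codex")
}

/// Assumption: only the "unknown variant" hint is confirmed by the commit;
/// requiring an image-related word next to it is illustrative.
fn is_unsupported_image_error(body: &str) -> bool {
    let lower = body.to_ascii_lowercase();
    lower.contains("unknown variant") && lower.contains("image")
}
```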

Jason · 2026-06-11 08:29:29 +08:00

3390fe7ea0

fix(proxy): exclude cache_read and cache_creation from input on Claude←OpenAI paths

Builds on #2774 (which fixed cache_read for the streaming openai_chat path).
Two gaps remained, both double-counting cache tokens when a Claude client
meters as app_type="claude" (input_includes_cache_read=false):

1. cache_read was still added to input on the non-streaming openai_chat path
   (transform.rs openai_to_anthropic) and the whole openai_responses family
   (transform_responses.rs build_anthropic_usage_from_responses, covering the
   non-streaming call site and both streaming_responses call sites).

2. cache_creation was never subtracted on any converted path, including the
   streaming openai_chat path #2774 had already touched. Claude billing treats
   cache_creation as a separate bucket, so an inclusive upstream carrying a
   direct cache_creation_input_tokens field billed it twice.

All four metering points now compute:
  input = prompt_tokens - cache_read - cache_creation
restoring the invariant input + cache_read + cache_creation == prompt_tokens.
Pure OpenAI upstreams are unaffected (no cache_creation concept/field).

Tests: update direct-cache assertions (40->20), add a streaming conservation
regression test, and pin prompt<cache underflow (saturating clamp to 0) for all
three metering functions. cargo test 1573 pass, clippy clean.
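
The metering formula and its underflow clamp can be written as a one-liner. This is a sketch with a hypothetical function name; the arithmetic and the saturating clamp to 0 are as described above.

```rust
/// input = prompt_tokens - cache_read - cache_creation, clamped at 0 when a
/// non-compliant upstream reports more cache tokens than prompt tokens.
fn fresh_input_tokens(prompt_tokens: u64, cache_read: u64, cache_creation: u64) -> u64 {
    prompt_tokens
        .saturating_sub(cache_read)
        .saturating_sub(cache_creation)
}
```

When the upstream is consistent, `fresh_input_tokens(p, r, c) + r + c == p` holds, which is the invariant the commit restores.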

Note: fix is forward-only; historical rows are not recomputed (cost is frozen at
log time and app_type="claude" mixes native + converted rows).

Jason · 2026-06-11 08:29:29 +08:00

cb01593f7d

fix(proxy): correct usage accounting on format-conversion paths

Audited all proxy format-conversion paths (Chat<->Message, Chat<->Response,
Gemini<->Message) for usage/cache metering. Five issues found and fixed.
The dedup mechanism (request_id PK, proxy/session source isolation) is
untouched, so no double-counting is introduced.

- A (Claude + openai_chat, streaming): inject stream_options.include_usage
  so OpenAI-compatible upstreams emit usage in the SSE tail. Without it the
  converted Anthropic message_delta was all-zero and the whole request's
  input/output/cache was dropped. Same root cause as the already-fixed
  Codex Chat path; the injection is extracted into a shared helper
  (transform::inject_openai_stream_include_usage) reused by both paths.

- C (Claude + gemini_native): subtract cachedContentTokenCount from
  input_tokens in build_anthropic_usage so input becomes fresh input
  (Anthropic semantics). Previously the cache-hit tokens were billed twice
  because this path meters as app_type="claude" (input_includes_cache_read
  = false) while Gemini's promptTokenCount includes the cache.

- D (Codex + openai_chat, streaming): gate log_usage on
  has_billable_tokens() to skip the synthetic all-zero usage the converter
  emits when a non-compliant upstream omits usage, preventing empty-row
  request-count inflation.

- P2 (from_claude_stream_events): use has_billable_tokens() for the return
  gate instead of input>0||output>0, so a fully-cached streamed request
  (cache_read>0, input==output==0) is still recorded. Affects all
  Claude-streaming paths, not just Gemini.

- P3 (Codex Chat->Responses, non-streaming): apply the same
  has_billable_tokens() filter the streaming branch got, since the
  synthesized all-zero usage makes from_codex_response return Some and
  bypass the `if let Some` guard.

Add TokenUsage::has_billable_tokens() as the unified predicate. New tests
cover include_usage injection, gemini input subtraction, the gate itself,
cache-only stream recording, and synthetic all-zero codex usage.
Full lib suite: 1569 passed.
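
A sketch of the unified predicate and how a logging gate might use it. The field names and the `usage_to_log` helper are assumptions; the commit names only TokenUsage::has_billable_tokens() and the cases it must cover (cache-only usage is billable, synthetic all-zero usage is not).

```rust
/// Field names are illustrative; the real struct may differ.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
    cache_read_tokens: u64,
    cache_creation_tokens: u64,
}

impl TokenUsage {
    /// True when any bucket is non-zero, so a fully cached request still counts.
    fn has_billable_tokens(&self) -> bool {
        self.input_tokens > 0
            || self.output_tokens > 0
            || self.cache_read_tokens > 0
            || self.cache_creation_tokens > 0
    }
}

/// Hypothetical gate: drop synthetic all-zero usage before logging a row.
fn usage_to_log(usage: Option<TokenUsage>) -> Option<TokenUsage> {
    usage.filter(TokenUsage::has_billable_tokens)
}
```

Compared with the older `input>0||output>0` check, the only behavioral difference is that cache-only rows pass.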

Jason · 2026-06-11 08:29:29 +08:00

36a103bbe4

fix(usage): import billable session messages without stop_reason

The local session-log scanner dropped any assistant message that lacked
a stop_reason or had output_tokens==0. Claude Code Workflow / sub-agent
fan-out frequently produces messages that only wrote a message_start
snapshot (output=1, stop_reason=None) without a final block, yet their
input + cache_read + cache_creation tokens are already billed by
Anthropic (charged once the request is accepted). Dropping them
under-counted usage by ~4.1% overall, 92% concentrated in
workflow/subagent transcripts.

Replace the stop_reason/output gate with a billable-token check (any of
input/output/cache_read/cache_creation > 0). The per-message-id dedup
selection is unchanged, and request_id = "session:"+msg_id PRIMARY KEY
with INSERT OR IGNORE keeps each message single-inserted, so relaxing
the gate cannot double-count. Add a regression test covering a
stop_reason-less message with real cache cost plus an all-zero skip.

This is the parser-layer half of the Workflow under-counting fixed at
the collector layer in 8d332925.

Jason · 2026-06-11 08:29:29 +08:00

05bc14e82b

fix(usage): count Claude Code Workflow sub-agent token usage

collect_jsonl_files only walked <project>/<session>/subagents/*.jsonl,
so it missed Workflow sub-agent transcripts which live one level deeper
at subagents/workflows/wf_*/agent-*.jsonl. As a result all Workflow
token usage was invisible to the no-proxy session-log accounting.

Descend into subagents/workflows/wf_*/ as well, via a new
push_jsonl_children helper that keeps the fixed-depth, no-recursion
design. journal.jsonl carries no assistant rows so it is skipped at
parse time and needs no filename special-casing. Existing dedup
(request_id PK + INSERT OR IGNORE + should_skip_session_insert) keeps
the next sync's backfill idempotent.
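
The fixed-depth walk can be sketched as follows. Only the directory layout and the helper name push_jsonl_children come from the commit message; `collect_subagent_files` and the exact signatures are assumptions for illustration.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Push every `*.jsonl` file directly inside `dir` (one level, no recursion).
fn push_jsonl_children(dir: &Path, out: &mut Vec<PathBuf>) {
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_file() && path.extension().is_some_and(|ext| ext == "jsonl") {
            out.push(path);
        }
    }
}

/// Collect subagents/*.jsonl plus subagents/workflows/wf_*/*.jsonl.
fn collect_subagent_files(session_dir: &Path) -> Vec<PathBuf> {
    let subagents = session_dir.join("subagents");
    let mut files = Vec::new();
    push_jsonl_children(&subagents, &mut files);
    if let Ok(entries) = fs::read_dir(subagents.join("workflows")) {
        for entry in entries.flatten() {
            let path = entry.path();
            let is_workflow_dir = path
                .file_name()
                .and_then(|name| name.to_str())
                .is_some_and(|name| name.starts_with("wf_"));
            if path.is_dir() && is_workflow_dir {
                push_jsonl_children(&path, &mut files);
            }
        }
    }
    files.sort();
    files
}
```

Each level is enumerated explicitly, so the walk never descends further than the known layout.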

Add test_collect_jsonl_files_includes_workflow_subagents.

Jason · 2026-06-11 08:29:29 +08:00

0396cd5491

docs(release): restore contributor mentions in release notes

Jason · 2026-06-11 08:29:29 +08:00

f97347fe6e

fix: usage script provider credential resolution (#1479)

The JS-script usage path resolved {{apiKey}}/{{baseUrl}} with env-only
field guessing, so apps that store credentials elsewhere (Codex:
auth.OPENAI_API_KEY + config.toml base_url) always got empty values and
custom-template queries failed despite a fully configured provider.

- query_usage / test_usage_script now delegate to
  Provider::resolve_usage_credentials, the same per-app resolver used by
  the native balance/coding-plan path and mirrored by the frontend
  getProviderCredentials; explicit non-empty script values still win
- test_usage_script loads the provider and applies the same fallback,
  so testing matches what a saved script does
- the custom-template variable preview shows the effective values
  (script overrides first, then provider config) instead of always
  showing provider credentials
- extract_codex_base_url documents and test-locks the frontend-mirror
  invariant: non-active [model_providers.*] sections are never read

Reworked from the original patch to reuse the existing resolver instead
of duplicating per-app extraction.

Co-authored-by: Jason <farion1231@gmail.com>

pa001024 · 2026-06-10 22:57:27 +08:00

9ea303b224

fix: prevent duplicate YAML keys in Hermes config (#3267)

* fix: prevent duplicate YAML keys in Hermes config

Three changes in hermes_config.rs:
1. deduplicate_top_level_keys() - scan and remove duplicate top-level
   keys before YAML parsing, preventing "duplicate entry" parse errors
2. remove_all_sections() - helper to strip all occurrences of a given
   top-level key from raw YAML text
3. replace_yaml_section() now calls remove_all_sections() on the
   remainder after replacing the primary occurrence, preventing
   duplicate sections from accumulating on repeated writes

Fixes the issue where mcp_servers (or any top-level key) gets
duplicated in config.yaml, causing "Failed to parse Hermes config
as YAML: duplicate entry with key" errors.

Co-Authored-By: que3sui <204201112+que3sui@users.noreply.github.com>

* fix: handle CRLF and LF line endings in top-level key deduplication

is_top_level_key_line only accepted empty, space, or tab after the colon,
but deduplicate_top_level_keys uses split_inclusive('\n'), so lines end
with \n (LF) or \r\n (CRLF). Without accepting \r and \n as valid
post-colon characters, the dedup safety net never activates.

Add \r and \n checks to is_top_level_key_line, and three tests covering
LF, CRLF, and first-occurrence preservation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(hermes): keep last occurrence when healing duplicate YAML keys

Reworks the healing layers on top of the CRLF root-cause fix:

- deduplicate_top_level_keys: keep the LAST occurrence of each duplicated
  key instead of the first. Duplicates come from section replacement
  degrading into appends (#3633), so the last block is the newest data --
  and Hermes itself reads the config with PyYAML, whose duplicate-key
  semantics are last-wins. Keeping the first occurrence would silently
  roll users back to stale config and diverge from what Hermes runs with.
  Healthy files take a fast path and are returned untouched.
- Drop the unused dup_key variable (fails cargo clippy -- -D warnings,
  which CI enforces).
- replace_yaml_section: clean residual duplicate sections from the
  remainder via remove_all_sections; values come from the keep-last
  healed read, so dropping all stale on-disk copies loses nothing.
- Add regression tests for the actual root cause (find/replace on CRLF
  input must replace in place, not append), keep-last semantics,
  identity on healthy files, end-to-end heal-then-parse, and duplicate
  cleanup on write.
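
A simplified sketch of keep-last healing on raw YAML text. It is not the code in hermes_config.rs: `top_level_key` and `deduplicate_keep_last` are hypothetical names, the key detection is deliberately naive (no quoted keys, no flow syntax), and the surviving block stays at the position of its last occurrence.

```rust
use std::collections::HashMap;

/// Returns the key when `line` starts a top-level mapping entry (`key:` at column 0).
fn top_level_key(line: &str) -> Option<&str> {
    let first = line.chars().next()?;
    if first.is_whitespace() || first == '#' || first == '-' {
        return None;
    }
    let (key, rest) = line.split_once(':')?;
    // After the colon accept end of input, space, tab, CR, or LF.
    match rest.chars().next() {
        None | Some(' ') | Some('\t') | Some('\r') | Some('\n') => Some(key),
        _ => None,
    }
}

/// Keep only the last block for each duplicated top-level key.
fn deduplicate_keep_last(yaml: &str) -> String {
    // A block is a top-level key line plus everything up to the next one.
    let mut blocks: Vec<(Option<&str>, String)> = Vec::new();
    for line in yaml.split_inclusive('\n') {
        match top_level_key(line) {
            Some(key) => blocks.push((Some(key), line.to_string())),
            None => match blocks.last_mut() {
                Some((_, text)) => text.push_str(line),
                None => blocks.push((None, line.to_string())),
            },
        }
    }
    // Remember the index of the last block seen for each key.
    let mut last: HashMap<&str, usize> = HashMap::new();
    for (index, (key, _)) in blocks.iter().enumerate() {
        if let Some(k) = *key {
            last.insert(k, index);
        }
    }
    blocks
        .iter()
        .enumerate()
        .filter(|(index, (key, _))| key.map_or(true, |k| last[k] == *index))
        .map(|(_, (_, text))| text.as_str())
        .collect()
}
```

Because lines are split with `split_inclusive('\n')`, the post-colon check must accept `\r` and `\n`, which is the CRLF root cause noted in the second commit.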

Fixes #3633 #2973 #2529 #3310 #3762

---------

Co-authored-by: que3sui <204201112+que3sui@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Jason <farion1231@gmail.com>

que3sui · 2026-06-10 21:42:54 +08:00

4f911727d2

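Fix missing actual response model and incorrect input token accounting in Completions-to-Anthropic conversion (#2774)

* fix(proxy): record the actual served model for streaming responses in the completions-to-claude conversion

* style: cargo fmt fix

* fix(proxy): stop double-billing input and cache_read in the completions-to-claude conversion

* fix(proxy): fix input_tokens miscalculation on a full cache hit

* test: update input_tokens expectations to match the deduplication logic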

LaoYueHanNi · 2026-06-09 20:30:05 +08:00

edc597ab23

fix(presets): add Kimi affiliate links (#3809)

Problem: Kimi and Moonshot preset links were user-clickable without the cc-switch affiliate query.

Decision: Update only UI-facing preset website/API-key links and leave API request endpoints untouched.

Change: Add aff=cc-switch to Kimi/Moonshot websiteUrl values and Codex/OpenCode API-key links.

Co-authored-by: xumingyuan <xumingyuan@msh.team>

oriengy · 2026-06-08 23:25:28 +08:00

v3.16.2 955ea26da9

refactor(presets): align CCSub to end of partner block across apps

Move the CCSub preset to sit right after DouBaoSeed, at the end of the
partner block and before the first non-partner provider, so its position
is consistent across all six apps:

- Codex / OpenCode: moved up from the 2nd slot (between Shengsuanyun and
  the next partner) to the block tail
- OpenClaw / Hermes: moved up from the aggregator section to the block tail
- Claude / Claude Desktop: already at the block tail

Also add the missing CHANGELOG entry for the CCSub preset, and drop the
provider preset order test that enforced a now-unneeded ordering invariant.

Jason · 2026-06-08 23:07:50 +08:00

5beb63e67d

feat(presets): add CCSub provider across six apps

Add CCSub, a multi-model aggregator partner, as a preset for Claude, Codex, OpenCode, OpenClaw, Claude Desktop, and Hermes. Each preset carries the referral signup link as apiKeyUrl.

- Register the ccsub icon via iconUrls (1.1MB SVG URL import) + metadata
- Add partnerPromotion copy in zh/en/ja
- List CCSub in the sponsor section of all README locales
- Use gpt-5.5 and gemini-3.1-pro as the OpenAI/Gemini model ids

Jason · 2026-06-08 22:04:16 +08:00

fa17194d84

chore(release): prepare v3.16.2

Add the v3.16.2 CHANGELOG entry covering the 41 commits since v3.16.1,
bump the version across package.json, tauri.conf.json, Cargo.toml, and
Cargo.lock, and add trilingual (zh/en/ja) release notes.

Jason · 2026-06-08 12:39:50 +08:00

f1118d370f

fix(proxy): strip cache_control from OpenAI format conversion (#3841)

* fix(proxy): strip cache_control from OpenAI format conversion (#3805)

- Remove cache_control passthrough from system messages, text blocks,
  and tools to prevent 400 errors on strict OpenAI-compatible endpoints
- Always simplify single text block content to plain string format
- Fixes two format conversion bugs reported in issue #3805

* fix(proxy): apply cargo fmt to fix CI formatting check

cc10143 · 2026-06-08 12:38:39 +08:00

4f5250fc4d

fix(providers): only block explicit official providers under proxy takeover

The proxy-takeover block previously fell back to the isOfficial heuristic
(empty base_url / missing key) when category was absent. That misjudged
custom providers whose endpoint lives in meta or whose fields are simply
unfilled: their switch button got disabled, making users think the config
was broken. That extra UI block was also "virtual" — the executor in
useProviderActions only ever honored category === "official", so the
front end blocked more than the backend would enforce.

Gate the block solely on explicit category === "official", matching the
executor and unifying both verdicts on a single source of truth.

Also rework the blocked-state UI:
- drop the red "blocked" badge for a plain disabled Enable button
- move title/cursor onto a wrapper span (disabled buttons set
  pointer-events:none, so an on-button title/cursor never fired)
- replace the account-ban warning tooltip with a lighter hint
  (provider.blockedByProxyHint), four locales kept in sync

Jason · 2026-06-07 20:56:35 +08:00

5c36ae066b

feat(proxy): map input_file and input_audio content parts to chat

Convert Responses input_file (requiring file_id or file_data, never file_url which Chat file parts do not support) and input_audio parts into their Chat Completions equivalents, and handle top-level input_* items that previously fell through and were dropped, clearing stale pending reasoning for non-assistant messages.

Jason · 2026-06-07 20:56:35 +08:00

f59fab6c24

fix(proxy): distinguish truncated chat streams from normal completion

Replace the unconditional finalize at chat-to-responses stream end with a three-way guard: complete normally when finish_reason or [DONE] arrived, emit an incomplete response when substantive output exists without a finish_reason, and emit a failed (stream_truncated) event for empty truncation instead of masking it as completed. Also propagate late-arriving reasoning_content onto still-active tool-call items.
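
The three-way guard can be sketched as a small decision function. The enum and parameter names are hypothetical; the three outcomes and the `stream_truncated` code come from the commit message.

```rust
#[derive(Debug, PartialEq, Eq)]
enum StreamEnd {
    /// finish_reason or [DONE] arrived: finalize normally.
    Completed,
    /// Output exists but no finish_reason: report an incomplete response.
    Incomplete,
    /// Nothing substantive arrived: surface the truncation as a failure.
    Failed(&'static str),
}

fn classify_stream_end(saw_finish_reason: bool, saw_done: bool, has_output: bool) -> StreamEnd {
    if saw_finish_reason || saw_done {
        StreamEnd::Completed
    } else if has_output {
        StreamEnd::Incomplete
    } else {
        StreamEnd::Failed("stream_truncated")
    }
}
```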

Jason · 2026-06-07 20:56:35 +08:00

6940a4b208

fix(proxy): cache reasoning across turns for custom_tool_call and tool_search_call

Generalize the cross-turn reasoning cache in codex chat history from function_call only to the full tool-call triad (function_call, custom_tool_call, tool_search_call) and their *_output counterparts, so apply_patch and tool-search calls keep their reasoning_content when restored via previous_response_id.

Jason · 2026-06-07 20:56:35 +08:00

ea6123adf7

chore(presets): update SSSAiCode domain and endpoint nodes

Switch website/apiKey URLs to sssaicodeapi.com and replace base URL
nodes with node-hk.sssaicodeapi.com (default), node-hk.sssaiapi.com,
and node-cf.sssaicodeapi.com across all 7 app presets.

Jason · 2026-06-07 20:56:35 +08:00

e96eab5278

fix(proxy): resolve actual port for ephemeral (port 0) listen config

When listen_port is 0 the OS assigns the port at bind time, so the
configured value can no longer be trusted for building takeover URLs.

- server: read listener.local_addr() after bind and propagate the
  actual port to the global proxy port, status, and ProxyServerInfo
- services: start the proxy before takeover when port is 0 so live
  configs get the real port instead of :0, and persist the resolved
  port back to the DB for DB-only URL paths; stop the pre-started
  server on any takeover failure
- claude_desktop: reject an unresolved :0 port instead of emitting a
  broken gateway URL
- build_proxy_urls: prefer the running server's port and error out if
  the port is still 0
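
A minimal sketch of resolving an ephemeral port, using the blocking `std::net::TcpListener` as a stand-in for whatever listener the proxy actually uses. `bind_and_resolve_port` and the simplified `build_proxy_url` are hypothetical.

```rust
use std::net::TcpListener;

/// Bind first, then ask the OS which port it actually assigned.
fn bind_and_resolve_port(host: &str, configured_port: u16) -> std::io::Result<(TcpListener, u16)> {
    let listener = TcpListener::bind((host, configured_port))?;
    let actual_port = listener.local_addr()?.port();
    Ok((listener, actual_port))
}

/// Refuse to emit a URL while the port is still unresolved.
fn build_proxy_url(host: &str, port: u16) -> Result<String, String> {
    if port == 0 {
        return Err("proxy port is still unresolved (0)".to_string());
    }
    Ok(format!("http://{host}:{port}"))
}
```

The listener is returned together with the port so the socket stays open; dropping it would release the port before the URL is used.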

Add tests for takeover with an ephemeral port and the claude_desktop
:0 rejection; switch existing codex takeover tests to an ephemeral
port for isolation.

Jason · 2026-06-07 20:56:35 +08:00

2985ad2c14

fix: normalize localhost listen address (#3016)

Alexlangl · 2026-06-07 20:40:14 +08:00

aa09c9cb62

feat(proxy): add GET /v1/models endpoint for Codex CLI reachability check (#3818)

* feat(proxy): add GET /v1/models endpoint for Codex CLI reachability check

Codex CLI probes GET /v1/models at startup. Without this endpoint the proxy
returns 404, causing Codex to fail before any request reaches the upstream
LLM.

Return an OpenAI-compatible model list derived from the cc-switch–managed
model catalog file.

Fixes #3812

* fix(proxy): return Codex catalog schema from /v1/models

Codex deserializes the response as a catalog with a top-level `models`
field, not the OpenAI `{"object":"list","data":[...]}` envelope.
Return the catalog file content directly so the format matches what
Codex expects.

Co-authored-by: Codex review bot

* fix(proxy): guard /v1/models against serving stale catalog

Only return the model catalog when config.toml still references it via
`model_catalog_json`.  After switching to a provider without a custom
catalog, the old file lingers on disk — serving it unconditionally
would advertise the previous provider's models to Codex.

Co-authored-by: Codex review bot

* fix(proxy): match relative model_catalog_json in stale-guard

cc-switch writes `model_catalog_json = "cc-switch-model-catalog.json"`
(relative) via set_codex_model_catalog_json_field.  Match on the
filename constant rather than the absolute path so the guard works
with both relative and absolute paths.

Co-authored-by: Codex review bot

* fix(proxy): parse model_catalog_json field instead of substring match

Replace raw config_text.contains() with proper TOML field parsing so
commented-out lines and stray mentions of the filename in other fields
don't defeat the stale guard.  Also switch from contains() to exact
filename match (Path::new(val).file_name() == Some(...)) to stay
consistent with resolve_cc_switch_catalog_path in codex_config.rs.

Add log::debug! when the guard blocks serving so the operator can
distinguish "no models configured" from "guard blocked stale catalog".
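
The difference between a substring check and the exact filename match can be illustrated like this. The constant and function names are hypothetical; the filename and the `Path::new(val).file_name()` comparison come from the commit message, and the value is assumed to have already been read from the parsed TOML field.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical constant name; the filename is the one cc-switch writes.
const CATALOG_FILE_NAME: &str = "cc-switch-model-catalog.json";

/// True when the `model_catalog_json` value points at the managed catalog,
/// whether the path is relative or absolute.
fn references_managed_catalog(value: &str) -> bool {
    Path::new(value).file_name() == Some(OsStr::new(CATALOG_FILE_NAME))
}
```

A `contains()` check would also accept a name such as `other-cc-switch-model-catalog.json`, which the exact match rejects.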

* refactor(proxy): reuse resolve_cc_switch_catalog_path in handle_models

Replace the inline config.toml parsing and filename match in
handle_models with the existing resolve_cc_switch_catalog_path helper
(now pub(crate)). This removes the duplicated stale-guard logic, keeps
a single source of truth for catalog-path ownership, and makes the
handler honor absolute model_catalog_json paths the same way Codex
live-setting import does.

---------

Co-authored-by: Jason <farion1231@gmail.com>

CSberlin · 2026-06-07 20:26:44 +08:00

27c41f7416

[codex] Fix VS Code session previews (#3593)

* Fix Codex VS Code session previews

* fix(codex): use last IDE request heading for session previews

A markdown heading inside the active selection / open file could precede the real injected request, so matching the first "## My request for Codex:" heading picked selection content instead of the user prompt. Scan for the last matching heading (the IDE injects the real request as the final section) on both the Rust title path and the frontend TOC preview path.

Add regression tests for the selection-heading case, and pin the known best-effort limitation when the request body itself repeats the heading.
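
Scanning for the last heading instead of the first can be sketched with `str::rfind`. The function name and the trimming are assumptions; the heading text is the one quoted above.

```rust
const REQUEST_HEADING: &str = "## My request for Codex:";

/// Return the text after the LAST request heading, or None when the message
/// has no such heading (or nothing follows it).
fn extract_ide_request(message: &str) -> Option<&str> {
    let start = message.rfind(REQUEST_HEADING)? + REQUEST_HEADING.len();
    let body = message[start..].trim();
    (!body.is_empty()).then_some(body)
}
```

As the commit notes, this stays best-effort: a request body that itself repeats the heading would still be cut at the final occurrence.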

---------

Co-authored-by: Jason <farion1231@gmail.com>

ayxwi · 2026-06-07 19:23:24 +08:00

6716a4c408

fix: normalize path separators in scan_dir_recursive for Windows (#3430)

On Windows, Path::strip_prefix produces backslash-separated relative
paths. The update-check matching logic uses rsplit('/') to extract the
install name, so subdirectory skills (e.g. skills/my-skill) never
matched and updates were silently skipped. Replace backslashes with
forward slashes when building the directory string.
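
A small sketch of why the backslash form never matched. Both helper names are hypothetical; the `rsplit('/')` extraction and the backslash replacement are as described above.

```rust
/// Normalize a relative directory string to forward slashes.
fn normalize_separators(relative: &str) -> String {
    relative.replace('\\', "/")
}

/// Extract the install name the way the update check does: last `/` segment.
fn install_name(directory: &str) -> &str {
    directory.rsplit('/').next().unwrap_or(directory)
}
```

Without normalization, `install_name("skills\\my-skill")` returns the whole string, so it can never equal `my-skill`.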

c9 · 2026-06-07 17:57:37 +08:00

2626eeebe6

fix: remove leftover tray icon after exit on Windows (#3797 )

阿珏 · 2026-06-06 22:27:39 +08:00

ab6266f745

docs(readme): fix release note links and sponsor markup (#3772 )

lucas · 2026-06-05 22:58:07 +08:00

1392ef6238

fix(proxy): normalize Anthropic system messages (#3775 )

Dearli666 · 2026-06-05 22:50:49 +08:00

3cd9a0dec5

fix(usage): correct inflated input_tokens in Claude stream parsing

Some Anthropic-compatible SSE providers (e.g. qwen, minimax) report the
full context (fresh + cached) as input_tokens in message_start, double
counting the cached portion that is also reported in
cache_read_input_tokens. This inflated the cacheable-input denominator
and pushed the displayed cache hit rate artificially low.

When a message_delta carries a smaller positive input_tokens, prefer it
over the message_start value and adopt the cache counts from the same
usage block to avoid double counting; fall back to the start cache
values when the delta omits them. Native Claude (no input in delta) and
OpenRouter-converted (input only in delta) paths are unchanged.
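
The merge rule can be sketched as follows. The `Usage` struct and `merge_usage` function are hypothetical stand-ins for the real stream parser, and the handling of the "input only in delta" case is an assumption about existing behavior that the message only describes as unchanged.

```rust
/// Hypothetical, simplified view of a usage block from message_start or message_delta.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct Usage {
    input_tokens: Option<u64>,
    cache_read_input_tokens: Option<u64>,
    cache_creation_input_tokens: Option<u64>,
}

/// Combine the usage reported in message_start with the one in message_delta.
fn merge_usage(start: Usage, delta: Usage) -> Usage {
    match (start.input_tokens, delta.input_tokens) {
        // Delta carries a smaller positive input count: prefer it, and take
        // cache counts from the same block, falling back to the start values.
        (Some(s), Some(d)) if d > 0 && d < s => Usage {
            input_tokens: Some(d),
            cache_read_input_tokens: delta
                .cache_read_input_tokens
                .or(start.cache_read_input_tokens),
            cache_creation_input_tokens: delta
                .cache_creation_input_tokens
                .or(start.cache_creation_input_tokens),
        },
        // Assumed existing behavior: input reported only in the delta.
        (None, Some(d)) => Usage {
            input_tokens: Some(d),
            cache_read_input_tokens: delta
                .cache_read_input_tokens
                .or(start.cache_read_input_tokens),
            cache_creation_input_tokens: delta
                .cache_creation_input_tokens
                .or(start.cache_creation_input_tokens),
        },
        // Native Claude (no input in delta) and everything else: keep start.
        _ => start,
    }
}
```

For example, a start block reporting 100,000 input tokens with 90,000 cached, followed by a delta reporting 10,000 input tokens, yields 10,000 fresh input tokens, so the cached portion is no longer counted twice.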

Refs #3580

Jason · 2026-06-05 21:45:34 +08:00

8e0e9ac319

fix(opencode): use OpenAI-compatible SDK for APINebula preset

APINebula is an OpenAI-compatible relay (its base URL ends in /v1, matching
its Codex/OpenClaw/Hermes presets), but the OpenCode preset loaded the
@ai-sdk/openai package, which targets the OpenAI Responses API and fails
against chat-completions-only upstreams. Switch the npm field to
@ai-sdk/openai-compatible so requests use the OpenAI Chat Completions format.

Jason · 2026-06-05 20:13:33 +08:00

bda625a4f1

feat(usage): add official subscription quota template with unified tier rendering

Changes:
- Add official_subscription template type for Claude/Codex/Gemini
- Replace implicit 'category=official auto-query' with explicit opt-in template
- Default disabled; users enable via usage script modal with configurable interval
- Unify tier→label mapping across subscription and script paths via labeled_tier_parts()
- Fix tray rendering: week aliases (seven_day/opus/sonnet) now use highest utilization
- Add depth guard: official_subscription checks enabled flag in query_provider_usage_inner
- Add cache invalidation symmetry: invalidate_subscription() for disabled providers
- i18n: add templateOfficialSubscription + hint in zh/en/ja/zh-TW

Backend (Rust):
- provider.rs: add TEMPLATE_TYPE_OFFICIAL_SUBSCRIPTION branch, flatten SubscriptionQuota→UsageData
- tray.rs: extract labeled_tier_parts() shared by both summary functions, use max_by for multi-alias groups
- usage_cache.rs: add invalidate_subscription() method
- Test coverage: add week-alias highest-utilization tests for both paths

Frontend (TypeScript):
- UsageScriptModal: add official_subscription to templates, auto-detect for official providers
- ProviderCard: gate useUsageQuery with !isOfficialSubscriptionUsage, pass autoQueryInterval to footer
- SubscriptionQuotaFooter: accept autoQueryInterval prop, default 0 (disabled)
- constants.ts: add TEMPLATE_TYPES.OFFICIAL_SUBSCRIPTION

Fixes tier rendering regression where:
- Claude/Codex: seven_day was missed (only weekly_limit matched) → lost 7-day window in tray
- Gemini: gemini_pro/flash/flash_lite fell through to fallback → leaked machine names
- Multi-window (opus+sonnet): find() took first, not worst → underestimated utilization and emoji color

All tests pass (cargo test + cargo clippy clean).
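
The multi-window fix (take the worst window, not the first) can be sketched like this. The `Tier` struct, the alias names in `WEEK_ALIASES`, and `worst_week_tier` are illustrative assumptions, not the actual tray.rs definitions.

```rust
/// Hypothetical tier record: a machine name plus utilization in percent.
#[derive(Debug, Clone, PartialEq)]
struct Tier {
    name: String,
    utilization: f64,
}

/// Assumed alias set for the weekly window; the real names may differ.
const WEEK_ALIASES: [&str; 4] = [
    "weekly_limit",
    "seven_day",
    "seven_day_opus",
    "seven_day_sonnet",
];

/// Pick the weekly tier with the highest utilization instead of the first
/// match, so the tray reflects the window closest to its limit.
fn worst_week_tier(tiers: &[Tier]) -> Option<&Tier> {
    tiers
        .iter()
        .filter(|t| WEEK_ALIASES.contains(&t.name.as_str()))
        .max_by(|a, b| a.utilization.total_cmp(&b.utilization))
}
```

Using `find()` on the same input would return whichever alias happens to come first, which is the underestimation described above.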

Jason · 2026-06-05 19:03:40 +08:00

473f21971d

fix: polish usage statistics ui (#3426 )

* fix: improve usage statistics ui

* chore: remove unused token suffix translation

---------

Co-authored-by: Jason <farion1231@gmail.com>

Allen Xu · 2026-06-05 08:12:17 +08:00

03a9296c1f

Fix taskbar icon (#3457 )

阿南 · 2026-06-04 23:32:32 +08:00

8e7d167ace

fix: disable auto-capitalize on Input component for macOS (#3626 )

Add autoComplete, autoCorrect, autoCapitalize, and spellCheck attributes
to prevent macOS from auto-capitalizing the first letter in input fields.

ZHLH · 2026-06-04 23:11:23 +08:00

dadefdee77

fix(coding-plan): route Zhipu quota query to the user's configured base URL (#3702 )

Fixes #3701.

`query_zhipu` was hard-coded to `https://api.z.ai`, so a user who
configured the mainland China preset (`Zhipu GLM` on
`open.bigmodel.cn`) could not retrieve usage once the international
endpoint became unreachable from their network (or vice versa).

The two endpoints share the same quota path (`/api/monitor/usage/quota/limit`)
and return JSON in the same shape, and — crucially — each user only
ever uses one of them: the quota host is the same host they're already
running coding on. So we can route by the configured `base_url` and
skip the cross-host fallback entirely.

What this PR changes
--------------------

A single helper that maps the user's `base_url` to the matching quota
host, and `query_zhipu` rebuilt to take `base_url` and pick the right
host:

    fn zhipu_quota_base(base_url: &str) -> &'static str {
        if base_url.contains("bigmodel.cn") {
            "https://open.bigmodel.cn"
        } else {
            "https://api.z.ai"
        }
    }

    async fn query_zhipu(base_url: &str, api_key: &str) -> SubscriptionQuota {
        let url = format!(
            "{}/api/monitor/usage/quota/limit",
            zhipu_quota_base(base_url),
        );
        // ... original 401/403 -> Expired / make_error / parse path, unchanged
    }

The dispatcher already distinguishes `ZhipuCn` from `ZhipuEn` via
`detect_provider()` and routes the call through
`query_zhipu(base_url, api_key)` in the same match arm.

Why no cross-host fallback
--------------------------

Farion's review pointed out that adding a fallback would be
over-engineered and actively harmful:

1. Reachability is determined by the preset the user chose. Their
   configured host is the host they are already using to run coding;
   if it were unreachable, the user could not have reached the
   "query usage" step at all.

2. The fallback path required distinguishing "both 401/403" (genuine
   bad key) from "one 401/403 + one network error" (regional block),
   which silently misclassified the second case as a generic query
   failure and hid the upstream "Session expired" UX for invalid
   keys.

3. It also cost the worst-case ~10s+10s≈20s serial timeout for users
   on a working primary.

With the URL-based routing in place, 401/403 returns to the original
`CredentialStatus::Expired` semantics — same UX as `query_kimi` and
`query_minimax`.

Files changed
-------------

- `src-tauri/src/services/coding_plan.rs` — 1 file, +35 / -20

Testing
-------

- 3 new `zhipu_quota_base_*` routing tests
- 15 existing `coding_plan` parser tests still pass
- `cargo fmt --check` clean
- `cargo clippy --lib --no-deps -- -D warnings` clean
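
For illustration, routing tests of this kind might look like the sketch below. The helper is copied from this message; the test names and the sample base URLs are hypothetical and may not match the tests actually added in `coding_plan.rs`.

```rust
/// Copied from the commit message: map the configured base URL to a quota host.
fn zhipu_quota_base(base_url: &str) -> &'static str {
    if base_url.contains("bigmodel.cn") {
        "https://open.bigmodel.cn"
    } else {
        "https://api.z.ai"
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Hypothetical test: a mainland China base URL routes to open.bigmodel.cn.
    #[test]
    fn zhipu_quota_base_routes_cn_host() {
        let base = "https://open.bigmodel.cn/api/anthropic";
        assert_eq!(zhipu_quota_base(base), "https://open.bigmodel.cn");
    }

    // Hypothetical test: the international base URL routes to api.z.ai.
    #[test]
    fn zhipu_quota_base_routes_international_host() {
        let base = "https://api.z.ai/api/anthropic";
        assert_eq!(zhipu_quota_base(base), "https://api.z.ai");
    }

    // Hypothetical test: anything unrecognized falls back to api.z.ai.
    #[test]
    fn zhipu_quota_base_defaults_to_international() {
        assert_eq!(zhipu_quota_base(""), "https://api.z.ai");
    }
}
```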

Co-authored-by: Yongmao Luo <yongmao.luo@columbia.edu>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Yongmao Luo · 2026-06-04 23:10:45 +08:00

ad030da3b1

fix(proxy): skip backup/restore when Live is already a proxy placeholder (#3689 )

When a previous stop_with_restore() failed to restore the user's original
Live (e.g. app crash mid-stop, settings.json unwritable, or any pre-existing
state where Live carries the proxy placeholders), the next
start_with_takeover would read the still-placeholder Live and overwrite the
good backup row with the proxy config itself. After that, every subsequent
stop would restore the proxy placeholder back to Live, making the proxy
toggle a no-op and leaving the client pinned at http://127.0.0.1:15721.

Fix: in both backup write paths (`backup_live_configs` and
`backup_live_config_strict`) detect that Live is already a proxy
placeholder and skip the save, preserving any existing good backup. In
`restore_live_config_for_app_with_fallback_inner`, detect the same
condition in the parsed backup and fall through to the existing
SSOT (current provider DB) path that was added in c3d810a.

Both sides share a new `live_has_proxy_placeholder_for_app` dispatch
helper so the placeholder check stays in lockstep with the existing
per-app detection functions.
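
A minimal sketch of the two guards, under loud assumptions: `BackupStore`, `live_has_proxy_placeholder`, and the substring check are hypothetical simplifications. The real code dispatches to per-app detection functions and stores backups in the database.

```rust
/// Local proxy address quoted in the commit message.
const PROXY_BASE: &str = "http://127.0.0.1:15721";

/// Hypothetical stand-in for the per-app placeholder detection.
fn live_has_proxy_placeholder(config: &str) -> bool {
    config.contains(PROXY_BASE)
}

/// Hypothetical in-memory backup row for one app.
struct BackupStore {
    saved: Option<String>,
}

impl BackupStore {
    /// Backup path: skip the save when Live already holds the placeholder,
    /// so an existing good backup is never overwritten. Returns whether a
    /// save happened.
    fn backup_live(&mut self, live: &str) -> bool {
        if live_has_proxy_placeholder(live) {
            return false;
        }
        self.saved = Some(live.to_string());
        true
    }

    /// Restore path: ignore a backup that is itself a placeholder and fall
    /// through to the current provider config (the SSOT).
    fn restore(&self, provider_db_config: &str) -> String {
        match &self.saved {
            Some(backup) if !live_has_proxy_placeholder(backup) => backup.clone(),
            _ => provider_db_config.to_string(),
        }
    }
}
```

Sharing one detection function between both paths is what keeps the backup and restore guards from drifting apart.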

Co-authored-by: Yongmao Luo <yongmao.luo@columbia.edu>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Yongmao Luo · 2026-06-04 22:54:56 +08:00

8047f95416

1917 Commits