Commit Graph

709 Commits

  • chore(brand): surface ccswitch.io as the sole official website
    Add an "Only Official Website" header to the three READMEs, an
    About panel button, and a tray menu entry — all pointing to
    ccswitch.io. Consolidates brand and SEO signals on the canonical
    domain across docs, GUI, and system tray.
  • perf(proxy): trim per-request hot-path work and db wait
    - Guard debug body serialization with `log::log_enabled!`; previously
      serialized the filtered body to a throwaway String on every forward,
      even with debug logging off.
    - Skip SSE parse + UTF-8 buffer loop when no usage collector and debug
      is off; the per-chunk `serde_json::from_str::<Value>` ran even in
      pure passthrough mode.
    - Add cheap per-app SSE event pre-filter (string `contains`) so usage
      collectors only parse events that could contain usage (e.g. Claude
      `message_start` / `message_delta`).
    - Skip non-streaming response body JSON parse when usage logging is
      disabled.
    - Move `ProviderRouter::record_result` off the success response path
      via `tokio::spawn` for non-HalfOpen state; that call internally does
      `get_proxy_config_for_app` + `update_provider_health`, two SQLite
      ops that previously blocked TTFB.
    
    Also: dedupe `usage_logging_enabled` (was duplicated in handlers.rs)
    and merge `SseUsageCollector::{new, new_filtered}` into a single
    constructor that takes `Option<StreamUsageEventFilter>`.
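
The SSE pre-filter idea above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the internals of `StreamUsageEventFilter`, the `claude()` constructor, and `should_parse` are assumptions; only the event names `message_start` / `message_delta` come from the commit message.

```rust
/// Hypothetical per-app filter: substrings that mark events which may carry usage.
pub struct StreamUsageEventFilter {
    needles: &'static [&'static str],
}

impl StreamUsageEventFilter {
    /// Assumed filter for Claude streams, using the event names from the commit message.
    pub const fn claude() -> Self {
        Self { needles: &["message_start", "message_delta"] }
    }

    /// Cheap substring check; no JSON parsing happens here.
    pub fn might_contain_usage(&self, raw_event: &str) -> bool {
        self.needles.iter().any(|n| raw_event.contains(n))
    }
}

/// Returns true when the caller should go on to the expensive JSON parse.
pub fn should_parse(filter: Option<&StreamUsageEventFilter>, raw_event: &str) -> bool {
    match filter {
        Some(f) => f.might_contain_usage(raw_event),
        None => true, // no filter configured: parse every event
    }
}
```

A substring match can produce false positives (for example, user text that happens to contain `message_delta`), which only costs one extra parse; it should never skip an event that really carries usage.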
  • fix(proxy): improve cache hit rate for Codex/Responses requests
    prompt_cache_key was falling back to provider.id when the client did not
    supply a session, which collapsed every conversation onto a single key
    and defeated upstream prefix caching. Only emit the key when a real
    client-provided session/thread identity is available; otherwise let the
    upstream use its default matching behaviour.
    
    Additional fixes that affect cache stability:
    - Canonicalise (sort) JSON keys in outgoing request bodies and in
      tool_call arguments / tool_result content so semantically identical
      requests produce identical byte sequences for upstream prefix caches.
    - Exempt JSON Schema property maps (properties, patternProperties,
      definitions, $defs) from the underscore-prefix filter so user-defined
      schema keys like _id and _meta survive.
    - Add a [CacheTrace] debug log with stable hashes for instructions,
      tools, input and include to help diagnose cache misses.
    - Thread session_id into the usage logger for request correlation.
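
The key-emission rule described in this commit might look like the sketch below. The function name and signature are hypothetical, and treating a blank session string as absent is an extra assumption not stated in the commit.

```rust
/// Hypothetical helper: emit a prompt cache key only when the client supplied
/// a real session/thread identity. There is deliberately no fallback to a
/// provider id, since that would collapse every conversation onto one key.
pub fn prompt_cache_key(client_session: Option<&str>) -> Option<String> {
    client_session
        .map(str::trim)
        .filter(|s| !s.is_empty()) // assumption: blank ids count as "no session"
        .map(|s| s.to_string())
}
```

Returning `None` lets the caller omit the field entirely, so the upstream falls back to its default matching behaviour.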
  • fix(proxy): drop empty pages from Read tool input (#2472)
    * fix(proxy): drop empty pages from Read tool input
    
    * fix(proxy): preserve Read args across duplicate tool starts
  • fix(ci): restore frontend formatting and Linux clippy
    - Format Claude Desktop provider presets with Prettier
    
    - Gate platform-specific Claude Desktop path helpers behind cfg
  • chore(backend): satisfy cargo fmt and clippy --all-targets
    - Apply rustfmt diffs in claude_desktop_config.rs
    - Allow needless_return on current_platform_paths (cfg-mirrored arms)
    - Allow too_many_arguments on RequestForwarder::forward
    - Replace `let mut + reassign` with struct literals in tests
      (settings, backup, provider, response_processor)
    - Use Path::new instead of PathBuf::from to fix cmp_owned in misc tests
    - Replace 3.14 with 3.5 in config test to avoid approx_constant lint
  • fix(claude-desktop): match proxy model route without [1M] suffix
    Claude Desktop strips the [1M] suffix from model IDs when sending
    requests, causing route lookup to fail with "model route is not
    configured". Fall back to base-name comparison when exact match misses.
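
A rough sketch of the exact-then-base-name lookup described above. The route table shape (`HashMap<String, String>`), the helper names, and the model ids in the usage note are illustrative assumptions, not the project's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical helper: strip a trailing bracketed suffix such as "[1M]".
fn base_name(model: &str) -> &str {
    match model.rfind('[') {
        Some(idx) if model.ends_with(']') => model[..idx].trim_end(),
        _ => model,
    }
}

/// Look up the upstream model for a requested route name.
/// Tries an exact match first, then falls back to comparing base names.
pub fn find_route<'a>(routes: &'a HashMap<String, String>, requested: &str) -> Option<&'a str> {
    if let Some(upstream) = routes.get(requested) {
        return Some(upstream.as_str());
    }
    let wanted = base_name(requested);
    routes
        .iter()
        .find(|(name, _)| base_name(name) == wanted)
        .map(|(_, upstream)| upstream.as_str())
}
```

With a route configured as `claude-sonnet-4-5[1M]`, a request for `claude-sonnet-4-5` would resolve through the fallback. If several configured routes share a base name, this sketch picks an arbitrary one; a real implementation would need a tie-breaking rule.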
  • refactor(claude-desktop): trim duplication in proxy and switch flows
    - services/proxy.rs: collapse 10 repeated `OpenCode | OpenClaw | Hermes |
      ClaudeDesktop` match arms into `_` fallthroughs.
    - claude_desktop_config.rs: extract a `with_rollback` closure shared by
      apply_provider_to_paths and restore_official_at_paths.
    - useProviderActions.ts: replace the triple-nested ternary picking the
      switch-success toast message with a flat let/if/else block.
    
    Net -36 lines. No behavior change; cargo test and pnpm typecheck pass.
  • feat(claude-desktop): add 3P provider switching with proxy gateway
    Adds a new ClaudeDesktop AppType that writes Claude Desktop's third-party
    inference profile under configLibrary/, sharing _meta.json with other
    launchers (Ollama-compatible) so cc-switch can coexist with them.
    
    Two switch modes:
    - direct: provider already exposes claude-* / anthropic/claude-* model
      ids on Anthropic Messages, Claude Desktop connects to it directly.
    - proxy: cc-switch's local proxy acts as the inference gateway,
      presenting only claude-* route names to Claude Desktop and mapping
      them to real upstream models. Required after Anthropic restricted
      Claude Desktop to claude-family ids.
    
    Backend:
    - New module claude_desktop_config with snapshot/rollback, official seed
      bypass, /claude-desktop/v1/{models,messages} routes, and a single
      source of truth for default proxy routes.
    - Gateway token persisted in SQLite, validated on every proxied request.
    - get_claude_desktop_status surfaces drift signals (stale models,
      missing routes, proxy stopped, base URL mismatch, missing token).
    
    Frontend:
    - Slim ClaudeDesktopProviderForm independent from ProviderForm,
      controlled by a top-level appId guard.
    - ProviderList banner consumes the status query (5s polling) and
      renders actionable diagnostics.
    - ClaudeDesktopRouteToggle in the header to start/stop the local
      gateway without touching takeover state.
    - Three-locale i18n synchronised.
  • fix(proxy): reuse pooled HTTPS connections for non-Anthropic backends
    The hyper raw-write path preserves original header casing but rebuilds
    TCP+TLS on every request — there is no connection pool — which was the
    root cause of slow reverse-proxy throughput.
    
    Only Anthropic-native requests actually need exact header-case
    preservation. Route OpenAI/Copilot/Codex/Gemini/codex_oauth requests
    through the pooled reqwest client (pool_max_idle_per_host=10,
    tcp_keepalive=60s) instead, so warm connections get reused.
    
    Streaming requests get a precise first-byte timeout via
    tokio::time::timeout around reqwest's send() (which resolves on
    response headers), with the body phase handed off to response_processor.
    The streaming-detection helper now also covers Gemini SSE endpoints
    and Accept: text/event-stream, not just body.stream.
  • Fix Codex startup live import duplication (#2590)
    * Fix Codex startup live import duplication
    
    * Fix: Prevent duplicate Codex default provider on restart & add startup import tests
  • feat: return reasoning_content with tool_calls for DeepSeek models (#2543)
    * feat: return reasoning_content with tool_calls for DeepSeek models
    
    * fix: correct reasoning_content handling for DeepSeek tool_calls
    
    * test: cover DeepSeek reasoning content round trip
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
  • fix(proxy): derive Claude auth strategy from ANTHROPIC env var name
    Anthropic SDK assigns distinct semantics to the two env vars:
    
    - ANTHROPIC_API_KEY    -> x-api-key
    - ANTHROPIC_AUTH_TOKEN -> Authorization: Bearer
    
    The Claude adapter previously collapsed both into AuthStrategy::Anthropic
    and then emitted Authorization: Bearer regardless, breaking strict
    Anthropic-protocol endpoints (Anthropic official, Cloudflare AI Gateway,
    OpenCode Go, DashScope) and silently overriding the user's intended auth
    scheme.
    
    - claude::extract_auth: infer strategy from env var name
      (ANTHROPIC_AUTH_TOKEN -> ClaudeAuth, ANTHROPIC_API_KEY -> Anthropic),
      matching the precedence already used by extract_key.
    - claude::get_auth_headers: split the Anthropic arm so it emits
      x-api-key, while ClaudeAuth and Bearer continue to use Bearer.
    - stream_check: reuse ClaudeAdapter::get_auth_headers as the single
      source of truth, replacing the prior "always Bearer + maybe x-api-key"
      double injection that produced auth conflicts and false-negative
      health checks.
    - Cover each strategy -> header mapping and env-var precedence with
      new unit tests in claude.rs.
    
    Refs #2368, #2380
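
The env-var-to-header mapping from this commit, as a minimal sketch. The enum and function names are simplified stand-ins for the adapter code (the real `AuthStrategy` also has a `Bearer` variant, omitted here); the header names and env var names are taken from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthStrategy {
    /// Sent as `x-api-key: <key>`.
    Anthropic,
    /// Sent as `Authorization: Bearer <key>`.
    ClaudeAuth,
}

/// Infer the strategy from the name of the env var that held the credential.
pub fn strategy_from_env_name(name: &str) -> Option<AuthStrategy> {
    match name {
        "ANTHROPIC_AUTH_TOKEN" => Some(AuthStrategy::ClaudeAuth),
        "ANTHROPIC_API_KEY" => Some(AuthStrategy::Anthropic),
        _ => None,
    }
}

/// Build the single auth header for a strategy (hypothetical signature).
pub fn auth_header(strategy: AuthStrategy, key: &str) -> (&'static str, String) {
    match strategy {
        AuthStrategy::Anthropic => ("x-api-key", key.to_string()),
        AuthStrategy::ClaudeAuth => ("Authorization", format!("Bearer {}", key)),
    }
}
```

Emitting exactly one header per strategy is what avoids the earlier "always Bearer + maybe x-api-key" double injection.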
  • fix(proxy): strip leading billing header from system content (#2350)
    Claude Code injects a dynamic `x-anthropic-billing-header` line at the
    start of `system` content. Its rotating `cch=` token was forwarded into
    OpenAI Responses `instructions` and Chat system messages, which broke
    upstream prefix prompt cache reuse — a stable ~95k-token prefix was
    getting re-charged on every request.
    
    Strip only the leading occurrence in both anthropic_to_openai and
    anthropic_to_responses; later occurrences are preserved so user-authored
    prompt text containing the same string is not lost.
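
A sketch of the "strip only the leading occurrence" rule, assuming the billing header occupies the whole first line of the system text. The exact line format and the function name are assumptions; only the `x-anthropic-billing-header` marker comes from the commit message.

```rust
const BILLING_PREFIX: &str = "x-anthropic-billing-header";

/// Drop the first line if, and only if, it starts with the billing marker.
/// Later occurrences are left untouched so user-authored text survives.
pub fn strip_leading_billing_header(system: &str) -> &str {
    if !system.starts_with(BILLING_PREFIX) {
        return system;
    }
    match system.find('\n') {
        Some(idx) => &system[idx + 1..], // keep everything after the first line
        None => "",                      // the header was the only content
    }
}
```

Because the rotating token is removed before conversion, the remaining prefix stays byte-stable across requests, which is what upstream prefix caches key on.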
  • chore(usage): drop Hermes Agent tracking integration
    Hermes aggregates all in-process API calls into a single sessions row
    with the `model` field locked to the initial model, so the usage
    dashboard cannot cleanly surface per-call billing context. Two rounds
    of UI workarounds (raw mapping, then `<model> @ <host>` display) did
    not resolve the user-facing confusion, so the whole tracking
    integration is dropped for now.
    
    Removes session_usage_hermes service (and its 17 tests), sync wiring
    in commands/usage.rs and lib.rs, _hermes_session/hermes_session
    entries in usage_stats SQL (provider_name_coalesce CASE and
    effective_usage_log_filter IN clause), frontend Tab/banner/dropdown/
    icon entries, and four i18n keys per locale.
    
    Hermes app integration outside usage tracking (proxy routing,
    session manager, config) is preserved. Pre-existing hermes rows in
    proxy_request_logs are left as orphans — filtered out by the
    updated SQL and never surfaced in the UI.
  • feat: persist Tauri window state (#2377)
    Add the window-state plugin and explicitly save size and position across app exit, restart, and lightweight-mode transitions.
  • feat(providers): add Baidu Qianfan Coding Plan for Claude Code (#2322)
    * feat(providers): add baidu qianfan coding plan presets
    
    * refactor(providers): align qianfan presets with existing format
    
    * chore(providers): narrow qianfan coding plan scope
  • fix(coding-plan): correct zhipu weekly tier name by reset time (#2420)
    Zhipu's `data.limits[]` returns 1 entry for legacy plans (subscribed
    before 2026-02-12) and 2 entries for current plans. Previously every
    TOKENS_LIMIT entry was hardcoded as `five_hour`, so the weekly bucket
    was rendered with the 5-hour i18n label.
    
    Sort TOKENS_LIMIT entries by nextResetTime ascending and assign
    `five_hour` to index 0, `weekly_limit` to index 1. Legacy plans
    naturally degrade to a single five_hour tier.
    
    Also harden the parser: case-insensitive type match (defends against
    upstream casing changes), reuse TIER_FIVE_HOUR/TIER_WEEKLY_LIMIT
    constants, and add 8 unit tests covering both plan shapes plus
    defensive edge cases.
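
The tier assignment above could be sketched like this. The `Limit` struct and its field names are hypothetical stand-ins for the parsed `data.limits[]` entries, and `TIME_LIMIT` in the usage note is an invented non-token type used only for illustration.

```rust
pub const TIER_FIVE_HOUR: &str = "five_hour";
pub const TIER_WEEKLY_LIMIT: &str = "weekly_limit";

/// Hypothetical shape of one entry in `data.limits[]`.
#[derive(Debug, Clone)]
pub struct Limit {
    pub limit_type: String,
    pub next_reset_time: u64,
}

/// Keep TOKENS_LIMIT entries (case-insensitive), sort by reset time ascending,
/// then label index 0 as five_hour and index 1 as weekly_limit.
pub fn assign_tiers(limits: &[Limit]) -> Vec<(&'static str, u64)> {
    let mut tokens: Vec<&Limit> = limits
        .iter()
        .filter(|l| l.limit_type.eq_ignore_ascii_case("TOKENS_LIMIT"))
        .collect();
    tokens.sort_by_key(|l| l.next_reset_time);
    tokens
        .into_iter()
        .zip([TIER_FIVE_HOUR, TIER_WEEKLY_LIMIT]) // extra entries beyond two are dropped
        .map(|(l, tier)| (tier, l.next_reset_time))
        .collect()
}
```

A legacy plan with a single TOKENS_LIMIT entry yields one `five_hour` tier, matching the "naturally degrade" behaviour the commit describes.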
  • fix(dashscope): enhance usage parsing robustness to prevent VSCode cr… (#2425)
    * fix(dashscope): enhance usage parsing robustness to prevent VSCode crashes
    
    Enhanced build_anthropic_usage_from_responses() to handle null, missing, empty,
    and partial usage fields gracefully. This prevents VSCode Extension crashes with
    "Cannot read properties of null (reading 'output_tokens')" when connecting to
    DashScope (Alibaba Cloud Bailian) models.
    
    Changes:
    - Added defensive null checks and empty object detection
    - Implemented OpenAI field name fallbacks (prompt_tokens/completion_tokens)
    - Added comprehensive logging for malformed usage scenarios
    - Fixed streaming SSE event handlers with null-safe usage access
    - Preserved cache token fields even when input/output tokens are missing
    
    This ensures the proxy never crashes on malformed Responses API usage objects,
    returning valid Anthropic-compatible usage structures (input_tokens/output_tokens)
    in all cases.
    
    * fix(proxy): tighten Responses API usage fix per review
    
    - Drop redundant fallback in streaming.rs Chat Completions path; the
      existing if-let-Some guard already prevents usage:null, so the extra
      layer was dead code and caused a fmt-breaking indentation issue.
    - Demote partial-usage warn to debug. Streaming chunks legitimately
      arrive with partial token counts and the warn-level log was noisy.
    - Rewrite CHANGELOG entry: reference #2422, broaden scope from
      DashScope-only to all api_format=openai_responses users (Codex OAuth
      is the strongest signal; DashScope compatible-mode/v1/responses is
      the original report).
    - cargo fmt to clear 12 formatting differences vs main.
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
  • fix(config): sort JSON keys alphabetically for deterministic output (#2469)
    * fix(config): sort JSON keys alphabetically for deterministic output
    
    Ensures settings.json keys are written in sorted order, preventing
    non-deterministic git diffs when switching configs.
    
    * test(config): add unit tests for sort_json_keys and fix formatting
    
    Cover top-level sort, nested recursion, array order preservation,
    primitive pass-through, empty collections, and the core determinism
    guarantee (different insertion orders must yield identical output).
    
    Also fix line-length in write_json_file flagged by `cargo fmt --check`.
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
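
To illustrate the properties the tests in this commit cover (recursive sort, array order preserved, primitives passed through), here is a dependency-free sketch. The `Json` enum is a stand-in defined only for this example; the project's `sort_json_keys` presumably operates on its own JSON value type, which is not shown in the log.

```rust
/// Minimal stand-in JSON type; objects keep insertion order.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Recursively sort object keys; arrays keep their order, primitives pass through.
pub fn sort_keys(value: Json) -> Json {
    match value {
        Json::Object(mut entries) => {
            entries.sort_by(|a, b| a.0.cmp(&b.0));
            Json::Object(entries.into_iter().map(|(k, v)| (k, sort_keys(v))).collect())
        }
        Json::Array(items) => Json::Array(items.into_iter().map(sort_keys).collect()),
        other => other,
    }
}
```

Two objects built with different insertion orders compare equal after `sort_keys`, which is the determinism guarantee that keeps diffs stable.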
  • Fix log message for session usage codex (#2473)
    * Fix log message for session usage codex
    
    * Fix comments in session_usage_codex.rs
  • feat: support launch warp and execute session (#2466)
    * feat: support launch warp and execute session
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    * other wires
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    * for launch with provider
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    * fixup indirection
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    * clippy
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    * address comments
    
    Signed-off-by: tison <wander4096@gmail.com>
    
    ---------
    
    Signed-off-by: tison <wander4096@gmail.com>
  • Fix Codex history changing after switching providers (#2349)
    * Keep Codex history stable across provider switches
    
    * Restore template Codex provider id when backfilling live config
    
    Backfill writes the current Codex live config back to the previous
    provider's stored template after a switch. Because the live file now
    carries a normalized stable model_provider id, the previous provider's
    template would lose its own provider-specific id (and any matching
    [profiles.*] references) on every subsequent switch.
    
    Reverse the normalization at backfill time by rewriting model_provider,
    the active model_providers section, and matching profile references back
    to the template's original id.
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
  • feat(usage): add Hermes Agent tracking + fix zero-cost bug + perf
    Hermes:
    - Parse ~/.hermes/state.db sessions (incl. profiles/*/state.db) into
      proxy_request_logs with data_source='hermes_session', WAL-aware
      incremental sync, Hermes-reported cost preferred over model_pricing
      fallback
    
    Zero-cost bug (dashboard showed $0 totals):
    - GPT-5.5 family default pricing (~83% of affected rows used GPT-5.5)
    - find_model_pricing_row: ASCII-lowercase normalization so
      "OpenAI/GPT-5.5@HIGH" matches seeded "gpt-5.5"
    - Startup cost backfill in async task: scan rows where total_cost <= 0
      but tokens > 0, recompute via model_pricing in a single transaction
    
    Performance:
    - Add (app_type, created_at DESC) covering index for dashboard range
      queries
    - Add expression index on COALESCE(data_source, 'proxy') so dedup EXISTS
      subqueries use index lookup instead of full scan; drop superseded
      idx_request_logs_dedup_lookup
    
    Refactor:
    - row_to_request_log_detail helper (3-way de-dup; fixes cost_multiplier
      "1" vs "1.0" drift between callers)
    - Promote get_sync_state/update_sync_state to shared session_usage
      module (4 copies -> 1)
    - run_step helper in lib.rs replaces 9 if-let-Err blocks
    - maybe_backfill_log_costs returns bool to skip duplicate total_cost
      parsing in caller
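
One way the pricing lookup normalization could work is sketched below. Only the ASCII-lowercase step and the example pair ("OpenAI/GPT-5.5@HIGH" matching "gpt-5.5") come from the commit message; stripping a vendor prefix before `/` and a suffix after `@` is an assumption about how the rest of the match happens, and the function name is hypothetical.

```rust
/// Hypothetical normalization applied before looking up a model_pricing row.
pub fn normalize_model_for_pricing(model: &str) -> String {
    // Assumption: anything before the last '/' is a vendor prefix.
    let after_vendor = model.rsplit('/').next().unwrap_or(model);
    // Assumption: anything after '@' is an effort/variant suffix.
    let before_suffix = after_vendor.split('@').next().unwrap_or(after_vendor);
    before_suffix.trim().to_ascii_lowercase()
}
```

ASCII-only lowercasing keeps the comparison locale-independent, which matters when the seeded ids are plain ASCII strings.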
  • fix(usage): prevent double-counting between proxy and session-log sources
    Proxy writes and session-log sync wrote to proxy_request_logs with
    mismatched request_ids: only Claude on a native Anthropic backend used the
    shared `session:{message_id}` key. Codex/Gemini and Claude-through-OpenAI
    providers always produced distinct ids, so primary-key dedup never fired
    and every real request was recorded twice.
    
    Adds a 7-dim fingerprint dedup (app_type, 4 token counts, 2xx status,
    model with case-insensitive match, ±10min window) wired into three layers:
    
    - Write path: should_skip_session_insert() blocks duplicate session rows
      before INSERT, unifying the previously-divergent Claude/Codex/Gemini
      paths through a single DedupKey-based helper.
    - Read path: effective_usage_log_filter() excludes already-covered session
      rows from every aggregation query.
    - Rollup path: same filter applied so usage_daily_rollups never absorbs
      duplicates.
    
    Also adds a covering index (idx_request_logs_dedup_lookup) so the EXISTS
    subquery stays index-only, and a transform.rs regression test that pins
    openai_to_anthropic id preservation - the missing piece that lets
    Claude+OpenAI-compatible providers reuse the session: id scheme.
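
The fingerprint comparison might be sketched as follows. `DedupKey` is named in the commit, but its fields here are guesses: in particular, which four token counts are compared is an assumption, as is storing timestamps as Unix seconds.

```rust
/// Hypothetical layout of the dedup fingerprint.
#[derive(Debug, Clone)]
pub struct DedupKey {
    pub app_type: String,
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
    pub status_code: u16,
    pub model: String,
    pub created_at: i64, // Unix seconds (assumed)
}

const WINDOW_SECS: i64 = 10 * 60; // the ±10 min window from the commit message

/// True when two rows look like the same request seen through two sources.
pub fn is_duplicate(a: &DedupKey, b: &DedupKey) -> bool {
    let is_2xx = |s: u16| (200..300).contains(&s);
    a.app_type == b.app_type
        && a.input_tokens == b.input_tokens
        && a.output_tokens == b.output_tokens
        && a.cache_read_tokens == b.cache_read_tokens
        && a.cache_creation_tokens == b.cache_creation_tokens
        && is_2xx(a.status_code)
        && is_2xx(b.status_code)
        && a.model.eq_ignore_ascii_case(&b.model)
        && (a.created_at - b.created_at).abs() <= WINDOW_SECS
}
```

In the commit this check runs as SQL (an EXISTS subquery backed by a covering index); the in-memory version above only shows which dimensions must agree.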
  • fix(balance): show USD on SiliconFlow international site (was CNY)
    The query_siliconflow function received an is_cn flag that only switched
    the request domain (.cn vs .com) but the response builder hardcoded
    unit="CNY" for both sites. International users at api.siliconflow.com
    saw their USD balance labelled as CNY. Now unit and plan_name follow
    is_cn, so the EN site shows USD and "SiliconFlow (EN)".
  • fix(proxy): preserve scoped reasoning_content for tool calls (#2367)
    - Preserve `reasoning_content` for Kimi/Moonshot OpenAI Chat compatibility paths.
    - Keep generic OpenAI-compatible requests free of non-standard `reasoning_content` fields.
    - Continue skipping thinking-only assistant messages.
    - Add regressions for generic skip and Kimi/Moonshot preservation behavior.
  • fix(proxy): dedupe streaming message_delta (#2366)
    - Deduplicate repeated upstream `finish_reason` chunks so only one Anthropic `message_delta` is emitted.
    - Preserve late `choices: []` usage-only chunks before sending the final `message_delta`.
    - Keep stream error paths from emitting successful terminal events.
    - Add regressions for duplicate finish reasons, usage-only chunks, missing `[DONE]`, and truncated streams.
  • feat(provider-form): soften validation with "save anyway" prompt (#2307)
    * feat(provider-form): soften business-rule validation with "save anyway" prompt
    
    Refactor handleSubmit so empty-field / missing-item validations (provider
    name, endpoint, API key, opencode model, template variables, provider key
    required) no longer hard-reject with toast.error. Instead they are collected
    into an issues list and presented via a ConfirmDialog; the user can cancel
    or choose "Save anyway" to proceed.
    
    Integrity constraints stay as hard rejections:
    - providerKey regex / duplicate (would corrupt other providers)
    - Copilot / Codex OAuth not authenticated (no token, cannot establish)
    - omo Other Fields JSON not an object / parse failure
    
    This aligns the frontend with the backend's existing "relaxed save / strict
    switch" split (see gemini_config.rs: validate_gemini_settings vs
    validate_gemini_settings_strict) and unblocks legitimate configs such as
    AWS Bedrock, Vertex AI, and custom Gemini base URLs that the UI previously
    refused to save.
    
    Refs: #2196, #1204
    
    * fix(provider-form): address review feedback on soft-validation
    
    P1: move empty providerKey back to hard rejection for OpenCode / OpenClaw /
    Hermes. Since providerKey is the primary identity for these apps and the
    mutations layer throws "Provider key is required" when absent, letting users
    click "save anyway" would surface a generic error toast instead of a
    precise, actionable one. Treat empty providerKey as an integrity constraint
    alongside regex / duplicate checks.
    
    P2: give the soft-confirm submit path its own submitting state. The
    confirm-dialog path bypassed react-hook-form's isSubmitting lifecycle, so
    slow or failing saves left the outer submit button responsive and could
    spawn unhandled rejections. Now the confirm handler awaits performSubmit
    inside try/catch/finally, uses an isConfirmSubmitting flag to gate both
    confirm and cancel clicks, and folds the flag into the outer disabled
    state and onSubmittingChange callback.
    
    Refs: #2307 review comments
    
    * chore(clippy): use push for single char '…' in truncate_body
    
    Clippy 1.95 added single_char_add_str which flagged the push_str("…")
    in truncate_body. Rebased onto latest upstream/main and applied the
    suggested fix so the Backend Checks clippy job passes.
    
    Unrelated to this PR's core changes; bundled in so the PR is mergeable
    without waiting for a separate upstream fix.
    
    ---------
    
    Co-authored-by: Allen <allen@AllenMacBook-M4-Pro.local>
  • feat(deepseek): switch presets to V4 (flash/pro) and add pricing
    DeepSeek released V4 flash/pro; legacy IDs deepseek-chat / deepseek-reasoner
    now alias to deepseek-v4-flash and will be deprecated.
    
    - Update claude/hermes/opencode/openclaw presets to v4-pro / v4-flash,
      context 128K -> 1M; Claude Anthropic-compat endpoint routes OPUS/SONNET
      to v4-pro and HAIKU to v4-flash, plus an explicit modelsUrl override.
    - Seed deepseek-v4-flash ($0.14/$0.28 per 1M) and deepseek-v4-pro
      ($1.68/$3.36 per 1M) into model_pricing; older v3.x / chat / reasoner
      rows kept for historical usage stats (INSERT OR IGNORE).
    - Refresh user-manual (zh/en/ja) pricing table and note that legacy model
      IDs are billed at v4-flash rates.
  • fix(model-fetch): support /models for Anthropic-compat subpath providers
    Providers like DeepSeek, Kimi, Zhipu GLM and MiniMax expose the
    Anthropic-compatible API on a subpath (e.g. /anthropic) while the
    OpenAI-style /models endpoint lives at the API root. The previous
    heuristic blindly appended /v1/models to the Base URL, so every such
    provider returned 404 and the UI mislabeled it as "provider does not
    support fetching models".
    
    Backend now generates a candidate list and tries them in order:
    preset override -> baseURL /v1/models -> stripped-subpath /v1/models ->
    stripped-subpath /models. Non-404/405 responses (auth, network) stop
    immediately so we never retry against hostile status codes. Known
    compat suffixes are kept in a length-descending constant so the
    longest match wins; response bodies are truncated to 512 chars to
    avoid HTML 404 pages bloating the error string.
    
    Preset type gains an optional modelsUrl (DeepSeek points at
    https://api.deepseek.com/models). Frontend threads the override
    through fetchModelsForConfig when the current Base URL still matches
    the preset default. A new fetchModelsEndpointNotFound i18n key
    replaces the misleading "not supported" toast for exhausted-candidate
    and 404/405 cases (zh/en/ja).
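
The candidate ordering described above can be sketched like this. The suffix list, function names, and the `example.com` URLs in the usage note are assumptions for illustration; the order (preset override, base URL `/v1/models`, stripped-subpath `/v1/models`, stripped-subpath `/models`) and the 404/405 retry rule follow the commit message.

```rust
/// Assumed compat suffixes, longest first so the longest match wins.
const COMPAT_SUFFIXES: &[&str] = &["/api/anthropic", "/anthropic"];

fn push_unique(out: &mut Vec<String>, url: String) {
    if !out.contains(&url) {
        out.push(url);
    }
}

/// Build the ordered list of /models URLs to try.
pub fn models_url_candidates(base_url: &str, preset_override: Option<&str>) -> Vec<String> {
    let base = base_url.trim_end_matches('/');
    let mut out = Vec::new();
    if let Some(url) = preset_override {
        push_unique(&mut out, url.to_string());
    }
    push_unique(&mut out, format!("{}/v1/models", base));
    if let Some(root) = COMPAT_SUFFIXES.iter().find_map(|s| base.strip_suffix(*s)) {
        push_unique(&mut out, format!("{}/v1/models", root));
        push_unique(&mut out, format!("{}/models", root));
    }
    out
}

/// Only "endpoint not found" style statuses justify trying the next candidate.
pub fn should_try_next(status: u16) -> bool {
    status == 404 || status == 405
}
```

For a base URL of `https://api.example.com/anthropic`, the candidates would be `.../anthropic/v1/models`, then `https://api.example.com/v1/models`, then `https://api.example.com/models`; a 401 on any of them stops the loop immediately.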
  • fix(copilot): resolve Claude model IDs against live /models list
    Copilot upstream returns model_not_supported when the client sends
    dash-form Claude IDs (claude-sonnet-4-6, claude-sonnet-4-6[1m]) while
    /models only accepts dot form (claude-sonnet-4.6, -1m suffix).
    
    - Add copilot_model_map: syntax normalize (dash->dot, [1m]->-1m) plus
      live /models exact match and family-version fallback, reusing the
      existing 5 min auth cache. Returns None when the whole family is
      absent so upstream surfaces an explicit error instead of silently
      switching families.
    - Wire into forwarder Copilot hook; runs before anthropic_to_openai
      conversion.
    - Default Opus slot in the Copilot preset maps to Sonnet 4.6: Pro
      dropped all Opus on 2026-04-20 and Pro+ bills Opus 4.7 at 7.5x.
      Users who want real Opus can switch manually in the UI.
    
    Refs: https://github.com/farion1231/cc-switch/issues/2016
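
The syntax-normalization step (dash to dot, `[1m]` to `-1m`) might look like the sketch below. It covers only that step, not the live `/models` match or the family-version fallback. The function name is hypothetical, and limiting version segments to one or two digits (so dated ids are left alone) is an assumption.

```rust
/// Hypothetical normalizer for Claude ids sent to Copilot.
pub fn normalize_copilot_model_id(id: &str) -> String {
    let lower = id.to_ascii_lowercase();
    // "[1m]" marks the long-context variant; Copilot spells it "-1m".
    let (base, long_ctx) = match lower.strip_suffix("[1m]") {
        Some(rest) => (rest, true),
        None => (lower.as_str(), false),
    };
    // Assumption: a version segment is one or two ASCII digits.
    let is_ver = |s: &str| !s.is_empty() && s.len() <= 2 && s.chars().all(|c| c.is_ascii_digit());
    let parts: Vec<&str> = base.split('-').collect();
    let n = parts.len();
    // Join a trailing "<major>-<minor>" pair with a dot.
    let mut out = if n >= 3 && is_ver(parts[n - 2]) && is_ver(parts[n - 1]) {
        format!("{}-{}.{}", parts[..n - 2].join("-"), parts[n - 2], parts[n - 1])
    } else {
        base.to_string()
    };
    if long_ctx {
        out.push_str("-1m");
    }
    out
}
```

Ids that are already in dot form pass through unchanged, so the function is safe to apply to every outgoing request.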
  • chore(release): bump version to 3.14.1
    - Add v3.14.1 release notes (en/zh/ja) covering tray usage visibility,
      Codex OAuth stability fixes, Skills import/install reliability, and
      removal of the Hermes config health scanner
    - Cut [Unreleased] into [3.14.1] in CHANGELOG with PR references
    - Bump version in package.json, Cargo.toml, Cargo.lock, tauri.conf.json
  • feat(tray): show coding-plan usage for Kimi / Zhipu / MiniMax
    dc04165f surfaced tray usage badges for Claude/Codex/Gemini official
    OAuth only. Chinese coding-plan providers already expose 5h + weekly
    windows through coding_plan::get_coding_plan_quota, but two gaps kept
    the tray from rendering them.
    
    - format_script_summary read only data.first(), truncating the tier-
      flattened UsageResult to a single window. Detect plan_name matching
      TIER_FIVE_HOUR / TIER_WEEKLY_LIMIT and emit the "🟢 h12% w80%" layout
      used by format_subscription_summary; worst utilization drives the
      emoji. Copilot / balance / custom scripts keep the legacy single-
      bucket output via fallback.
    
    - usage_script previously required manual activation through
      UsageScriptModal. Auto-inject meta.usage_script on Claude provider
      creation when ANTHROPIC_BASE_URL matches a known coding plan, so the
      tray lights up without the user opening the modal. Does not overwrite
      existing usage_script on update.
    
    Extract the URL route table out of UsageScriptModal into a shared
    codingPlanProviders module so the modal, the creation hook, and the
    Rust coding_plan::detect_provider mirror all agree on one list.
    Add TIER_WEEKLY_LIMIT alongside TIER_FIVE_HOUR and a createUsageScript()
    factory to collapse the duplicated default fields across four call
    sites and drop the remaining stringly-typed tier names.
  • refactor(hermes): drop config health check scanner
    The Hermes config.yaml schema has stabilized and users have migrated to
    the current provider fields, so the value of scanning for model.provider
    dangling references, custom_providers shape errors, v12 migration residue
    etc. no longer justifies the maintenance surface — and the scan produces
    false positives when users keep some providers under Hermes' v12+
    providers: dict (Hermes' runtime merges both shapes, but CC Switch's
    scanner only looked at the list form).
    
    Removes the whole HermesHealthWarning type, scan_hermes_config_health
    command, HermesHealthBanner React component, useHermesHealth hook,
    warnings field on HermesWriteOutcome, and the three helper functions
    (yaml_as_non_empty_str, collect_mapping_string_keys, hermes_warning)
    that only served the scanner. Drops the matching i18n keys in
    zh/en/ja and the fixInWebUI button label that only the banner used.
  • feat(tray): show cached provider usage in the system tray menu (#2184)
    * feat: add Rust-side write-through usage cache
    
    Introduce an in-memory UsageCache on AppState that the existing usage
    query commands populate on success. The cache is read-only to the rest
    of the app today; the next commit consumes it from the tray menu.
    
    - New services::usage_cache module with split maps: subscription keyed
      by AppType, script keyed by (AppType, provider_id).
    - AppType gains Eq + Hash so it can be used as a HashMap key.
    - commands::subscription::get_subscription_quota now takes State<AppState>
      and writes through on success (signature change is invisible to the
      frontend — Tauri injects State automatically).
    - commands::provider::queryProviderUsage body extracted into an inner
      async fn; the public command wraps it with write-through, covering
      Copilot, coding-plan, balance, and generic script paths uniformly.
    
    Cache is in-memory only; auto-query interval and the upcoming tray
    refresh action rebuild it after restarts.
    
    * feat(tray): surface cached usage in the system tray menu
    
    Read UsageCache populated by the previous commit and render it in three
    places, scoped to whatever TRAY_SECTIONS covers (Claude/Codex/Gemini):
    
    1. Inline suffix on each provider submenu item
       "AnyProvider  · 🟢 5h 18% / 7d 23%"
    2. Disabled summary row per visible app under "Show Main"
       "Claude · Anthropic Official · 🟢 5h 18% / 7d 23%"
    3. "Refresh all usage" menu item that triggers get_subscription_quota +
       queryProviderUsage for every applicable provider, then rebuilds the
       tray menu via the existing refresh_tray_menu path.
    
    Color encoding uses emoji (🟢 <70% / 🟠 70-89% / 🔴 ≥90%) since Tauri 2
    tray labels are plain text. Missing cache entry leaves the label
    unchanged — tray never issues network requests when opened. Three new
    i18n-ready strings live in TrayTexts (en/zh/ja), following the existing
    pattern for tray text.
    
    Closes #2178.
    
    * feat(usage): bridge tray UsageCache writes to frontend React Query
    
    Why: tray hover triggers backend-only refresh that wrote to UsageCache but
    never notified the frontend, leaving main UI stale while tray showed fresh
    numbers. Emit a payload-carrying event after each cache write so React Query
    can setQueryData directly, keeping both views in sync without duplicate fetches.
    
    * fix(tray): skip hidden apps on hover refresh and drop stale disabled-script cache
    
    Address P2 findings from automated review on #2184:
    
    1. refresh_all_usage_in_tray now filters TRAY_SECTIONS by settings.visible_apps
       before scheduling subscription/script queries, matching create_tray_menu and
       preventing wasted external API calls (and rate-limit/auth-error log noise)
       for apps the user has hidden.
    
    2. format_usage_suffix only trusts the script cache when provider.meta.usage_script
       is still enabled; when a script is disabled/removed the cached suffix is now
       invalidated so the tray label no longer shows stale data indefinitely.
    
    * refactor: consolidate codex provider helpers and fix test semantics
    
    - Add Provider::is_codex_oauth() and Provider::codex_fast_mode_enabled()
      to eliminate duplicated meta extraction in claude.rs and stream_check.rs
    - Fix non-codex-oauth tests to pass codex_fast_mode=false (was true, harmless
      but semantically misleading)
    - Remove redundant is_dir() guard after resolve_skill_source_dir already
      guarantees the returned path is a directory
    
    * style: apply cargo fmt
    
    * fix(tray): reflect failed refreshes in cache and support Gemini flash-lite
    
    Follow-up to the tray usage-display feature addressing review feedback:
    
    - Write snapshots for both Ok(success:false) and Err paths in
      queryProviderUsage / get_subscription_quota so stale success data
      no longer persists across failed refreshes; the original Err is
      still returned to the frontend onError handler.
    - Include gemini_flash_lite tier in the tray summary with label "l".
      This matches the frontend SubscriptionQuotaFooter and keeps the worst
      emoji correct when lite has the highest utilization.
    - Add TIER_GEMINI_PRO / _FLASH / _FLASH_LITE constants in
      services/subscription.rs and reuse them in classify_gemini_model
      and sort_order.
    - Extract Provider::has_usage_script_enabled() to remove the
      duplicated meta.usage_script chain at two call sites.
    - Use db.get_provider_by_id in refresh_all_usage_in_tray instead of
      materialising the full provider map, and parallelise subscription
      and script futures via futures::future::join.
    - Narrow refresh_all_usage_in_tray to each section's effective
      current provider (script if enabled, else subscription when the
      provider is official). Hover refreshes now issue at most
      TRAY_SECTIONS.len() outbound requests.
    - Add 10 unit tests in tray::tests covering Claude/Codex h/w dispatch,
      Gemini p/f/l dispatch (including lite-only and lite-worst cases),
      and success/failure guards.
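    
    The first bullet (snapshot on every outcome, error still propagated)
    might look roughly like this. It is a sketch under assumptions: the
    `UsageSnapshot` and `UsageCache` shapes and the `record_refresh` helper
    are hypothetical and do not mirror the real queryProviderUsage /
    get_subscription_quota code.
    
    ```rust
    use std::collections::HashMap;
    
    // Hypothetical snapshot shape, for illustration only.
    #[derive(Debug, Clone, PartialEq)]
    struct UsageSnapshot {
        success: bool,
        detail: String,
    }
    
    // Hypothetical cache keyed by provider id.
    #[derive(Default)]
    struct UsageCache {
        entries: HashMap<String, UsageSnapshot>,
    }
    
    /// Write a snapshot for every outcome so a failed refresh overwrites older
    /// success data, then hand the original result back to the caller.
    fn record_refresh(
        cache: &mut UsageCache,
        provider_id: &str,
        result: Result<UsageSnapshot, String>,
    ) -> Result<UsageSnapshot, String> {
        let snapshot = match &result {
            // Covers both success:true and Ok(success:false).
            Ok(snap) => snap.clone(),
            // An Err is stored as a failed snapshot carrying the message.
            Err(err) => UsageSnapshot { success: false, detail: err.clone() },
        };
        cache.entries.insert(provider_id.to_owned(), snapshot);
        result
    }
    ```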
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
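  • Fix/one-click configuration not working (#2249)
    * style(FailoverQueueManager): show provider notes
    
    * style(FailoverQueueItem): add provider notes field to support displaying notes
    
    * style(FailoverQueueManager): show provider notes
    
    * style(FailoverQueueItem): add provider notes field to support displaying notes
    
    * style(FailoverQueueManager): update the display style of provider notes
    
    * style(FailoverQueueItem): add conditional serialization to optimize the provider notes field
    
    * fix: improve model state management so config updates reference the latest settings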
    
    * fix(skill): improve error handling for skill source directory resolution
    
    Co-authored-by: Copilot <copilot@github.com>
    
    * fix(gemini): simplify project directory retrieval in scan_sessions function
    
    * fix(useModelState): optimize latestConfigRef assignment in useModelState hook
    
    * fix(useModelState): remove unnecessary blank line in useModelState hook
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
  • feat: Add Codex OAuth FAST mode toggle (#2210)
    * Add Codex OAuth FAST mode toggle
    
    * fix(codex-oauth): default FAST mode to off to avoid surprise quota burn
    
    service_tier="priority" consumes ChatGPT subscription quota at a higher
    rate. Users must now opt in explicitly rather than inherit FAST mode
    silently when this feature ships.
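    
    A minimal sketch of the opt-in default, assuming the toggle is stored
    as an optional flag and that leaving FAST mode off simply omits the
    field from the request. The `service_tier` helper is hypothetical; only
    the "priority" value comes from this commit.
    
    ```rust
    /// Hypothetical helper: map the stored toggle to the upstream
    /// `service_tier` value. A missing or false flag means no override.
    fn service_tier(codex_fast_mode: Option<bool>) -> Option<&'static str> {
        match codex_fast_mode {
            // Only an explicit opt-in requests the faster, costlier tier.
            Some(true) => Some("priority"),
            // Unset (existing providers) and false both stay on the default.
            _ => None,
        }
    }
    ```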
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
  • [codex] Stabilize Codex OAuth cache routing (#2218)
    * Stabilize Codex OAuth cache routing
    
    Codex OAuth-backed Claude proxy requests now reuse a client-provided
    session identity for prompt cache routing and send Codex-like session
    headers when that identity exists. Generated proxy UUIDs are
    intentionally excluded so they do not fragment cache locality.
    
    The same path exposed two runtime issues during validation: rustls
    needed an explicit process crypto provider, and Codex OAuth can return
    Responses SSE even when the original Claude request is non-streaming.
    Those are handled so cache-routed requests can complete instead of
    panicking or being parsed as JSON.
    
    Constraint: Official Codex uses conversation identity and Responses session headers for prompt cache routing.
    Rejected: Always use generated proxy session IDs | generated IDs change per request and reduce cache reuse.
    Confidence: medium
    Scope-risk: moderate
    Directive: Do not remove the client-provided-session guard unless generated session IDs become stable per conversation.
    Tested: cargo test codex_oauth
    Tested: Local dev app health check on 127.0.0.1:15721
    Tested: Local proxy logs showed cache_read_tokens after restart
    Not-tested: Full cargo test without local cc-switch port conflict
    Related: #2217
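    
    The client-provided-session guard can be pictured with a small sketch.
    The `SessionId` enum and `prompt_cache_key` function below are
    hypothetical names for illustration; the real routing code may
    represent session origin differently.
    
    ```rust
    // Hypothetical model of where a session id came from.
    #[allow(dead_code)]
    enum SessionId {
        /// Sent by the client, stable across one conversation.
        ClientProvided(String),
        /// Minted by the proxy, different on every request.
        Generated(String),
    }
    
    /// Return a cache-routing key only for a real client-provided identity.
    /// Generated ids change per request, so using them would fragment the
    /// upstream prompt cache instead of improving reuse.
    fn prompt_cache_key(session: &SessionId) -> Option<&str> {
        match session {
            SessionId::ClientProvided(id) if !id.is_empty() => Some(id.as_str()),
            _ => None,
        }
    }
    ```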
    
    * feat(proxy): aggregate forced Codex OAuth SSE into JSON for non-streaming clients
    
    Narrow override on top of #2235's streaming fallback.
    
    Codex OAuth always forces upstream openai_responses into SSE, even
    when the original Claude request is stream:false. #2235 handles this
    by routing such responses through the streaming transform so the
    client receives text/event-stream — that avoids the 422 that JSON
    parsing would produce, and it also protects any other provider that
    unexpectedly returns SSE (the response.is_sse() guard).
    
    But for Claude SDK callers that sent stream:false, returning SSE
    still violates the Anthropic non-streaming contract. This commit
    adds an override on exactly one combination — non-streaming client
    + codex_oauth + openai_responses — to aggregate the upstream
    Responses SSE into a synthetic Responses JSON and then run the
    regular responses_to_anthropic non-streaming transform. All other
    paths, including the generic response.is_sse() fallback, remain
    on the streaming path from #2235.
    
    The aggregator reuses proxy::sse::take_sse_block / strip_sse_field,
    which support both \n\n and \r\n\r\n delimiters; a hand-rolled
    split("\n\n") would silently fail on real HTTPS upstreams.
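    
    To illustrate why the delimiter handling matters, here is a minimal
    std-only sketch of a block splitter that accepts both delimiters. It
    is not the implementation of proxy::sse::take_sse_block or
    strip_sse_field; `take_block` and `strip_field` are hypothetical
    helpers written only for this example.
    
    ```rust
    /// Remove and return the next complete SSE block from `buffer`, accepting
    /// either "\n\n" or "\r\n\r\n" as the terminator (whichever comes first).
    fn take_block(buffer: &mut String) -> Option<String> {
        let lf = buffer.find("\n\n").map(|i| (i, 2));
        let crlf = buffer.find("\r\n\r\n").map(|i| (i, 4));
        let (idx, len) = match (lf, crlf) {
            (Some(a), Some(b)) => if b.0 < a.0 { b } else { a },
            (Some(a), None) => a,
            (None, Some(b)) => b,
            // No terminator yet: leave the partial block in the buffer.
            (None, None) => return None,
        };
        let block = buffer[..idx].to_string();
        buffer.drain(..idx + len);
        Some(block)
    }
    
    /// Return the value of `field` on one SSE line, dropping the single
    /// optional space after the colon.
    fn strip_field<'a>(line: &'a str, field: &str) -> Option<&'a str> {
        let rest = line.strip_prefix(field)?.strip_prefix(':')?;
        Some(rest.strip_prefix(' ').unwrap_or(rest))
    }
    ```
    
    A plain split on "\n\n" never matches inside "\r\n\r\n", so a
    CRLF-delimited stream would be treated as one unfinished block.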
    
    Tests cover the happy path, CRLF delimiters, response.failed
    errors, and the missing response.completed defensive branch.
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>
  • fix: use TOML parser instead of regex for Codex model extraction (#2222) (#2227)
    * fix(codex): use TOML parser instead of regex for model extraction
    
    The regex only matched model=... on the first line; the TOML parser
    handles multi-line TOML correctly.
    
    Fixes #2222
    
    * fix(stream_check): drop unused regex::Regex import
    
    The previous commit replaced the only Regex usage in stream_check.rs
    with toml::Table parsing, leaving `use regex::Regex;` orphaned.
    Without this removal, `cargo clippy -- -D warnings` (run in CI)
    fails with `unused import: regex::Regex`.
    
    ---------
    
    Co-authored-by: Jason <farion1231@gmail.com>