Commit Graph

1536 Commits

  • fix(proxy/gemini): prefer exact tool-call id over normalized-name fallback
    The shadow-turn matcher used a three-branch `||` chain (id / full name /
    normalized name). When two tools share a suffix (e.g. `server_a:search`
    and `server_b:search`), the normalized-name clause could short-circuit
    on an earlier turn whose id is actually wrong for the incoming tool_use,
    mis-routing replay state (functionCall id / thoughtSignature) for later
    tool_result resolution.
    
    Split matching into two layers: when the incoming message carries any
    tool_use ids, run id-based lookup first and return on the earliest hit.
    Only fall back to full-name / normalized-name matching when the incoming
    ids are absent or none of them resolve.
    
    Add two regressions:
    
    - shadow_replay_prefers_exact_id_match_over_normalized_name_collision
      Two shadow turns with colliding normalized names and two assistant
      messages whose ids cross the positional order; asserts each message
      replays the id-correct shadow turn (including thoughtSignature).
    
    - shadow_replay_falls_back_to_name_when_ids_absent
      Shadow turn with no id and incoming tool_use with an empty id;
      asserts the name fallback still populates the replayed part.
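    
    A minimal sketch of the two-layer lookup described above, reduced to a
    single incoming tool_use per message. `ShadowTurn`, `normalized_name`,
    and `find_shadow_turn` are hypothetical names, and the "text after the
    last `:`" normalization rule is an assumption, not the cc-switch code.

```rust
/// Hypothetical, simplified shadow turn: one recorded tool call per turn.
#[derive(Debug, Clone, PartialEq)]
struct ShadowTurn {
    id: Option<String>,
    name: String,
    thought_signature: Option<String>,
}

/// Assumed normalization: keep only the text after the last ':' so that
/// `server_a:search` and `server_b:search` both become `search`.
fn normalized_name(name: &str) -> &str {
    name.rsplit(':').next().unwrap_or(name)
}

/// Layer 1: exact id match. Layer 2 runs only when the id is absent, empty,
/// or does not resolve: full name first, then normalized name.
fn find_shadow_turn<'a>(
    turns: &'a [ShadowTurn],
    incoming_id: Option<&str>,
    incoming_name: &str,
) -> Option<&'a ShadowTurn> {
    let incoming_id = incoming_id.filter(|id| !id.is_empty());
    if let Some(id) = incoming_id {
        if let Some(hit) = turns.iter().find(|t| t.id.as_deref() == Some(id)) {
            return Some(hit);
        }
    }
    turns
        .iter()
        .find(|t| t.name == incoming_name)
        .or_else(|| {
            turns
                .iter()
                .find(|t| normalized_name(&t.name) == normalized_name(incoming_name))
        })
}
```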
  • fix(proxy/gemini): reconcile synthesized tool-call ids with later real ids + preserve thoughtSignature
    Three related findings on `streaming_gemini.rs` for Gemini's cumulative
    `streamGenerateContent` stream, all centered on `merge_tool_call_snapshots`:
    
    1. (P1) Match upgraded tool-call IDs by position.
       When Gemini delivers a `functionCall` without an id on chunk 1
       (cc-switch synthesizes `gemini_synth_*`) and then upgrades it to a
       real id on chunk 2, the `Some(incoming_id)` branch only matched by
       id and missed the existing synthesized snapshot. A second entry
       would be pushed, yielding duplicate `tool_use` content blocks at
       stream end — one with the synthesized id, one with the real id —
       which could trigger duplicate tool execution and break tool_result
       correlation. Add a positional fallback: when no id match exists but
       the same-position slot holds a synthesized id, merge into it.
       `or(preserved_id)` already lets the real id win the merge.
    
    2. (P2) Preserve prior thoughtSignature when merging snapshots.
       `tool_call_snapshots[index] = tool_call` overwrote the slot
       entirely, dropping any `thoughtSignature` captured on an earlier
       chunk if the current cumulative snapshot omitted it. Since
       `build_shadow_assistant_parts` writes `thoughtSignature` into the
       shadow turn from `tool_call.thought_signature`, a dropped signature
       would cause later replay requests to Gemini to be rejected with
       invalid-signature errors. Preserve the existing signature when the
       incoming chunk does not carry one.
    
    3. (P2) Document the part-order streaming trade-off.
       All `tool_use` content blocks are emitted after the final text
       `content_block_stop`, so interleaved [text, functionCall, text,
       functionCall] parts arrive at the Anthropic client as [text(concat),
       tool_use, tool_use] — different from the non-streaming transformer,
       which preserves part order. This is intentional given the cumulative
       snapshot model and the consumers we target (claude-code-like clients
       don't depend on strict interleaving for tool execution correctness).
       Add a block comment at the flush site describing the trade-off and
       what a strict-order fix would entail, so this isn't rediscovered as
       a bug later.
    
    Regression tests:
    - upgraded_real_id_merges_into_existing_synthesized_snapshot
    - thought_signature_preserved_when_later_chunk_omits_it
    
    Test count: 868 -> 870. clippy 1.95 clean. fmt clean.
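    
    Findings 1 and 2 can be sketched together. This is an illustration
    under simplifying assumptions, not the code in `streaming_gemini.rs`:
    `ToolCall` and `merge_snapshot` are hypothetical, and the incoming call
    is assumed to always carry an id.

```rust
const SYNTH_PREFIX: &str = "gemini_synth_";

#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    thought_signature: Option<String>,
}

/// Merge one incoming cumulative tool call, found at `position` in the
/// chunk's parts, into the running snapshots.
fn merge_snapshot(snapshots: &mut Vec<ToolCall>, position: usize, mut incoming: ToolCall) {
    // Prefer an exact id match.
    let by_id = snapshots.iter().position(|s| s.id == incoming.id);
    // Otherwise reuse the same-position slot, but only while it still
    // holds a synthesized id that the real id is allowed to replace.
    let slot = by_id.or_else(|| {
        snapshots
            .get(position)
            .filter(|s| s.id.starts_with(SYNTH_PREFIX))
            .map(|_| position)
    });
    match slot {
        Some(index) => {
            // Keep the earlier signature when this chunk omits it.
            if incoming.thought_signature.is_none() {
                incoming.thought_signature = snapshots[index].thought_signature.take();
            }
            snapshots[index] = incoming;
        }
        None => snapshots.push(incoming),
    }
}
```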
  • chore(lint): address clippy 1.95 findings in existing modules
    CI upgraded to Rust 1.95 and flagged ten pre-existing warnings that
    older toolchains did not enforce. None relate to the Gemini proxy
    integration PR itself but they block CI on the feature branch, so
    clean them up here as a separate commit for easy review:
    
    collapsible_match:
    - proxy/providers/gemini_schema.rs: `"items" if value.is_object()`
      match guard instead of nested if.
    - proxy/providers/transform_responses.rs: fold
      `map_responses_stop_reason`'s `"completed"` / `"incomplete"` arms
      into match guards, relying on the existing `_ => "end_turn"` fall-
      through for non-matching guard conditions (semantics preserved).
    - services/session_usage_codex.rs: fold
      `"session_meta" if state.session_id.is_none()` guard, relying on
      the existing `_ => {}` fall-through.
    
    unnecessary_sort_by:
    - services/provider/endpoints.rs: `sort_by_key(|ep| Reverse(ep.added_at))`.
    - services/skill.rs (backup list): same Reverse idiom on `created_at`.
    - services/skill.rs (skill listings x2): `sort_by_key(|s| s.name.to_lowercase())`.
    
    useless_conversion:
    - services/skill.rs: drop the explicit `.into_iter()` on `zip`'s argument.
    
    while_let_loop:
    - services/webdav_auto_sync.rs: `while let Some(wait_for) = ...`
      instead of `loop { let Some(...) = ... else { break }; ... }`.
    
    All changes are mechanical and preserve behavior. `cargo test --lib`
    remains green (868 passed).
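    
    Two of the idioms above, shown in isolation. `Endpoint`, `State`, and
    both functions are hypothetical stand-ins; only the `Reverse` sort key
    and the `"session_meta" if ...` guard shape come from the list above,
    and the arm body is an assumption.

```rust
use std::cmp::Reverse;

#[derive(Debug, Clone, PartialEq)]
struct Endpoint {
    url: String,
    added_at: u64,
}

/// unnecessary_sort_by: sort newest first with a `Reverse` key.
fn sort_newest_first(endpoints: &mut [Endpoint]) {
    endpoints.sort_by_key(|ep| Reverse(ep.added_at));
}

#[derive(Debug, Default)]
struct State {
    session_id: Option<String>,
}

/// collapsible_match: the condition moves into a match guard; when the
/// guard is false the value falls through to the existing `_` arm.
fn handle_event(state: &mut State, kind: &str, value: &str) {
    match kind {
        "session_meta" if state.session_id.is_none() => {
            state.session_id = Some(value.to_string());
        }
        _ => {}
    }
}
```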
  • test(proxy/gemini): pin non-full-URL versioned relay base stripping
    Adds two regression tests that lock in the intentional asymmetry
    between full-URL and non-full-URL modes:
    
    - Full-URL mode: opaque base path (e.g. `https://relay.example/custom/v1beta`)
      is preserved verbatim. Already covered by
      `preserves_opaque_full_url_with_bare_v1beta_suffix`.
    - Non-full-URL mode: base path MUST strip `/v1`, `/v1beta`, etc. so the
      standard `/v1beta/models/{model}:method` endpoint can be appended
      without producing a doubled `/v1beta/v1beta/models/...` path.
    
    The non-full-URL contract is "base URL + cc-switch appends the
    canonical Gemini endpoint". A user who needs a relay's custom
    namespace (e.g. `/v1/models/...`) must use full-URL mode and paste
    the complete method path. This commit adds regression coverage so a
    future attempt to mirror full-URL's host-whitelist gating into
    `normalize_gemini_base_path` will fail the test suite immediately.
  • fix(proxy/gemini): gate generic REST suffix stripping behind Google host in non-full-URL mode
    `build_gemini_native_url` unconditionally stripped `/v1`, `/v1beta`,
    `/models`, and `/openai` suffixes from the base path regardless of
    host. This worked for Google's own endpoints but silently rewrote
    third-party relay URLs like `https://relay.example/custom/v1` to
    `.../custom/v1beta/models/...`, breaking any relay that mounts its
    Gemini-compatible namespace under a versioned prefix.
    
    The result was also asymmetric with the previously-fixed full-URL
    branch: toggling the "full URL" switch changed the outbound URL for
    the same base_url, which is exactly the kind of invisible behavior
    that makes debugging proxy deployments painful.
    
    Align `normalize_gemini_base_path` with
    `should_normalize_gemini_full_url`'s layered model:
    
    - Unconditional: `/models/...:method` structured paths and deep
      OpenAI-compat endpoints (`/openai/chat/completions`,
      `/openai/responses` and their versioned variants) — these are
      unambiguous Gemini-specific grammar on any host.
    - Google-host gated: generic `/v1`, `/v1beta`, `/models`, `/openai`
      suffixes only get stripped on `generativelanguage.googleapis.com`,
      `aiplatform.googleapis.com`, or `*-aiplatform.googleapis.com`.
      Other hosts preserve the prefix verbatim so relays keep their
      intended routing.
    
    Adds seven regression tests for the non-full-URL flow: opaque relay
    preservation (v1 / v1beta / models / openai suffix variants), Google
    host normalization (counter-case), and boundary cases (structured
    method path and deep OpenAI-compat endpoint stripped regardless of
    host).
    
    Test count: 864 -> 873.
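    
    A sketch of the Google-host gated layer as this commit message
    describes it. `strip_generic_suffix` is a hypothetical helper; the
    unconditional layer (structured method paths and deep OpenAI-compat
    endpoints) is omitted, and only one trailing suffix is removed.

```rust
/// Generic REST suffixes named in the commit message.
const GENERIC_SUFFIXES: &[&str] = &["/v1beta", "/v1", "/models", "/openai"];

/// Strip one generic suffix from a base path, but only when the caller has
/// already established that the host is a Google endpoint.
fn strip_generic_suffix(path: &str, host_is_google: bool) -> &str {
    let path = path.trim_end_matches('/');
    if !host_is_google {
        // Relays keep their prefix verbatim.
        return path;
    }
    for suffix in GENERIC_SUFFIXES {
        if let Some(stripped) = path.strip_suffix(*suffix) {
            return stripped;
        }
    }
    path
}
```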
  • fix(proxy/gemini): trim API key before provider-type detection and OAuth parsing
    Leading whitespace on a copied oauth_creds.json (e.g. a leading newline
    picked up when the user copies the file content as-is) would slip past the
    `starts_with("ya29.") || starts_with('{')` prefix check in
    `ClaudeAdapter::provider_type`, causing the provider to be misclassified
    as raw-API-key Gemini and fall back to `x-goog-api-key` with the raw
    JSON as the key — which upstream rejects with 401.
    
    The frontend's `handleApiKeyChange` already trims on keystrokes but
    deep-link imports, the JSON editor, and live-config backfill all bypass
    that path. Trim at every backend extraction point so the coverage is
    uniform:
    
    - `ClaudeAdapter::extract_key` (5 env / fallback branches) gets
      `.map(str::trim)` before `.filter(|s| !s.is_empty())` so that
      whitespace-only values are also treated as missing.
    - `GeminiAdapter::extract_key_raw` gets the same chain (including
      the `.filter` it was missing before).
    - `GeminiAdapter::parse_oauth_credentials` gets a defensive
      `let key = key.trim();` at the entry as a belt-and-suspenders guard.
    
    Adds two regression tests covering JSON and bare `ya29.` keys with
    leading newline/space.
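    
    The trim-then-filter chain and the prefix check quoted above, as a
    standalone sketch. The free functions `extract_key` and
    `looks_like_oauth` are simplified stand-ins for the adapter methods,
    not their actual signatures.

```rust
/// Trim first, then treat empty and whitespace-only values as missing.
fn extract_key(raw: Option<&str>) -> Option<&str> {
    raw.map(str::trim).filter(|s| !s.is_empty())
}

/// The prefix check from the commit message, applied to the trimmed key so
/// that a leading newline or space no longer hides the OAuth shape.
fn looks_like_oauth(key: &str) -> bool {
    let key = key.trim();
    key.starts_with("ya29.") || key.starts_with('{')
}
```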
  • fix(proxy/gemini): gate /v1beta behind Google host + normalize models/ model id prefix
    Two related P2 corrections to the Gemini Native URL surface, both
    folding into the existing Google-host-whitelist architecture.
    
    ## P2a — `/v1beta` suffix should not unconditionally trigger rewrite
    
    `should_normalize_gemini_full_url` placed `/v1beta` and `/v1beta/models`
    in the unconditional layer on the reasoning that `/v1beta` is
    Google-specific. In practice an opaque relay fronting a non-Gemini
    service at `https://relay.example/custom/v1beta` would still be
    silently rewritten to `/v1beta/models/{model}:generateContent`,
    breaking the deployment.
    
    Move `/v1beta`, `/v1beta/models`, and `/v1beta/openai` into the
    Google-host gated layer alongside `/v1`, `/models`, and friends. The
    unconditional layer now only accepts paths whose grammar is
    intrinsically Gemini — `/models/...:generateContent` method calls and
    the deep OpenAI-compat endpoints like `/openai/chat/completions` and
    `/openai/responses`. Pasted AI-Studio URLs such as
    `https://generativelanguage.googleapis.com/v1beta` still normalize
    because the host matches the whitelist.
    
    ## P2b — `model: "models/gemini-2.5-pro"` produced doubled path prefix
    
    Gemini SDKs (and the official `list_models` response) commonly surface
    model ids in resource-name form `models/gemini-2.5-pro`. Raw
    interpolation into `format!("/v1beta/models/{model}:...")` produced
    `/v1beta/models/models/gemini-2.5-pro:streamGenerateContent` which
    upstream rejects — yielding false-negative health checks for otherwise
    valid provider configs.
    
    Introduce `normalize_gemini_model_id(&str) -> &str` in `gemini_url`
    as the single source of truth: strips an optional leading `/` then an
    optional `models/` prefix, leaving bare ids untouched. Apply in the
    three call sites that build a Gemini method URL:
    - `services/stream_check.rs::resolve_claude_stream_url` (unified path)
    - `services/stream_check.rs::check_gemini_stream` (Gemini-only path)
    - `proxy/forwarder.rs::rewrite_claude_transform_endpoint` (production)
    
    Tests (9 new):
    - `gemini_url`: 3 regressions for opaque vs Google-host `/v1beta*`
      handling + 5 unit tests pinning `normalize_gemini_model_id` behavior
      (strip prefix, leave bare id, preserve nested slashes past the one
      stripped prefix, tolerate leading slash, pass through empty input).
    - `stream_check`: one end-to-end regression confirming
      `models/gemini-2.5-pro` collapses to the expected single-prefix URL.
    - `forwarder`: one end-to-end regression on the production rewrite
      path.
    
    All 864 lib tests pass; cargo fmt + clippy -D warnings clean.
    
    Addresses Codex P2 feedback on #1918.
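    
    One way the behavior described for `normalize_gemini_model_id` could
    be written with `str::strip_prefix`. The body is a sketch based on the
    description above, not the committed implementation; `method_path` is
    a hypothetical helper added to show the call site.

```rust
/// Strip an optional leading '/', then one optional `models/` prefix.
/// Bare ids and empty input pass through unchanged.
fn normalize_gemini_model_id(model: &str) -> &str {
    let model = model.strip_prefix('/').unwrap_or(model);
    model.strip_prefix("models/").unwrap_or(model)
}

/// Hypothetical call site: build the method path from a normalized id.
fn method_path(model: &str, method: &str) -> String {
    format!("/v1beta/models/{}:{}", normalize_gemini_model_id(model), method)
}
```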
  • refactor(proxy/gemini): share build_anthropic_usage between stream and non-stream paths
    `streaming_gemini::anthropic_usage_from_gemini` and
    `transform_gemini::build_anthropic_usage` were byte-for-byte identical
    (32 lines each) — both converting Gemini `usageMetadata` into the
    Anthropic `usage` shape including `cache_read_input_tokens` mapping.
    
    Promote the non-streaming version to `pub(crate)` and reuse it from the
    streaming SSE converter. Removes ~30 lines of duplication and guarantees
    the two paths cannot drift apart.
    
    No behavioral change; all 854 lib tests pass; cargo fmt + clippy -D
    warnings clean.
  • fix(proxy/gemini): align shadow id with client-visible id in non-streaming path
    When Gemini returns a `functionCall` without an id (common in 2.x
    parallel calls), `gemini_to_anthropic_with_shadow_and_hints` previously
    generated TWO independent synthesized UUIDs:
    
      1. Lines 186-197: synthesized id `A` used for the Anthropic-visible
         `content[tool_use].id` returned to the client.
      2. Lines 850-881: `extract_tool_call_meta` independently synthesized
         id `B ≠ A`, which populated `shadow_turn.tool_calls[i].id`.
    
    `shadow_content` (line 225-228, cloned from `rectified_parts`) retained
    the original missing/empty id. Result: the client sees id `A`, the
    shadow store holds id `B`.
    
    On the next turn, `convert_messages_to_contents` builds
    `tool_name_by_id` from `build_tool_name_map_from_shadow_turns`, which
    uses `tool_calls[i].id` — so the map contains `B → name` but not
    `A → name`. When the client sends back `tool_result(tool_use_id=A)`,
    resolution fails with:
    
      Unable to resolve Gemini functionResponse.name for tool_use_id `A`
    
    This affects both truncated histories (client sends only the
    tool_result) and full histories (shadow-replay branch at line 342-354
    skips `convert_message_content_to_parts`, so the assistant tool_use
    block never registers id `A` itself).
    
    Fix: make `rectified_parts` the single source of truth. After
    `rectify_tool_call_parts`, run a pre-pass that writes
    `synthesize_tool_call_id()` back into any `functionCall` that lacks a
    non-empty id. All three readers — the content builder (186-197), the
    shadow_content clone (225-228), and `extract_tool_call_meta` — then
    observe the same id. `shadow_parts()` already strips synthesized ids on
    replay (line 616-628), so the internal identifier never leaks to
    Gemini upstream.
    
    This mirrors the streaming path, which already has single-source-of-
    truth semantics via `tool_call_snapshots` in `streaming_gemini.rs` —
    no change needed there.
    
    Tests (5 new in `transform_gemini::tests`):
    - `non_stream_shadow_id_matches_client_visible_id`: asserts
      `response.content[0].id == shadow.tool_calls[0].id ==
      shadow.assistant_content.parts[0].functionCall.id`.
    - `non_stream_missing_id_scenario_a_truncated_history_resolves`: turn 2
      sends only `[tool_result(id=A)]`; resolution must succeed.
    - `non_stream_missing_id_scenario_b_full_history_replay_resolves`: turn 2
      sends `[assistant(tool_use=A), tool_result(A)]`; shadow-replay branch
      strips the synth id from outgoing `functionCall` while still
      resolving the subsequent `tool_result`.
    - `non_stream_preserves_original_gemini_id_when_present`: regression —
      genuine Gemini ids flow through unchanged.
    - `non_stream_synthesized_id_not_leaked_to_gemini_via_shadow_replay`:
      defensive — shadow-replay path must strip synth ids from both
      `functionCall.id` and `functionResponse.id`.
    
    All 854 lib tests pass; cargo fmt + clippy -D warnings clean.
    
    Addresses Codex follow-up P1 on #1918.
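    
    A sketch of the write-back pre-pass: ids are synthesized once, in
    place, so every later reader sees the same value. `FunctionCallPart`,
    `next_synth_id`, and `backfill_missing_ids` are hypothetical, and a
    process-wide counter stands in for the UUID so the sketch needs only
    the standard library.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_SYNTH: AtomicU64 = AtomicU64::new(0);

/// Stand-in for the real id synthesizer (assumption: counter, not UUID).
fn next_synth_id() -> String {
    format!("gemini_synth_{}", NEXT_SYNTH.fetch_add(1, Ordering::Relaxed))
}

#[derive(Debug, Clone, PartialEq)]
struct FunctionCallPart {
    name: String,
    id: Option<String>,
}

/// Give every call that lacks a non-empty id a synthesized one. Both the
/// client-visible content and the shadow copy are then built from `parts`.
fn backfill_missing_ids(parts: &mut [FunctionCallPart]) {
    for part in parts.iter_mut() {
        let missing = part.id.as_deref().map_or(true, str::is_empty);
        if missing {
            part.id = Some(next_synth_id());
        }
    }
}
```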
  • fix(proxy/gemini): gate generic REST path suffixes behind Google host whitelist
    `should_normalize_gemini_full_url` previously treated any full URL whose
    path ends with `/v1`, `/v1/models`, `/models`, `/v1/openai`, or `/openai`
    as a structured Gemini endpoint and rewrote it to
    `/v1beta/models/{model}:generateContent`. These are ubiquitous REST
    conventions — opaque relays such as `https://relay.example/custom/v1`
    legitimately use them for fixed endpoints — so the rewrite silently
    routed traffic to the wrong upstream path.
    
    Split the predicate into two layers:
    
    - **Unconditional**: `matches_structured_gemini_models_path` (i.e. a
      `/models/...:generateContent` method call anywhere in the path), the
      Google-specific `/v1beta*` family, and the deep OpenAI-compat paths
      (`/v1beta/openai/chat/completions`, `/openai/chat/completions`, and
      their `responses` siblings). These remain host-agnostic because the
      path grammar itself is Gemini-specific.
    - **Google-host gated**: `/v1`, `/v1/models`, `/models`, `/v1/openai`,
      `/openai`. Only normalized when the host is one of
      `generativelanguage.googleapis.com`, `aiplatform.googleapis.com`, or a
      real `*-aiplatform.googleapis.com` Vertex regional endpoint. The match
      is exact/suffix (not `contains`), so lookalike hosts like
      `aiplatform.example.com` are correctly treated as opaque relays.
    
    Tests (8 new in `gemini_url::tests`):
    - Four opaque-relay cases: `/custom/v1`, `/custom/models`,
      `/custom/v1/models`, `/custom/openai` — all preserved as-is.
    - Three Google-host counter-cases: `/v1`, `/models`, and
      `us-central1-aiplatform.googleapis.com/v1` still normalize.
    - One lookalike safety case: `aiplatform.example.com/v1` is NOT
      treated as Google.
    
    All 849 lib tests pass; cargo fmt + clippy -D warnings clean.
    
    Addresses Codex review P2 on #1918.
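    
    The exact/suffix host match can be sketched as follows.
    `is_google_gemini_host` is a hypothetical name; lowercasing the host
    and requiring a non-empty region prefix are assumptions.

```rust
/// True for the Google hosts listed above. Matching is exact or by suffix,
/// never `contains`, so `aiplatform.example.com` stays an opaque relay.
fn is_google_gemini_host(host: &str) -> bool {
    let host = host.to_ascii_lowercase();
    if host == "generativelanguage.googleapis.com" || host == "aiplatform.googleapis.com" {
        return true;
    }
    // Vertex regional endpoints, e.g. us-central1-aiplatform.googleapis.com.
    host.strip_suffix("-aiplatform.googleapis.com")
        .map_or(false, |region| !region.is_empty())
}
```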
  • fix(proxy/gemini): treat empty-string functionCall id as missing in streaming path
    Follow-up to the earlier P1 fix: some Gemini relays serialize an absent
    functionCall id as `"id": ""` instead of omitting the field. The
    non-streaming `extract_tool_call_meta` already filters these via
    `.filter(|s| !s.is_empty())`, but the streaming counterpart
    `extract_tool_calls` passed the empty string straight through
    `function_call.get("id").and_then(|v| v.as_str())` into
    `GeminiToolCallMeta::new`, producing a `Some("")` id.
    
    Downstream, `merge_tool_call_snapshots` would then match two parallel
    no-id calls against each other on their shared empty-string id,
    collapsing them into a single snapshot (silent data loss for the first
    call) and emitting an Anthropic `tool_use.id: ""` that breaks tool_result
    correlation on the Claude Code client.
    
    Fix:
    - `extract_tool_calls`: apply the same `filter(|s| !s.is_empty())` guard
      used in the non-streaming path so empty strings become `None` before
      reaching the shadow meta.
    - `merge_tool_call_snapshots`: defensively collapse any incoming
      `Some("")` to `None` up front — keeps the "missing vs present" invariant
      local to the merge step for future callers that might build
      `GeminiToolCallMeta` by hand.
    
    Tests (2 new, both in streaming_gemini):
    - `parallel_empty_string_id_calls_are_treated_as_missing_and_preserved`
      covers two parallel calls with explicit `"id": ""` — asserts both
      surface, no empty tool_use id leaks, and each gets a unique
      `gemini_synth_` id.
    - `single_empty_string_id_tool_call_gets_synthesized_id` covers the
      non-parallel degraded-relay case.
    
    All 841 lib tests pass; cargo fmt + clippy -D warnings clean.
    
    Addresses Codex follow-up P1 on #1918.
  • fix(proxy/gemini): narrow URL normalization + guard empty OAuth access_token
    P2a — Preserve opaque relay URLs that contain `/v1/models/` prefixes.
    
    `should_normalize_gemini_full_url` previously flagged any full URL whose
    path merely contained `/v1beta/models/` or `/v1/models/` as a structured
    Gemini endpoint, forcing rewrite to `.../v1beta/models/{model}:method`.
    This silently dropped legitimate relay route segments (e.g.
    `https://relay.example/v1/models/invoke` → `.../v1beta/models/...:generateContent`,
    losing `/invoke`) and sent traffic to the wrong upstream path.
    
    Replace the bare `contains(...)` checks with
    `matches_structured_gemini_models_path`, which requires the
    `/models/` segment to be followed by a canonical Gemini method call
    (`*:generateContent` or `*:streamGenerateContent`). The
    `matches_bare_gemini_models_path` helper is generalized (and renamed) to
    handle both `/v1beta/models/` and `/v1/models/` alongside the original
    bare `/models/` shape.
    
    P2b — Reject empty Gemini OAuth access_tokens before they reach the
    bearer header.
    
    `GeminiAdapter::parse_oauth_credentials` accepts refresh-token-only JSON
    (and surfaces `{"access_token": "", ...}` for expired credentials) with
    `access_token` defaulting to `""`. The Claude adapter's GeminiCli branch
    then called `AuthInfo::with_access_token(key, creds.access_token)`
    unconditionally, so the bearer-header builder at
    `AuthStrategy::GoogleOAuth` resolved to `Authorization: Bearer ` — a
    deterministic 401 from upstream.
    
    CC Switch does not currently exchange the refresh_token for a fresh
    access_token (`OAuthCredentials::needs_refresh` / `can_refresh` are
    annotated `#[allow(dead_code)]`). Until that exists, only attach
    `access_token` when it is non-empty; fall back to plain GoogleOAuth
    strategy with the raw key and log a warn pointing users at
    `~/.gemini/oauth_creds.json` so the failure mode is observable.
    
    Tests:
    - gemini_url.rs: three new regressions — opaque `/v1/models/invoke`,
      opaque `/v1beta/models/route`, and the positive counter-case where a
      structured `/v1/models/...:generateContent` path still normalizes.
    - claude.rs: three new `test_extract_auth_gemini_cli_*` tests covering
      refresh-only JSON, empty-string access_token JSON, and the valid-JSON
      pass-through.
    
    All 839 lib tests pass; cargo fmt + clippy -D warnings clean.
    
    Addresses Codex review P2 findings on #1918.
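    
    A sketch of the stricter P2a predicate: a `/models/` segment counts as
    structured only when it is followed by `<model>:generateContent` or
    `<model>:streamGenerateContent`. `is_structured_models_path` is a
    hypothetical name, and the input is assumed to be a path without a
    query string.

```rust
/// True when the path contains `/models/<model>:<method>` with one of the
/// two canonical Gemini methods and a non-empty model segment.
fn is_structured_models_path(path: &str) -> bool {
    let Some((_, rest)) = path.split_once("/models/") else {
        return false;
    };
    ["generateContent", "streamGenerateContent"]
        .iter()
        .any(|method| {
            rest.strip_suffix(*method)
                .and_then(|r| r.strip_suffix(':'))
                .map_or(false, |model| !model.is_empty())
        })
}
```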
  • fix(proxy/gemini): synthesize unique ids for no-id tool calls + enforce object params schema
    P1 — Parallel tool calls without Gemini-assigned ids no longer collapse.
    Gemini 2.x native parallel `functionCall` entries may omit the `id` field.
    The previous `merge_tool_call_snapshots` fell back to matching by `name`,
    which silently merged two parallel calls to the same function into one
    entry — dropping the first call's args. The non-streaming path and shadow
    store further bottlenecked on empty-string ids: multiple `tool_use` blocks
    shared the same id, and `tool_name_by_id.get("")` could only return one
    mapping, causing later `tool_result` round-trips to fail with
    `Unable to resolve Gemini functionResponse.name` or bind to the wrong tool.
    
    Fix: introduce `synthesize_tool_call_id()` producing `gemini_synth_<uuid>`.
    Both streaming and non-streaming response paths now guarantee every
    Anthropic-visible tool_use carries a unique id. `merge_tool_call_snapshots`
    matches by id first, falling back to the `parts` array position (for the
    cumulative-streaming case) while preserving the synthesized id across
    chunks. `convert_message_content_to_parts` detects the synthetic prefix
    and strips the id from outbound `functionCall`/`functionResponse` so the
    internal identifier never leaks upstream. `shadow_parts` performs the
    same strip when replaying a recorded assistant turn.
    
    P2 — Vertex AI rejects empty `parameters` schemas. When an Anthropic tool
    arrives with missing or empty `input_schema`, the proxy used to emit
    `"parameters": {}` (no `type`), which fails Vertex AI validation with
    `functionDeclaration parameters schema should be of type OBJECT`.
    Contrary to the automated-review suggestion, the fix is not to omit
    `parameters` (that too is rejected) but to normalize to the canonical
    empty-object form `{type: "object", properties: {}}`.
    Refs: google-gemini/generative-ai-python#423, BerriAI/litellm#5055.
    
    Fix: new `ensure_object_schema` helper in `gemini_schema` promotes
    missing `type` to `"object"` and adds empty `properties` when absent,
    while leaving atomic (non-object) schemas untouched.
    
    Tests: seven new regressions covering parallel no-id calls, cumulative
    chunk id reuse, synthetic-id round-trip both directions, shadow replay
    id stripping, and the three Vertex-AI schema shapes.
    
    The two existing wrapper functions (`gemini_to_anthropic` and
    `gemini_to_anthropic_with_shadow`) gain `#[allow(dead_code)]` to clear
    a pre-existing clippy -D warnings failure — they are part of the public
    transform API surface and intentionally kept for future callers.
    
    Addresses Codex review P1/P2 on #1918.
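    
    The outbound strip can be sketched as a small helper that decides
    whether an id may be sent upstream. `outbound_id` is a hypothetical
    name; only the `gemini_synth_` prefix comes from the description above.

```rust
const SYNTH_PREFIX: &str = "gemini_synth_";

/// Return the id to put on an outbound `functionCall`/`functionResponse`,
/// or `None` when it is empty or was synthesized locally.
fn outbound_id(id: &str) -> Option<&str> {
    if id.is_empty() || id.starts_with(SYNTH_PREFIX) {
        None
    } else {
        Some(id)
    }
}
```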
  • style: apply cargo fmt to pass Backend Checks CI
    Wrap prompt_cache_key chained call across lines per rustfmt default
    formatting. Pure formatting change, no behavior difference.
  • Keep Gemini tool replay stable across Claude request boundaries
    Claude Code follow-up requests were still falling back to locally reconstructed functionCall parts, which dropped Gemini thought signatures and triggered INVALID_ARGUMENT errors from the official Gemini API. The replay path needed to survive real Claude request boundaries, not just idealized in-process test flows.
    
    This change makes Claude requests reuse X-Claude-Code-Session-Id as the shadow session key, records streamed Gemini tool turns before tool_use events are fully drained, and matches assistant tool_use turns to shadow state by tool_use id and normalized tool name before positional fallback. Together these fixes keep thoughtSignature-bearing Gemini tool calls available for the next request in the loop.
    
    Constraint: Claude Code sends a stable X-Claude-Code-Session-Id header while metadata.session_id may be absent on follow-up requests
    Rejected: Rely on metadata-only Claude session extraction | generated fresh session ids and broke cross-request shadow replay
    Rejected: Record Gemini shadow only after streaming completes | loses the race when the client sends the next request immediately after tool_use
    Confidence: high
    Scope-risk: narrow
    Reversibility: clean
    Directive: Preserve Gemini shadow continuity across requests by keying Claude sessions from the header first and persisting tool-call shadow before yielding tool_use events downstream
    Tested: cargo fmt --manifest-path src-tauri/Cargo.toml --all; cargo test --manifest-path src-tauri/Cargo.toml test_extract_session_from_claude_header; cargo test --manifest-path src-tauri/Cargo.toml test_extract_session_from_claude_header_precedes_metadata; cargo test --manifest-path src-tauri/Cargo.toml stores_tool_shadow_before_tool_use_events_are_fully_drained; cargo test --manifest-path src-tauri/Cargo.toml shadow_replay_matches_tool_use_turn_by_id_when_position_drifts; cargo test --manifest-path src-tauri/Cargo.toml shadow_replay_aligns_to_latest_turns_after_client_truncation
    Not-tested: Full src-tauri test suite without test filters; live end-to-end Gemini relay after this exact commit hash
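    
    A sketch of the header-first session key described in the Directive.
    `shadow_session_key` and `non_blank` are hypothetical names, and the
    header map is assumed to store lowercased names.

```rust
use std::collections::HashMap;

/// Treat empty and whitespace-only values as absent.
fn non_blank(value: &str) -> Option<&str> {
    let value = value.trim();
    if value.is_empty() {
        None
    } else {
        Some(value)
    }
}

/// Prefer the stable X-Claude-Code-Session-Id header; fall back to
/// metadata.session_id only when the header is missing or blank.
fn shadow_session_key<'a>(
    headers: &'a HashMap<String, String>,
    metadata_session_id: Option<&'a str>,
) -> Option<&'a str> {
    headers
        .get("x-claude-code-session-id")
        .and_then(|v| non_blank(v))
        .or_else(|| metadata_session_id.and_then(non_blank))
}
```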
  • Prevent Gemini review regressions in streaming and tool rectification
    PR #1918 review feedback exposed two correctness issues in the Gemini Native adapter path. Gemini SSE buffering was still using lossy UTF-8 decoding, which could corrupt split multibyte payloads and drop streamed output. Tool arg rectification also removed top-level parameters eagerly, which broke tools that legitimately define a parameters field.
    
    This change moves Gemini SSE buffering onto the existing append_utf8_safe path and makes parameters flattening conditional on the schema actually expecting nested extraction. The old Skill rectification path stays intact, and new regression tests cover both the preserved parameters case and UTF-8-split JSON payloads.
    
    Constraint: Existing PR #1918 review feedback must be fixed without staging unrelated local docs and artifact files
    Rejected: Keep String::from_utf8_lossy in Gemini SSE buffering | corrupts split multibyte payloads and can drop JSON chunks
    Rejected: Always preserve the parameters wrapper | regresses the existing nested-parameters rectification path for Skill-style tools
    Confidence: high
    Scope-risk: narrow
    Reversibility: clean
    Directive: Keep Gemini SSE buffering on the UTF-8-safe accumulator path and only unwrap parameters when the target schema does not declare it as a legitimate field
    Tested: cargo fmt --manifest-path src-tauri/Cargo.toml --all; cargo test --manifest-path src-tauri/Cargo.toml preserves_utf8_boundaries_when_json_payload_spans_chunks; cargo test --manifest-path src-tauri/Cargo.toml gemini_to_anthropic_rectifies_tool_args_from_schema_hints; cargo test --manifest-path src-tauri/Cargo.toml rectifies_streamed_skill_args_from_nested_parameters; cargo test --manifest-path src-tauri/Cargo.toml gemini_to_anthropic_preserves_legitimate_parameters_arg
    Not-tested: Full src-tauri test suite; live end-to-end Gemini relay traffic against upstream services
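    
    Why lossy decoding corrupts split payloads, and what a UTF-8-safe
    accumulator does instead, can be sketched with the standard library
    alone. `append_utf8_chunk` is a hypothetical function and is not the
    project's `append_utf8_safe`.

```rust
/// Append `chunk` to `pending`, move the longest valid UTF-8 prefix into
/// `out`, and keep the remaining bytes (for example the first half of a
/// multibyte character) for the next chunk.
/// Note: bytes that are invalid rather than incomplete would stay in
/// `pending` indefinitely; a fuller version would inspect `error_len()`.
fn append_utf8_chunk(pending: &mut Vec<u8>, out: &mut String, chunk: &[u8]) {
    pending.extend_from_slice(chunk);
    let valid_len = match std::str::from_utf8(pending.as_slice()) {
        Ok(text) => text.len(),
        Err(err) => err.valid_up_to(),
    };
    let text = std::str::from_utf8(&pending[..valid_len]).expect("prefix was just validated");
    out.push_str(text);
    pending.drain(..valid_len);
}
```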
  • Merge branch 'main' into feat/gemini-proxy-integration
    # Conflicts:
    #	src-tauri/src/proxy/providers/claude.rs
    #	src/i18n/locales/en.json
    #	src/i18n/locales/ja.json
    #	src/i18n/locales/zh.json
  • feat(stream-check): refresh default models and detect model-not-found errors (#2099)
    * chore(stream-check): update default health check models to latest
    
    Replaces deprecated gpt-5.1-codex@low with gpt-5.4@low and switches
    the Gemini default from gemini-3-pro-preview to gemini-3-flash-preview
    to pick the lightest variant of the latest series for fast, low-cost
    health checks.
    
    https://claude.ai/code/session_01NGWLchcTP76rJHjiP5Ehte
    
    * feat(stream-check): detect model-not-found errors with dedicated toast
    
    Health check previously classified failures purely by HTTP status code,
    which meant deprecated/invalid models showed up as a generic "Not found
    (404)" error pointing users to check the Base URL — misleading when the
    URL is fine and only the test model is wrong (e.g. gpt-5.1-codex after
    it was retired).
    
    Backend: add detect_error_category() that inspects 4xx response bodies
    for model-not-found indicators (model_not_found, does not exist,
    invalid model, not_found_error, etc.) and returns a "modelNotFound"
    category. Thread the resolved test model through build_stream_check_result
    so the failed result carries it in model_used. Add StreamCheckResult
    .error_category field (serde-skipped when None).
    
    Frontend: useStreamCheck branches on errorCategory === "modelNotFound"
    before the HTTP-status fallback and renders a toast.error with the model
    name and a description pointing to Model Test Config. Add i18n keys
    (modelNotFound / modelNotFoundHint) for zh/en/ja.
    
    Tests: unit-test detect_error_category against real OpenAI/Anthropic
    error shapes, 5xx false-positive avoidance, and plain 401 auth errors.
    
    https://claude.ai/code/session_01NGWLchcTP76rJHjiP5Ehte
    
    * fix(stream-check): add missing error_category field in fallback
    
    The error_category field was added to StreamCheckResult in this branch
    but the fallback constructor in stream_check_all_providers was not
    updated, which broke cargo build.
    
    ---------
    
    Co-authored-by: Claude <noreply@anthropic.com>
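
The body-based classification described for `detect_error_category()` in #2099 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the signature, the `Option<&'static str>` return type, and the case-insensitive substring matching are assumptions, and the indicator list holds only the four strings named in the commit message.

```rust
/// Hypothetical sketch: classify a failed health-check response by its body.
/// Returns Some("modelNotFound") when a 4xx body contains a known indicator.
pub fn detect_error_category(status: u16, body: &str) -> Option<&'static str> {
    // Only 4xx responses are inspected, so a 5xx body that happens to
    // mention a model is not misreported as a model problem.
    if !(400..500).contains(&status) {
        return None;
    }

    // Indicators named in the commit message; the real list may be longer.
    const INDICATORS: [&str; 4] = [
        "model_not_found",
        "does not exist",
        "invalid model",
        "not_found_error",
    ];

    // Case-insensitive substring match (an assumption of this sketch).
    let lowered = body.to_lowercase();
    if INDICATORS.iter().any(|needle| lowered.contains(needle)) {
        Some("modelNotFound")
    } else {
        None
    }
}
```

With this shape, a plain 401 body such as `{"error":{"message":"Invalid API key"}}` yields `None` and falls through to the HTTP-status classification.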
  • fix(opencode): use json5 parser for trailing comma tolerance (#2023)
    * fix(opencode): use json5 parser for trailing comma tolerance
    
    OpenCode CLI writes opencode.json with trailing commas (valid JSONC),
    but CC Switch parsed it with serde_json (strict JSON), causing errors
    like 'trailing comma at line 35 column 3'.
    
    Switch to json5::from_str which accepts both JSON and JSONC. The json5
    crate is already a project dependency. Change error type from
    AppError::json() to AppError::Config() since json5::Error differs from
    serde_json::Error.
    
    * style(opencode): apply rustfmt to satisfy cargo fmt --check
    
    The previous commit's .map_err(...) chain exceeded rustfmt's default
    100-char max_width, breaking CI's `cargo fmt --check`. Let rustfmt
    wrap the closure body as a multi-line block. No behavior change.
    
    ---------
    
    Co-authored-by: 18067889926 <ming.flute@outlook.com>
    Co-authored-by: Jason <farion1231@gmail.com>
  • fix: preserve env vars when saving Google Official Gemini provider (#2087)
    write_gemini_live() unconditionally cleared env_map for GoogleOfficial
    auth type, discarding user-configured env vars (e.g. GEMINI_MODEL).
    Remove the env_map.clear() call so the user's settings_config.env is
    written as-is, and merge identical Packycode/Generic match arms.
  • feat: classify stream check errors with color-coded toasts
    Distinguish between "provider rejects probe" (yellow warning) and
    "genuinely broken" (red error) in health check results.
    
    Backend: add AppError::HttpStatus variant to carry structured HTTP
    status codes, populate http_status on error results, classify codes
    into short labels (e.g. "Auth rejected (401)"), and truncate overly
    long response bodies.
    
    Frontend: route 401/403/400/429/5xx to toast.warning with localized
    hints explaining the error may not indicate actual unusability; route
    404/402/connection errors to toast.error. Add i18n keys for all three
    locales (zh/en/ja).
    
    Also deduplicate check_once by reusing build_stream_check_result.
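
The warning/error split above could look roughly like this. The commit does the routing in the frontend; this sketch expresses the same table in Rust for illustration. `ToastLevel` and `toast_level` are hypothetical names, `None` stands in for a connection error with no HTTP status, and sending unlisted status codes to `Error` is an assumption.

```rust
/// Hypothetical severity levels mirroring toast.warning / toast.error.
#[derive(Debug, PartialEq, Eq)]
pub enum ToastLevel {
    /// The provider rejected the probe but may still be usable.
    Warning,
    /// The provider looks genuinely broken.
    Error,
}

/// Map an optional HTTP status to a toast level.
/// `None` models a connection error where no status was received.
pub fn toast_level(http_status: Option<u16>) -> ToastLevel {
    match http_status {
        // 400/401/403/429 and 5xx: the probe was rejected.
        Some(400 | 401 | 403 | 429) => ToastLevel::Warning,
        Some(500..=599) => ToastLevel::Warning,
        // 404 and 402 are named in the commit as hard errors; treating every
        // other status the same way is an assumption of this sketch.
        Some(_) => ToastLevel::Error,
        // Connection errors carry no status and are hard errors.
        None => ToastLevel::Error,
    }
}
```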
  • fix: auto-expand collapsed messages when search matches hidden content
    When a search query matches text beyond the collapse point, the message
    automatically expands to show the highlighted match. Also adds
    aria-expanded for accessibility.
  • perf: collapse long session messages to reduce text layout cost
    Messages over 3000 characters are now truncated to 1500 characters by
    default, with an expand/collapse toggle. This avoids expensive browser
    text layout for large AI responses containing code or tool output.
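
A rough sketch of the collapse rule, written in Rust for illustration even though the change is in the frontend. The function and constant names are hypothetical, and counting Unicode scalar values is an assumption (the UI may count UTF-16 code units instead).

```rust
/// Messages longer than this many characters are collapsed.
const COLLAPSE_THRESHOLD: usize = 3000;
/// Number of characters kept while a message is collapsed.
const COLLAPSED_LENGTH: usize = 1500;

/// Returns the collapsed preview for an over-long message,
/// or None when the message is short enough to show in full.
pub fn collapsed_preview(text: &str) -> Option<&str> {
    // A character at index 3000 exists only if the text has more than
    // 3000 characters; nth() stops early, so short texts are cheap.
    text.char_indices().nth(COLLAPSE_THRESHOLD)?;

    // Cut on a character boundary so multi-byte text is never split.
    let cut = text
        .char_indices()
        .nth(COLLAPSED_LENGTH)
        .map(|(byte_index, _)| byte_index)
        .unwrap_or(text.len());
    Some(&text[..cut])
}
```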
  • perf: virtualize session message list for long conversations
    Replace full DOM rendering with @tanstack/react-virtual to only render
    visible messages (~25 DOM nodes instead of N). Wrap SessionMessageItem
    in React.memo to prevent unnecessary re-renders on state changes.
  • fix: handle root-level skill repos during installation
    When a repo itself is a single skill (SKILL.md at repo root), the
    discovery phase sets directory to the repo name, but after ZIP
    extraction (which strips the root folder), no matching subdirectory
    exists. Add a fallback to check if SKILL.md exists directly in the
    extracted temp directory before reporting SKILL_DIR_NOT_FOUND.
    
    Fixes installation of repos like zlbigger/Google-SEOs.skill.
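
The fallback described above might be sketched like this. `resolve_skill_dir` and its string error are hypothetical stand-ins for the project's own function and error type; only the `SKILL.md` file name and the `SKILL_DIR_NOT_FOUND` code come from the commit message.

```rust
use std::path::{Path, PathBuf};

/// Locate the skill directory inside an extracted ZIP.
/// `directory` is the name chosen during discovery (for a root-level
/// skill this is the repo name, which no longer exists after extraction).
pub fn resolve_skill_dir(extracted_root: &Path, directory: &str) -> Result<PathBuf, String> {
    // Normal case: the skill lives in a subdirectory of the archive.
    let candidate = extracted_root.join(directory);
    if candidate.is_dir() {
        return Ok(candidate);
    }

    // Fallback: the repo itself is the skill, so SKILL.md sits directly
    // in the extracted temp directory.
    if extracted_root.join("SKILL.md").is_file() {
        return Ok(extracted_root.to_path_buf());
    }

    Err("SKILL_DIR_NOT_FOUND".to_string())
}
```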
  • fix: remove duplicate usage summary from app filter bar
    The request count and total cost were displayed both in the app filter
    bar and in the UsageSummaryCards below it. Remove the redundant inline
    summary and its unused imports.
  • refactor: remove per-provider proxy config feature
    Replace all remaining "代理服务/Proxy Service/プロキシサービス" references
    in the local routing feature context with "路由服务/Routing Service/
    ルーティングサービス". This covers service settings, status messages,
    tooltips, field descriptions, and tab labels.
    
    Global Proxy, HTTP proxy hints, and AI Agent references are unchanged.
  • docs: rename takeover docs to routing across all languages
    Rename 4.2-takeover.md to 4.2-routing.md in zh/en/ja user manuals,
    replacing all "接管/takeover" terminology with "路由/routing" to match
    the rebranded feature name. Update README index links accordingly.
  • rename: rebrand "Local Proxy Takeover" to "Local Routing" in all locales
    Replace all user-facing references to "本地代理接管/Local Proxy/Takeover"
    with "本地路由/Local Routing/ローカルルーティング" across zh/en/ja to
    eliminate naming confusion with the separate "Global Proxy" feature.
    
    Only i18n string values are changed; keys, code identifiers, and
    database schema remain untouched.
  • refactor: remove per-provider proxy config feature
    The per-provider proxy configuration (meta.proxyConfig) is removed
    because its scope is too narrow and is already covered by global proxy
    settings and proxy takeover mode. Users can achieve the same result via
    the global proxy panel.
    
    Changes:
    - Remove ProviderProxyConfig type (frontend TS + backend Rust)
    - Remove ProviderAdvancedConfig proxy UI block, keep testConfig/pricingConfig
    - Simplify http_client: delete build_proxy_url_from_config,
      build_client_for_provider, get_for_provider
    - Simplify forwarder/stream_check/model_fetch to use global client
    - Remove i18n keys (en/zh/ja)
    - Fix pre-existing test bug in transform.rs (extra None arg)
  • Add LemonData sponsor and update partner logo formats
    Add new sponsor LemonData to partner section. Update Crazyrouter logo
    from JPG to PNG format.
  • Update partner logos and sync across languages
    Synchronize Crazyrouter logo format and partner section updates across
    EN/JA/ZH README files.
  • fix(clippy): remove redundant closure in session ID parsing
    Replace `|uid| parse_session_from_user_id(uid)` with direct
    function reference to satisfy clippy::redundant_closure.
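
For illustration, the before/after shape of this lint fix looks like the following. The body of `parse_session_from_user_id` and the `user_..._session_...` format are invented for the sketch; only the function name and the closure being replaced come from the commit message.

```rust
/// Hypothetical parser: extracts the part after "_session_".
fn parse_session_from_user_id(uid: &str) -> Option<String> {
    uid.split_once("_session_")
        .map(|(_, session)| session.to_string())
}

/// Before: the closure only forwards its argument, which triggers
/// clippy::redundant_closure (silenced here so both forms can coexist).
#[allow(clippy::redundant_closure)]
pub fn session_before(user_id: Option<&str>) -> Option<String> {
    user_id.and_then(|uid| parse_session_from_user_id(uid))
}

/// After: pass the function itself; behavior is unchanged.
pub fn session_after(user_id: Option<&str>) -> Option<String> {
    user_id.and_then(parse_session_from_user_id)
}
```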
  • fix(proxy): reduce unnecessary Copilot premium interaction consumption
    - Fix request classification: treat messages containing tool_result as
      agent continuation instead of user-initiated, preventing false premium
      charges on every tool call
    - Add subagent detection via __SUBAGENT_MARKER__ and metadata._agent_
      fallback, setting x-interaction-type=conversation-subagent
    - Add deterministic x-interaction-id derived from session ID to group
      requests into a single billing interaction
    - Add orphan tool_result sanitization to prevent upstream API errors
      that could cause retries and duplicate billing
    - Reorder pipeline: classify (on original body) → sanitize → merge →
      warmup, ensuring classification sees raw tool_result semantics
    - Enable warmup downgrade by default with gpt-5-mini model
    - Enhance session ID extraction priority chain for Copilot cache keys
    - Detect infinite whitespace bug in streaming tool call arguments
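
The first bullet's classification rule can be sketched as below. The types are simplified stand-ins for the real request body, and inspecting only the last message is an assumption of this sketch; the commit states only that messages containing `tool_result` count as agent continuation.

```rust
/// Who initiated the request, for billing classification.
#[derive(Debug, PartialEq, Eq)]
pub enum Initiator {
    User,
    Agent,
}

/// Simplified content block (hypothetical, not the real wire format).
pub enum Block {
    Text(String),
    ToolResult { tool_use_id: String },
}

/// Simplified chat message.
pub struct Message {
    pub role: &'static str,
    pub content: Vec<Block>,
}

/// Classify on the original body, before sanitize/merge/warmup run,
/// so the raw tool_result semantics are still visible.
pub fn classify(messages: &[Message]) -> Initiator {
    match messages.last() {
        // A user message that carries a tool_result is the agent loop
        // continuing, not a new user-initiated turn.
        Some(last) if last.role == "user" => {
            let has_tool_result = last
                .content
                .iter()
                .any(|block| matches!(block, Block::ToolResult { .. }));
            if has_tool_result {
                Initiator::Agent
            } else {
                Initiator::User
            }
        }
        // Anything else (assistant tail, empty history) is treated as agent.
        _ => Initiator::Agent,
    }
}
```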
  • fix(usage): remove unnecessary private IP restrictions from usage script
    SSRF protection (private IP blocking, suspicious hostname detection) was
    originally added for web-server threat models but is unnecessary for a
    local desktop app where the user already has full network access. This
    removal unblocks legitimate use cases like enterprise intranet APIs,
    Docker container addresses, and self-hosted services.
    
    Retained: HTTPS enforcement and same-origin checks which still provide
    meaningful security (protecting API keys in transit and preventing
    scripts from leaking keys to unrelated domains).
  • fix(usage): sync request log time range with dashboard 1d/7d/30d selector
    The RequestLogTable had a hardcoded 24-hour rolling window, ignoring the
    dashboard's time range selector. Now it accepts a timeRange prop and
    dynamically adjusts the query window, so users can view logs beyond just
    the last day.
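
As a rough illustration of deriving the query window from the selector value, the sketch below uses hypothetical function names, Unix timestamps in seconds, and a rolling window ending at `now`. Only the three selector values come from the commit message.

```rust
const SECONDS_PER_DAY: u64 = 86_400;

/// Length of the rolling window for a selector value, if recognized.
pub fn window_seconds(time_range: &str) -> Option<u64> {
    match time_range {
        "1d" => Some(SECONDS_PER_DAY),
        "7d" => Some(7 * SECONDS_PER_DAY),
        "30d" => Some(30 * SECONDS_PER_DAY),
        _ => None,
    }
}

/// (start, end) timestamps for the request log query.
pub fn query_window(now: u64, time_range: &str) -> Option<(u64, u64)> {
    let span = window_seconds(time_range)?;
    // saturating_sub keeps the start at 0 instead of underflowing.
    Some((now.saturating_sub(span), now))
}
```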
  • feat(usage): improve pagination with first/last 3 pages and page jump input
    Show first 3 and last 3 page buttons instead of just first/last, with
    Set-based deduplication for clean edge merging. Add a page number input
    field with Go button for direct page navigation.
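
One way to get the "clean edge merging" is to collect candidate page numbers into an ordered set, so overlapping ranges collapse on their own. The sketch below is in Rust rather than the frontend's language, `visible_pages` is a hypothetical name, and including the current page with its immediate neighbours is an assumption not stated in the commit.

```rust
use std::collections::BTreeSet;

/// Page numbers (1-based) to render as buttons, sorted and de-duplicated.
pub fn visible_pages(current: u32, total: u32) -> Vec<u32> {
    let mut pages = BTreeSet::new();

    // First three pages.
    for page in 1..=total.min(3) {
        pages.insert(page);
    }
    // Last three pages; overlaps with the first three vanish in the set.
    for page in total.saturating_sub(2).max(1)..=total {
        pages.insert(page);
    }
    // Current page and its neighbours (assumption of this sketch).
    for page in current.saturating_sub(1).max(1)..=current.saturating_add(1).min(total) {
        pages.insert(page);
    }

    pages.into_iter().collect()
}
```

A renderer can then insert an ellipsis wherever two consecutive entries differ by more than one.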
  • fix(sessions): strip OpenClaw message_id suffix and allow 2-line titles
    OpenClaw gateway injects `[message_id: UUID]` metadata at the end of
    every message, wasting display space. Strip this suffix from both title
    and summary fields.
    
    Also change session title display from single-line truncate to
    line-clamp-2, so longer titles (e.g. OpenClaw's timestamp-prefixed
    messages) can show more meaningful content across two lines.
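
A minimal sketch of the suffix stripping, assuming the tag always appears at the very end of the text. `strip_message_id_suffix` is a hypothetical name, and the sketch does not validate that the bracket content is a well-formed UUID.

```rust
/// Remove a trailing `[message_id: ...]` tag, if present.
pub fn strip_message_id_suffix(text: &str) -> &str {
    let trimmed = text.trim_end();
    if let Some(start) = trimmed.rfind("[message_id:") {
        let tail = &trimmed[start..];
        // Strip only when the tag runs to the end of the text and
        // closes exactly once, so mid-sentence mentions are left alone.
        if tail.ends_with(']') && tail.matches(']').count() == 1 {
            return trimmed[..start].trim_end();
        }
    }
    text
}
```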
  • feat(sessions): extract meaningful titles for Codex and OpenClaw sessions
    Previously Codex and OpenClaw sessions only showed the working directory
    basename as the title, making it hard to distinguish sessions in the same
    project. Now both providers extract the first real user message as the
    session title, matching the existing Claude Code behavior.
    
    - Codex: first user message → dir basename (skips AGENTS.md injection)
    - OpenClaw: displayName (sessions.json) → first user message → dir basename
    - Move TITLE_MAX_CHARS constant to shared utils.rs
    - Use Option<&HashMap> for OpenClaw parse_session to avoid leaky abstraction
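
The fallback chains in the list above map naturally onto `Option` combinators. In this sketch the function name, the value of `TITLE_MAX_CHARS`, and the rule that blank strings count as missing are all assumptions; only the priority order comes from the commit message.

```rust
/// Hypothetical value; the commit names the constant but not its value.
const TITLE_MAX_CHARS: usize = 80;

/// OpenClaw order: displayName, then first user message, then dir basename.
/// The Codex order is the same chain with `display_name` set to None.
pub fn session_title(
    display_name: Option<&str>,
    first_user_message: Option<&str>,
    dir_basename: &str,
) -> String {
    // Treat blank strings as missing so the chain keeps falling through.
    let non_blank = |s: &&str| !s.trim().is_empty();

    let chosen = display_name
        .filter(non_blank)
        .or(first_user_message.filter(non_blank))
        .unwrap_or(dir_basename);

    // Truncate by characters, never in the middle of a code point.
    chosen.trim().chars().take(TITLE_MAX_CHARS).collect()
}
```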
  • fix(usage): deduplicate proxy and session log usage records
    Extract message_id from Claude API responses (msg_xxx) and use it to
    generate a shared request_id format (session:{msg_xxx}) between the
    proxy logger and session log sync. When session sync encounters the
    same request_id via INSERT OR IGNORE, it skips the duplicate.
    
    - Add message_id field to TokenUsage, extracted from Claude responses
    - Add TokenUsage::dedup_request_id() to generate shared request IDs
    - Define SESSION_REQUEST_ID_PREFIX constant to eliminate magic strings
    - Change proxy logger to INSERT OR REPLACE for richer-data-wins semantics
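
The shared-ID scheme can be sketched with an in-memory map standing in for the SQLite table. The prefix value follows the `session:{msg_xxx}` format given above; the `Option<String>` return type, the `UsageTable` alias, and the two helper functions are hypothetical and only mimic `INSERT OR REPLACE` and `INSERT OR IGNORE`.

```rust
use std::collections::HashMap;

pub const SESSION_REQUEST_ID_PREFIX: &str = "session:";

/// Reduced to the one field this sketch needs.
pub struct TokenUsage {
    pub message_id: Option<String>,
}

impl TokenUsage {
    /// Shared request ID, available only when the response carried a message ID.
    pub fn dedup_request_id(&self) -> Option<String> {
        self.message_id
            .as_deref()
            .map(|id| format!("{}{}", SESSION_REQUEST_ID_PREFIX, id))
    }
}

/// request_id -> name of the writer that owns the row (stand-in for a table).
pub type UsageTable = HashMap<String, &'static str>;

/// Proxy logger path: richer data wins, so it overwrites any existing row.
pub fn insert_or_replace(table: &mut UsageTable, request_id: String, source: &'static str) {
    table.insert(request_id, source);
}

/// Session sync path: skip the row if the ID is already present.
/// Returns true when a new row was written.
pub fn insert_or_ignore(table: &mut UsageTable, request_id: String, source: &'static str) -> bool {
    if table.contains_key(&request_id) {
        return false;
    }
    table.insert(request_id, source);
    true
}
```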
  • feat(pricing): add ~50 new model pricing entries and fix outdated prices
    Add pricing data for 4 new providers (Qwen, xAI Grok, Mistral, Cohere)
    and supplement existing providers (MiniMax M2.5/M2.7, GLM-5/5.1,
    Doubao Seed 2.0, MiMo V2 Pro, OpenAI o1/o3/codex-mini/gpt-5-mini/nano).
    
    Fix outdated prices for deepseek-chat, deepseek-reasoner, and kimi-k2.5.
    Fix display_name casing "Mimo" → "MiMo" for consistency.
    
    Use prepared statement in seed_model_pricing() to avoid recompiling SQL
    on each of ~130 INSERT iterations.
    
    Schema migration v8→v9: DELETE + re-seed model_pricing for existing users.
  • feat: block official provider switching during proxy takeover
    Prevent users from switching to official providers (Anthropic/OpenAI/Google)
    when proxy takeover is active, as using a proxy with official APIs may cause
    account bans.
    
    Defense-in-depth across 4 layers:
    - Backend: ProviderService::switch(), hot_switch_provider(), switch_proxy_provider command
    - Frontend: useProviderActions soft guard with error toast
    - UI: ProviderActions button disabled with ShieldAlert icon
    - Tray menu: official provider items disabled with an indicator
    
    Also warns when enabling proxy takeover while current provider is official.
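
The backend guard shared by the switch paths might reduce to a check like the one below. `ProviderKind`, `check_switch`, and the error text are hypothetical; the real code presumably reads the takeover state and the provider category from application state.

```rust
/// Hypothetical provider classification.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProviderKind {
    /// Anthropic / OpenAI / Google official endpoints.
    Official,
    ThirdParty,
}

/// Reject switching to an official provider while proxy takeover is active.
pub fn check_switch(takeover_active: bool, target: ProviderKind) -> Result<(), String> {
    if takeover_active && target == ProviderKind::Official {
        return Err("cannot switch to an official provider while proxy takeover is active".to_string());
    }
    Ok(())
}
```

Running the same check in every backend entry point keeps the frontend, button, and tray-menu guards purely advisory: a call that bypasses the UI is still refused.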