cc-switch

mirror of https://github.com/farion1231/cc-switch.git synced 2026-05-23 22:21:00 +08:00
Files
T
Dex Miller 03a0f9661b feat(proxy): Gemini Native API proxy integration (#1918 )
* refactor(proxy): extract take_sse_block helper with CRLF delimiter support

Replace inline `buffer.find("\n\n")` SSE splitting logic across streaming,
streaming_responses, response_handler, and response_processor with a shared
`take_sse_block` function that handles both `\n\n` and `\r\n\r\n` delimiters.

* feat(proxy): add Gemini Native URL builder and full-URL resolver

Introduce gemini_url module that normalizes legacy Gemini/OpenAI-compatible
base URLs into canonical models/*:generateContent endpoints. Supports both
structured Gemini URLs (auto-normalized) and opaque relay URLs (pass-through
with query params only).

* feat(proxy): add Gemini Native schema, shadow store, transform, and streaming

- gemini_schema: Gemini generateContent request/response type definitions
- gemini_shadow: session-scoped shadow store for thinking signature and
  tool-call state replay across streaming chunks
- transform_gemini: bidirectional Anthropic Messages ↔ Gemini Native
  request/response conversion with thinking block and tool-use support
- streaming_gemini: Gemini SSE → Anthropic SSE streaming adapter with
  incremental thinking/text/tool_use delta emission

* feat(proxy): wire Gemini Native format into proxy core and Claude adapter

Integrate gemini_native api_format throughout the proxy pipeline:
- ClaudeAdapter: detect Gemini provider type, Google/GoogleOAuth auth
  strategies, and suppress Anthropic-specific headers for Gemini targets
- Forwarder: Gemini URL resolution, shadow store threading, endpoint
  rewriting to models/*:generateContent with stream/non-stream variants
- Handlers: route Gemini streaming through streaming_gemini adapter and
  non-streaming through transform_gemini converter
- Server/State: add GeminiShadowStore to shared ProxyState
- StreamCheck: support gemini_native health check with proper auth headers

* feat(ui): add Gemini Native provider preset and api format option

- Add gemini_native to ClaudeApiFormat type and ProviderMeta.apiFormat
- Add "Gemini Native" provider preset with default Google AI endpoints
- Show Gemini-specific endpoint hints and full-URL mode guidance
- Add gemini_native option to API format selector in ClaudeFormFields
- Add i18n strings for zh/en/ja

* feat(proxy): add Gemini Native tool argument rectification

* feat(proxy): update Gemini streaming and transformation logic

* fix(proxy): align shadow turns to tail on client history truncation

* fix: revert unrelated cache_key change in claude proxy transform

Restore .unwrap_or(&provider.id) fallback for cache_key to match main
branch behavior. Only gemini_native related changes should be in this branch.

* Prevent Gemini review regressions in streaming and tool rectification

PR #1918 review feedback exposed two correctness issues in the Gemini Native adapter path. Gemini SSE buffering was still using lossy UTF-8 decoding, which could corrupt split multibyte payloads and drop streamed output. Tool arg rectification also removed top-level parameters eagerly, which broke tools that legitimately define a parameters field.

This change moves Gemini SSE buffering onto the existing append_utf8_safe path and makes parameters flattening conditional on the schema actually expecting nested extraction. The old Skill rectification path stays intact, and new regression tests cover both the preserved parameters case and UTF-8-split JSON payloads.

Constraint: Existing PR #1918 review feedback must be fixed without staging unrelated local docs and artifact files
Rejected: Keep String::from_utf8_lossy in Gemini SSE buffering | corrupts split multibyte payloads and can drop JSON chunks
Rejected: Always preserve the parameters wrapper | regresses the existing nested-parameters rectification path for Skill-style tools
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep Gemini SSE buffering on the UTF-8-safe accumulator path and only unwrap parameters when the target schema does not declare it as a legitimate field
Tested: cargo fmt --manifest-path src-tauri/Cargo.toml --all; cargo test --manifest-path src-tauri/Cargo.toml preserves_utf8_boundaries_when_json_payload_spans_chunks; cargo test --manifest-path src-tauri/Cargo.toml gemini_to_anthropic_rectifies_tool_args_from_schema_hints; cargo test --manifest-path src-tauri/Cargo.toml rectifies_streamed_skill_args_from_nested_parameters; cargo test --manifest-path src-tauri/Cargo.toml gemini_to_anthropic_preserves_legitimate_parameters_arg
Not-tested: Full src-tauri test suite; live end-to-end Gemini relay traffic against upstream services

* Keep Gemini tool replay stable across Claude request boundaries

Claude Code follow-up requests were still falling back to locally reconstructed functionCall parts, which dropped Gemini thought signatures and triggered INVALID_ARGUMENT errors from the official Gemini API. The replay path needed to survive real Claude request boundaries, not just idealized in-process test flows.

This change makes Claude requests reuse X-Claude-Code-Session-Id as the shadow session key, records streamed Gemini tool turns before tool_use events are fully drained, and matches assistant tool_use turns to shadow state by tool_use id and normalized tool name before positional fallback. Together these fixes keep thoughtSignature-bearing Gemini tool calls available for the next request in the loop.

Constraint: Claude Code sends a stable X-Claude-Code-Session-Id header while metadata.session_id may be absent on follow-up requests
Rejected: Rely on metadata-only Claude session extraction | generated fresh session ids and broke cross-request shadow replay
Rejected: Record Gemini shadow only after streaming completes | loses the race when the client sends the next request immediately after tool_use
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Preserve Gemini shadow continuity across requests by keying Claude sessions from the header first and persisting tool-call shadow before yielding tool_use events downstream
Tested: cargo fmt --manifest-path src-tauri/Cargo.toml --all; cargo test --manifest-path src-tauri/Cargo.toml test_extract_session_from_claude_header; cargo test --manifest-path src-tauri/Cargo.toml test_extract_session_from_claude_header_precedes_metadata; cargo test --manifest-path src-tauri/Cargo.toml stores_tool_shadow_before_tool_use_events_are_fully_drained; cargo test --manifest-path src-tauri/Cargo.toml shadow_replay_matches_tool_use_turn_by_id_when_position_drifts; cargo test --manifest-path src-tauri/Cargo.toml shadow_replay_aligns_to_latest_turns_after_client_truncation
Not-tested: Full src-tauri test suite without test filters; live end-to-end Gemini relay after this exact commit hash

* style: apply cargo fmt to pass Backend Checks CI

Wrap prompt_cache_key chained call across lines per rustfmt default
formatting. Pure formatting change, no behavior difference.

* fix(proxy/gemini): synthesize unique ids for no-id tool calls + enforce object params schema

P1 — Parallel tool calls without Gemini-assigned ids no longer collapse.
Gemini 2.x native parallel `functionCall` entries may omit the `id` field.
The previous `merge_tool_call_snapshots` fell back to matching by `name`,
which silently merged two parallel calls to the same function into one
entry — dropping the first call's args. The non-streaming path and shadow
store further bottlenecked on empty-string ids: multiple `tool_use` blocks
shared the same id, and `tool_name_by_id.get("")` could only return one
mapping, causing later `tool_result` round-trips to fail with
`Unable to resolve Gemini functionResponse.name` or bind to the wrong tool.

Fix: introduce `synthesize_tool_call_id()` producing `gemini_synth_<uuid>`.
Both streaming and non-streaming response paths now guarantee every
Anthropic-visible tool_use carries a unique id. `merge_tool_call_snapshots`
matches by id first, falling back to the `parts` array position (for the
cumulative-streaming case) while preserving the synthesized id across
chunks. `convert_message_content_to_parts` detects the synthetic prefix
and strips the id from outbound `functionCall`/`functionResponse` so the
internal identifier never leaks upstream. `shadow_parts` performs the
same strip when replaying a recorded assistant turn.

P2 — Vertex AI rejects empty `parameters` schemas. When an Anthropic tool
arrives with missing or empty `input_schema`, the proxy used to emit
`"parameters": {}` (no `type`), which fails Vertex AI validation with
`functionDeclaration parameters schema should be of type OBJECT`.
Contrary to the automated-review suggestion, the fix is not to omit
`parameters` (that too is rejected) but to normalize to the canonical
empty-object form `{type: "object", properties: {}}`.
Refs: google-gemini/generative-ai-python#423, BerriAI/litellm#5055.

Fix: new `ensure_object_schema` helper in `gemini_schema` promotes
missing `type` to `"object"` and adds empty `properties` when absent,
while leaving atomic (non-object) schemas untouched.

Tests: seven new regressions covering parallel no-id calls, cumulative
chunk id reuse, synthetic-id round-trip both directions, shadow replay
id stripping, and the three Vertex-AI schema shapes.

The two existing wrapper functions (`gemini_to_anthropic` and
`gemini_to_anthropic_with_shadow`) gain `#[allow(dead_code)]` to clear
a pre-existing clippy -D warnings failure — they are part of the public
transform API surface and intentionally kept for future callers.

Addresses Codex review P1/P2 on #1918.

* fix(proxy/gemini): narrow URL normalization + guard empty OAuth access_token

P2a — Preserve opaque relay URLs that contain `/v1/models/` prefixes.

`should_normalize_gemini_full_url` previously flagged any full URL whose
path merely contained `/v1beta/models/` or `/v1/models/` as a structured
Gemini endpoint, forcing rewrite to `.../v1beta/models/{model}:method`.
This silently dropped legitimate relay route segments (e.g.
`https://relay.example/v1/models/invoke` → `.../v1beta/models/...:generateContent`,
losing `/invoke`) and sent traffic to the wrong upstream path.

Replace the bare `contains(...)` checks with
`matches_structured_gemini_models_path`, which requires the
`/models/` segment to be followed by a canonical Gemini method call
(`*:generateContent` or `*:streamGenerateContent`). The
`matches_bare_gemini_models_path` helper is generalized (and renamed) to
handle both `/v1beta/models/` and `/v1/models/` alongside the original
bare `/models/` shape.

P2b — Reject empty Gemini OAuth access_tokens before they reach the
bearer header.

`GeminiAdapter::parse_oauth_credentials` accepts refresh-token-only JSON
(and surfaces `{"access_token": "", ...}` for expired credentials) with
`access_token` defaulting to `""`. The Claude adapter's GeminiCli branch
then called `AuthInfo::with_access_token(key, creds.access_token)`
unconditionally, so the bearer-header builder at
`AuthStrategy::GoogleOAuth` resolved to `Authorization: Bearer ` — a
deterministic 401 from upstream.

CC Switch does not currently exchange the refresh_token for a fresh
access_token (`OAuthCredentials::needs_refresh` / `can_refresh` are
annotated `#[allow(dead_code)]`). Until that exists, only attach
`access_token` when it is non-empty; fall back to plain GoogleOAuth
strategy with the raw key and log a warn pointing users at
`~/.gemini/oauth_creds.json` so the failure mode is observable.

Tests:
- gemini_url.rs: three new regressions — opaque `/v1/models/invoke`,
  opaque `/v1beta/models/route`, and the positive counter-case where a
  structured `/v1/models/...:generateContent` path still normalizes.
- claude.rs: three new `test_extract_auth_gemini_cli_*` tests covering
  refresh-only JSON, empty-string access_token JSON, and the valid-JSON
  pass-through.

All 839 lib tests pass; cargo fmt + clippy -D warnings clean.

Addresses Codex review P2 findings on #1918.

* fix(proxy/gemini): treat empty-string functionCall id as missing in streaming path

Follow-up to the earlier P1 fix: some Gemini relays serialize an absent
functionCall id as `"id": ""` instead of omitting the field. The
non-streaming `extract_tool_call_meta` already filters these via
`.filter(|s| !s.is_empty())`, but the streaming counterpart
`extract_tool_calls` passed the empty string straight through
`function_call.get("id").and_then(|v| v.as_str())` into
`GeminiToolCallMeta::new`, producing a `Some("")` id.

Downstream, `merge_tool_call_snapshots` would then match two parallel
no-id calls against each other on their shared empty-string id,
collapsing them into a single snapshot (silent data loss for the first
call) and emitting an Anthropic `tool_use.id: ""` that breaks tool_result
correlation on the Claude Code client.

Fix:
- `extract_tool_calls`: apply the same `filter(|s| !s.is_empty())` guard
  used in the non-streaming path so empty strings become `None` before
  reaching the shadow meta.
- `merge_tool_call_snapshots`: defensively collapse any incoming
  `Some("")` to `None` up front — keeps the "missing vs present" invariant
  local to the merge step for future callers that might build
  `GeminiToolCallMeta` by hand.

Tests (2 new, both in streaming_gemini):
- `parallel_empty_string_id_calls_are_treated_as_missing_and_preserved`
  covers two parallel calls with explicit `"id": ""` — asserts both
  surface, no empty tool_use id leaks, and each gets a unique
  `gemini_synth_` id.
- `single_empty_string_id_tool_call_gets_synthesized_id` covers the
  non-parallel degraded-relay case.

All 841 lib tests pass; cargo fmt + clippy -D warnings clean.

Addresses Codex follow-up P1 on #1918.

* fix(proxy/gemini): gate generic REST path suffixes behind Google host whitelist

`should_normalize_gemini_full_url` previously treated any full URL whose
path ends with `/v1`, `/v1/models`, `/models`, `/v1/openai`, or `/openai`
as a structured Gemini endpoint and rewrote it to
`/v1beta/models/{model}:generateContent`. These are ubiquitous REST
conventions — opaque relays such as `https://relay.example/custom/v1`
legitimately use them for fixed endpoints — so the rewrite silently
routed traffic to the wrong upstream path.

Split the predicate into two layers:

- **Unconditional**: `matches_structured_gemini_models_path` (i.e. a
  `/models/...:generateContent` method call anywhere in the path), the
  Google-specific `/v1beta*` family, and the deep OpenAI-compat paths
  (`/v1beta/openai/chat/completions`, `/openai/chat/completions`, and
  their `responses` siblings). These remain host-agnostic because the
  path grammar itself is Gemini-specific.
- **Google-host gated**: `/v1`, `/v1/models`, `/models`, `/v1/openai`,
  `/openai`. Only normalized when the host is one of
  `generativelanguage.googleapis.com`, `aiplatform.googleapis.com`, or a
  real `*-aiplatform.googleapis.com` Vertex regional endpoint. The match
  is exact/suffix (not `contains`), so lookalike hosts like
  `aiplatform.example.com` are correctly treated as opaque relays.

Tests (8 new in `gemini_url::tests`):
- Four opaque-relay cases: `/custom/v1`, `/custom/models`,
  `/custom/v1/models`, `/custom/openai` — all preserved as-is.
- Three Google-host counter-cases: `/v1`, `/models`, and
  `us-central1-aiplatform.googleapis.com/v1` still normalize.
- One lookalike safety case: `aiplatform.example.com/v1` is NOT
  treated as Google.

All 849 lib tests pass; cargo fmt + clippy -D warnings clean.

Addresses Codex review P2 on #1918.

* fix(proxy/gemini): align shadow id with client-visible id in non-streaming path

When Gemini returns a `functionCall` without an id (common in 2.x
parallel calls), `gemini_to_anthropic_with_shadow_and_hints` previously
generated TWO independent synthesized UUIDs:

  1. Line 186-197 — synthesized id `A` used for the Anthropic-visible
     `content[tool_use].id` returned to the client.
  2. Line 850-881 — `extract_tool_call_meta` independently synthesized
     id `B ≠ A`, which populated `shadow_turn.tool_calls[i].id`.

`shadow_content` (line 225-228, cloned from `rectified_parts`) retained
the original missing/empty id. Result: the client sees id `A`, the
shadow store holds id `B`.

On the next turn, `convert_messages_to_contents` builds
`tool_name_by_id` from `build_tool_name_map_from_shadow_turns`, which
uses `tool_calls[i].id` — so the map contains `B → name` but not
`A → name`. When the client sends back `tool_result(tool_use_id=A)`,
resolution fails with:

  Unable to resolve Gemini functionResponse.name for tool_use_id `A`

This affects both truncated histories (client sends only the
tool_result) and full histories (shadow-replay branch at line 342-354
skips `convert_message_content_to_parts`, so the assistant tool_use
block never registers id `A` itself).

Fix: make `rectified_parts` the single source of truth. After
`rectify_tool_call_parts`, run a pre-pass that writes
`synthesize_tool_call_id()` back into any `functionCall` that lacks a
non-empty id. All three readers — the content builder (186-197), the
shadow_content clone (225-228), and `extract_tool_call_meta` — then
observe the same id. `shadow_parts()` already strips synthesized ids on
replay (line 616-628), so the internal identifier never leaks to
Gemini upstream.

This mirrors the streaming path, which already has single-source-of-
truth semantics via `tool_call_snapshots` in `streaming_gemini.rs` —
no change needed there.

Tests (5 new in `transform_gemini::tests`):
- `non_stream_shadow_id_matches_client_visible_id`: asserts
  `response.content[0].id == shadow.tool_calls[0].id ==
  shadow.assistant_content.parts[0].functionCall.id`.
- `non_stream_missing_id_scenario_a_truncated_history_resolves`: turn 2
  sends only `[tool_result(id=A)]`; resolution must succeed.
- `non_stream_missing_id_scenario_b_full_history_replay_resolves`: turn 2
  sends `[assistant(tool_use=A), tool_result(A)]`; shadow-replay branch
  strips the synth id from outgoing `functionCall` while still
  resolving the subsequent `tool_result`.
- `non_stream_preserves_original_gemini_id_when_present`: regression —
  genuine Gemini ids flow through unchanged.
- `non_stream_synthesized_id_not_leaked_to_gemini_via_shadow_replay`:
  defensive — shadow-replay path must strip synth ids from both
  `functionCall.id` and `functionResponse.id`.

All 854 lib tests pass; cargo fmt + clippy -D warnings clean.

Addresses Codex follow-up P1 on #1918.

* refactor(proxy/gemini): share build_anthropic_usage between stream and non-stream paths

`streaming_gemini::anthropic_usage_from_gemini` and
`transform_gemini::build_anthropic_usage` were byte-for-byte identical
(32 lines each) — both converting Gemini `usageMetadata` into the
Anthropic `usage` shape including `cache_read_input_tokens` mapping.

Promote the non-streaming version to `pub(crate)` and reuse it from the
streaming SSE converter. Removes ~30 lines of duplication and guarantees
the two paths cannot drift apart.

No behavioral change; all 854 lib tests pass; cargo fmt + clippy -D
warnings clean.

* fix(proxy/gemini): gate /v1beta behind Google host + normalize models/ model id prefix

Two related P2 corrections to the Gemini Native URL surface, both
folding into the existing Google-host-whitelist architecture.

## P2a — `/v1beta` suffix should not unconditionally trigger rewrite

`should_normalize_gemini_full_url` placed `/v1beta` and `/v1beta/models`
in the unconditional layer on the reasoning that `/v1beta` is
Google-specific. In practice an opaque relay fronting a non-Gemini
service at `https://relay.example/custom/v1beta` would still be
silently rewritten to `/v1beta/models/{model}:generateContent`,
breaking the deployment.

Move `/v1beta`, `/v1beta/models`, and `/v1beta/openai` into the
Google-host gated layer alongside `/v1`, `/models`, and friends. The
unconditional layer now only accepts paths whose grammar is
intrinsically Gemini — `/models/...:generateContent` method calls and
the deep OpenAI-compat endpoints like `/openai/chat/completions` and
`/openai/responses`. Pasted AI-Studio URLs such as
`https://generativelanguage.googleapis.com/v1beta` still normalize
because the host matches the whitelist.

## P2b — `model: "models/gemini-2.5-pro"` produced doubled path prefix

Gemini SDKs (and the official `list_models` response) commonly surface
model ids in resource-name form `models/gemini-2.5-pro`. Raw
interpolation into `format!("/v1beta/models/{model}:...")` produced
`/v1beta/models/models/gemini-2.5-pro:streamGenerateContent` which
upstream rejects — yielding false-negative health checks for otherwise
valid provider configs.

Introduce `normalize_gemini_model_id(&str) -> &str` in `gemini_url`
as the single source of truth: strips an optional leading `/` then an
optional `models/` prefix, leaving bare ids untouched. Apply in the
three call sites that build a Gemini method URL:
- `services/stream_check.rs::resolve_claude_stream_url` (unified path)
- `services/stream_check.rs::check_gemini_stream` (Gemini-only path)
- `proxy/forwarder.rs::rewrite_claude_transform_endpoint` (production)

Tests (9 new):
- `gemini_url`: 3 regressions for opaque vs Google-host `/v1beta*`
  handling + 5 unit tests pinning `normalize_gemini_model_id` behavior
  (strip prefix, leave bare id, preserve nested slashes past the one
  stripped prefix, tolerate leading slash, pass through empty input).
- `stream_check`: one end-to-end regression confirming
  `models/gemini-2.5-pro` collapses to the expected single-prefix URL.
- `forwarder`: one end-to-end regression on the production rewrite
  path.

All 864 lib tests pass; cargo fmt + clippy -D warnings clean.

Addresses Codex P2 feedback on #1918.

* fix(proxy/gemini): trim API key before provider-type detection and OAuth parsing

Leading whitespace on a copied oauth_creds.json (e.g. trailing newline
when the user copies the file content as-is) would slip past the
`starts_with("ya29.") || starts_with('{')` prefix check in
`ClaudeAdapter::provider_type`, causing the provider to be misclassified
as raw-API-key Gemini and fall back to `x-goog-api-key` with the raw
JSON as the key — which upstream rejects with 401.

The frontend's `handleApiKeyChange` already trims on keystrokes but
deep-link imports, the JSON editor, and live-config backfill all bypass
that path. Trim at every backend extraction point so the coverage is
uniform:

- `ClaudeAdapter::extract_key` (5 env / fallback branches) gets
  `.map(str::trim)` before `.filter(|s| !s.is_empty())` so that
  whitespace-only values are also treated as missing.
- `GeminiAdapter::extract_key_raw` gets the same chain (including
  the `.filter` it was missing before).
- `GeminiAdapter::parse_oauth_credentials` gets a defensive
  `let key = key.trim();` at the entry as a belt-and-suspenders guard.

Adds two regression tests covering JSON and bare `ya29.` keys with
leading newline/space.

* fix(proxy/gemini): gate generic REST suffix stripping behind Google host in non-full-URL mode

`build_gemini_native_url` unconditionally stripped `/v1`, `/v1beta`,
`/models`, and `/openai` suffixes from the base path regardless of
host. This worked for Google's own endpoints but silently rewrote
third-party relay URLs like `https://relay.example/custom/v1` to
`.../custom/v1beta/models/...`, breaking any relay that mounts its
Gemini-compatible namespace under a versioned prefix.

The result was also asymmetric with the previously-fixed full-URL
branch: toggling the "full URL" switch changed the outbound URL for
the same base_url, which is exactly the kind of invisible behavior
that makes debugging proxy deployments painful.

Align `normalize_gemini_base_path` with
`should_normalize_gemini_full_url`'s layered model:

- Unconditional: `/models/...:method` structured paths and deep
  OpenAI-compat endpoints (`/openai/chat/completions`,
  `/openai/responses` and their versioned variants) — these are
  unambiguous Gemini-specific grammar on any host.
- Google-host gated: generic `/v1`, `/v1beta`, `/models`, `/openai`
  suffixes only get stripped on `generativelanguage.googleapis.com`,
  `aiplatform.googleapis.com`, or `*-aiplatform.googleapis.com`.
  Other hosts preserve the prefix verbatim so relays keep their
  intended routing.

Adds seven regression tests for the non-full-URL flow: opaque relay
preservation (v1 / v1beta / models / openai suffix variants), Google
host normalization (counter-case), and boundary cases (structured
method path and deep OpenAI-compat endpoint stripped regardless of
host).

Test count: 864 -> 873.

* Revert "fix(proxy/gemini): gate generic REST suffix stripping behind Google host in non-full-URL mode"

This reverts commit d19ff09cb7.

* test(proxy/gemini): pin non-full-URL versioned relay base stripping

Adds two regression tests that lock in the intentional asymmetry
between full-URL and non-full-URL modes:

- Full-URL mode: opaque base path (e.g. `https://relay.example/custom/v1beta`)
  is preserved verbatim. Already covered by
  `preserves_opaque_full_url_with_bare_v1beta_suffix`.
- Non-full-URL mode: base path MUST strip `/v1`, `/v1beta`, etc. so the
  standard `/v1beta/models/{model}:method` endpoint can be appended
  without producing a doubled `/v1beta/v1beta/models/...` path.

The non-full-URL contract is "base URL + cc-switch appends the
canonical Gemini endpoint". A user who needs a relay's custom
namespace (e.g. `/v1/models/...`) must use full-URL mode and paste
the complete method path. This commit adds regression coverage so a
future attempt to mirror full-URL's host-whitelist gating into
`normalize_gemini_base_path` will fail the test suite immediately.

* chore(lint): address clippy 1.95 findings in existing modules

CI upgraded to Rust 1.95 and flagged ten pre-existing warnings that
older toolchains did not enforce. None relate to the Gemini proxy
integration PR itself but they block CI on the feature branch, so
clean them up here as a separate commit for easy review:

collapsible_match:
- proxy/providers/gemini_schema.rs: `"items" if value.is_object()`
  match guard instead of nested if.
- proxy/providers/transform_responses.rs: fold
  `map_responses_stop_reason`'s `"completed"` / `"incomplete"` arms
  into match guards, relying on the existing `_ => "end_turn"` fall-
  through for non-matching guard conditions (semantics preserved).
- services/session_usage_codex.rs: fold
  `"session_meta" if state.session_id.is_none()` guard, relying on
  the existing `_ => {}` fall-through.

unnecessary_sort_by:
- services/provider/endpoints.rs: `sort_by_key(|ep| Reverse(ep.added_at))`.
- services/skill.rs (backup list): same Reverse idiom on `created_at`.
- services/skill.rs (skill listings x2): `sort_by_key(|s| s.name.to_lowercase())`.

useless_conversion:
- services/skill.rs: drop the explicit `.into_iter()` on `zip`'s argument.

while_let_loop:
- services/webdav_auto_sync.rs: `while let Some(wait_for) = ...`
  instead of `loop { let Some(...) = ... else { break }; ... }`.

All changes are mechanical and preserve behavior. `cargo test --lib`
remains green (868 passed).

* fix(proxy/gemini): reconcile synthesized tool-call ids with later real ids + preserve thoughtSignature

Three related findings on `streaming_gemini.rs` for Gemini's cumulative
`streamGenerateContent` stream, all centered on `merge_tool_call_snapshots`:

1. (P1) Match upgraded tool-call IDs by position.
   When Gemini delivers a `functionCall` without an id on chunk 1
   (cc-switch synthesizes `gemini_synth_*`) and then upgrades it to a
   real id on chunk 2, the `Some(incoming_id)` branch only matched by
   id and missed the existing synthesized snapshot. A second entry
   would be pushed, yielding duplicate `tool_use` content blocks at
   stream end — one with the synthesized id, one with the real id —
   which could trigger duplicate tool execution and break tool_result
   correlation. Add a positional fallback: when no id match exists but
   the same-position slot holds a synthesized id, merge into it.
   `or(preserved_id)` already lets the real id win the merge.

2. (P2) Preserve prior thoughtSignature when merging snapshots.
   `tool_call_snapshots[index] = tool_call` overwrote the slot
   entirely, dropping any `thoughtSignature` captured on an earlier
   chunk if the current cumulative snapshot omitted it. Since
   `build_shadow_assistant_parts` writes `thoughtSignature` into the
   shadow turn from `tool_call.thought_signature`, a dropped signature
   would cause later replay requests to Gemini to be rejected with
   invalid-signature errors. Preserve the existing signature when the
   incoming chunk does not carry one.

3. (P2) Document the part-order streaming trade-off.
   All `tool_use` content blocks are emitted after the final text
   `content_block_stop`, so interleaved [text, functionCall, text,
   functionCall] parts arrive at the Anthropic client as [text(concat),
   tool_use, tool_use] — different from the non-streaming transformer,
   which preserves part order. This is intentional given the cumulative
   snapshot model and the consumers we target (claude-code-like clients
   don't depend on strict interleaving for tool execution correctness).
   Add a block comment at the flush site describing the trade-off and
   what a strict-order fix would entail, so this isn't rediscovered as
   a bug later.

Regression tests:
- upgraded_real_id_merges_into_existing_synthesized_snapshot
- thought_signature_preserved_when_later_chunk_omits_it

Test count: 868 -> 870. clippy 1.95 clean. fmt clean.

* fix(proxy/gemini): prefer exact tool-call id over normalized-name fallback

The shadow-turn matcher used a three-branch `||` chain (id / full name /
normalized name). When two tools share a suffix (e.g. `server_a:search`
and `server_b:search`), the normalized-name clause could short-circuit
on an earlier turn whose id is actually wrong for the incoming tool_use,
mis-routing replay state (functionCall id / thoughtSignature) for later
tool_result resolution.

Split matching into two layers: when the incoming message carries any
tool_use ids, run id-based lookup first and return on the earliest hit.
Only fall back to full-name / normalized-name matching when the incoming
ids are absent or none of them resolve.

Add two regressions:

- shadow_replay_prefers_exact_id_match_over_normalized_name_collision
  Two shadow turns with colliding normalized names and two assistant
  messages whose ids cross the positional order; asserts each message
  replays the id-correct shadow turn (including thoughtSignature).

- shadow_replay_falls_back_to_name_when_ids_absent
  Shadow turn with no id and incoming tool_use with an empty id;
  asserts the name fallback still populates the replayed part.

---------

Co-authored-by: Jason <farion1231@gmail.com>