Feat/usage model extraction (#455)

* feat(proxy): extract model name from API response for accurate usage tracking

- Add model field extraction in TokenUsage parsing for Claude, OpenAI, and Codex
- Prioritize response model over request model in usage logging
- Update model extractors to use parsed usage.model first
- Add tests for model extraction in stream and non-stream responses

* feat(proxy): implement streaming timeout control with validation

- Add first byte timeout (0 or 1-180s) for streaming requests
- Add idle timeout (0 or 60-600s) for streaming data gaps
- Add non-streaming timeout (0 or 60-1800s) for total request
- Implement timeout logic in response processor
- Add 1800s global timeout fallback when disabled
- Add database schema migration for timeout fields
- Add i18n translations for timeout settings

* feat(proxy): add model mapping module for provider-based model substitution

- Add model_mapper.rs with ModelMapping struct to extract model configs from Provider
- Support ANTHROPIC_MODEL, ANTHROPIC_REASONING_MODEL, and default models for haiku/sonnet/opus
- Implement thinking mode detection for reasoning model priority
- Include comprehensive unit tests for all mapping scenarios

* fix(proxy): bypass circuit breaker for single provider scenario

When failover is disabled (single provider), circuit breaker open state
would block all requests causing poor UX. Now bypasses circuit breaker
check in this scenario. Also integrates model mapping into request flow.

* feat(ui): add reasoning model field to Claude provider form

Add ANTHROPIC_REASONING_MODEL configuration field for Claude providers,
allowing users to specify a dedicated model for thinking/reasoning tasks.

* feat(proxy): add openrouter_compat_mode for optional format conversion

Add configurable OpenRouter compatibility mode that enables Anthropic to
OpenAI format conversion. When enabled, rewrites endpoint to /v1/chat/completions
and transforms request/response formats. Defaults to enabled for OpenRouter.

* feat(ui): add OpenRouter compatibility mode toggle

Add UI toggle for OpenRouter providers to enable/disable compatibility
mode which uses OpenAI Chat Completions format with SSE conversion.

* feat(stream-check): use provider-configured model for health checks

Extract model from provider's settings_config (ANTHROPIC_MODEL, GEMINI_MODEL,
or Codex config.toml) instead of always using default test models.

* refactor(ui): remove timeout settings from AutoFailoverConfigPanel

Remove streaming/non-streaming timeout configuration from failover panel
as these settings have been moved to a dedicated location.

* refactor(database): migrate proxy_config to per-app three-row structure

Replace singleton proxy_config table with app_type primary key structure,
allowing independent proxy settings for Claude, Codex, and Gemini.
Add GlobalProxyConfig queries and per-app config management in DAO layer.

* feat(proxy): add GlobalProxyConfig and AppProxyConfig types

Add new type definitions for the refactored proxy configuration:
- GlobalProxyConfig: shared settings (enabled, address, port, logging)
- AppProxyConfig: per-app settings (failover, timeouts, circuit breaker)

* refactor(proxy): update service layer for per-app config structure

Adapt proxy service, handler context, and provider router to use
the new per-app configuration model. Read enabled/timeout settings
from proxy_config table instead of settings table.

* feat(commands): add global and per-app proxy config commands

Add new Tauri commands for the refactored proxy configuration:
- get_global_proxy_config / update_global_proxy_config
- get_proxy_config_for_app / update_proxy_config_for_app
Update startup restore logic to read from proxy_config table.

* feat(api): add frontend API and Query hooks for proxy config

Add TypeScript wrappers and TanStack Query hooks for:
- Global proxy config (address, port, logging)
- Per-app proxy config (failover, timeouts, circuit breaker)
- Proxy takeover status management

* refactor(ui): redesign proxy panel with inline config controls

Replace ProxySettingsDialog with inline controls in ProxyPanel.
Add per-app takeover switches and global address/port settings.
Simplify AutoFailoverConfigPanel by removing timeout settings.

* feat(i18n): add proxy takeover translations and update types

Add i18n strings for proxy takeover status in zh/en/ja.
Update TypeScript types for GlobalProxyConfig and AppProxyConfig.

* refactor(proxy): load circuit breaker config per-app instead of globally

Extract app_type from router key and read circuit breaker settings
from the corresponding proxy_config row for each application.
This commit is contained in:
YoVinchen
2025-12-25 10:40:11 +08:00
committed by GitHub
parent 7d495aa772
commit e6f18ba801
37 changed files with 2813 additions and 1095 deletions
+69 -14
View File
@@ -3,8 +3,11 @@
//! 统一处理流式和非流式 API 响应
use super::{
handler_config::UsageParserConfig, handler_context::RequestContext, server::ProxyState,
usage::parser::TokenUsage, ProxyError,
handler_config::UsageParserConfig,
handler_context::{RequestContext, StreamingTimeoutConfig},
server::ProxyState,
usage::parser::TokenUsage,
ProxyError,
};
use axum::response::Response;
use bytes::Bytes;
@@ -17,6 +20,7 @@ use std::{
atomic::{AtomicBool, Ordering},
Arc,
},
time::Duration,
};
use tokio::sync::Mutex;
@@ -60,8 +64,12 @@ pub async fn handle_streaming(
// 创建使用量收集器
let usage_collector = create_usage_collector(ctx, state, status.as_u16(), parser_config);
// 创建带日志的透传流
let logged_stream = create_logged_passthrough_stream(stream, ctx.tag, Some(usage_collector));
// 获取流式超时配置
let timeout_config = ctx.streaming_timeout_config();
// 创建带日志和超时的透传流
let logged_stream =
create_logged_passthrough_stream(stream, ctx.tag, Some(usage_collector), timeout_config);
let body = axum::body::Body::from_stream(logged_stream);
builder.body(body).unwrap()
@@ -93,12 +101,16 @@ pub async fn handle_non_streaming(
// 解析使用量
if let Some(usage) = (parser_config.response_parser)(&json_value) {
let model = json_value
.get("model")
.and_then(|m| m.as_str())
.unwrap_or(&ctx.request_model);
// 优先使用 usage 中解析出的模型名称,其次使用响应中的 model 字段,最后回退到请求模型
let model = if let Some(ref m) = usage.model {
m.clone()
} else if let Some(m) = json_value.get("model").and_then(|m| m.as_str()) {
m.to_string()
} else {
ctx.request_model.clone()
};
spawn_log_usage(state, ctx, usage, model, status.as_u16(), false);
spawn_log_usage(state, ctx, usage, &model, status.as_u16(), false);
} else {
log::debug!(
"[{}] 未能解析 usage 信息,跳过记录",
@@ -344,21 +356,60 @@ async fn log_usage_internal(
}
}
/// 创建带日志记录的透传流
/// 创建带日志记录和超时控制的透传流
pub fn create_logged_passthrough_stream(
stream: impl Stream<Item = Result<Bytes, std::io::Error>> + Send + 'static,
tag: &'static str,
usage_collector: Option<SseUsageCollector>,
timeout_config: StreamingTimeoutConfig,
) -> impl Stream<Item = Result<Bytes, std::io::Error>> + Send {
async_stream::stream! {
let mut buffer = String::new();
let mut collector = usage_collector;
let mut is_first_chunk = true;
// 超时配置
let first_byte_timeout = if timeout_config.first_byte_timeout > 0 {
Some(Duration::from_secs(timeout_config.first_byte_timeout))
} else {
None
};
let idle_timeout = if timeout_config.idle_timeout > 0 {
Some(Duration::from_secs(timeout_config.idle_timeout))
} else {
None
};
tokio::pin!(stream);
while let Some(chunk) = stream.next().await {
match chunk {
Ok(bytes) => {
loop {
// 选择超时时间:首字节超时或静默期超时
let timeout_duration = if is_first_chunk {
first_byte_timeout
} else {
idle_timeout
};
let chunk_result = match timeout_duration {
Some(duration) => {
match tokio::time::timeout(duration, stream.next()).await {
Ok(Some(chunk)) => Some(chunk),
Ok(None) => None, // 流结束
Err(_) => {
// 超时
let timeout_type = if is_first_chunk { "首字节" } else { "静默期" };
log::error!("[{tag}] 流式响应{}超时 ({}秒)", timeout_type, duration.as_secs());
yield Err(std::io::Error::other(format!("流式响应{timeout_type}超时")));
break;
}
}
}
None => stream.next().await, // 无超时限制
};
match chunk_result {
Some(Ok(bytes)) => {
is_first_chunk = false;
let text = String::from_utf8_lossy(&bytes);
buffer.push_str(&text);
@@ -394,11 +445,15 @@ pub fn create_logged_passthrough_stream(
yield Ok(bytes);
}
Err(e) => {
Some(Err(e)) => {
log::error!("[{tag}] 流错误: {e}");
yield Err(std::io::Error::other(e.to_string()));
break;
}
None => {
// 流正常结束
break;
}
}
}