Feat/usage improvements (#508)

* i18n: update cache terminology across all languages

- Change 'Cache Read' to 'Cache Hit' in all languages
- Change 'Cache Write' to 'Cache Creation' in all languages
- Update zh: 缓存读取 → 缓存命中, 缓存写入 → 缓存创建
- Update en: Cache Read → Cache Hit, Cache Write → Cache Creation
- Update ja: キャッシュ読取 → キャッシュヒット, キャッシュ書込 → キャッシュ作成

Affected keys: cacheReadTokens, cacheCreationTokens, cacheReadCost,
cacheWriteCost, cacheRead, cacheWrite

* feat(usage): add cache metrics to trend chart

- Add cache creation tokens visualization (orange line)
- Add cache hit tokens visualization (purple line)
- Add gradient definitions for new cache metrics
- Include cache data in hourly aggregation
- Display cache metrics alongside input/output tokens

This provides better visibility into cache usage patterns over time.

* fix(usage): fix timezone handling in datetime picker

- Add timestampToLocalDatetime() to convert Unix timestamp to local datetime
- Add localDatetimeToTimestamp() with validation for incomplete input
- Fix issue where typing hours/minutes would jump to previous day
- Validate datetime format completeness before conversion
- Use local timezone instead of UTC for datetime-local input

This resolves the issue where users couldn't fine-tune time selection
and the input would jump unexpectedly when editing hours or minutes.

* feat(usage): add auto-refresh for usage statistics

- Add 30-second auto-refresh interval for all usage queries
- Disable background refresh to save resources
- Apply to: summary, trends, provider stats, model stats, request logs
- Queries automatically update when tab is active
- Pause refresh when user switches to another tab

This keeps usage data fresh without manual refresh.

* fix(proxy): improve usage logging and cache token parsing

- Log requests even when usage parsing fails (with default values)
- Add detailed debug logging for usage metrics
- Support cache_read_input_tokens field in Codex responses
- Fallback to input_tokens_details.cached_tokens if needed
- Add test case for cached_tokens in input_tokens_details
- Ensure all requests are tracked in database for analytics

This fixes missing request logs when API responses lack usage data
and improves cache token detection across different response formats.

* style(rust): use inline format args in format! macros

- Replace format!("...", var) with format!("...{var}")
- Update universal provider ID formatting
- Update error message formatting
- Update config.toml generation in Codex provider

Fixes clippy::uninlined_format_args warnings.

* feat(proxy): enhance provider router logging

- Add debug logs for failover queue provider count
- Log circuit breaker state for each provider check
- Add logs for missing current provider scenarios
- Log when no current provider is configured
- Use inline format args for better readability

This improves debugging of provider selection and failover behavior.

* feat(database): update model pricing data

- Update Claude models to full version format (e.g. claude-opus-4-5-20251101)
- Add GPT-5.2 series model pricing (10 models)
- Add GPT-5.1 series model pricing (10 models)
- Add GPT-5 series model pricing (12 models)
- Add Gemini 3 series model pricing (2 models)
- Update Gemini 2.5 series model ID format (use dot separator)
- Unify display names by removing thinking level suffixes

* fix(usage): correct Gemini output token calculation

Fix Gemini API output token parsing to use totalTokenCount - promptTokenCount
instead of candidatesTokenCount alone. This ensures thoughtsTokenCount is
included in output statistics.

- Update from_gemini_response to calculate output from total - input
- Update from_gemini_stream_chunks with same logic for consistency
- Fix from_codex_stream_events to use adjusted token calculation
- Add test case for responses with thoughtsTokenCount
- Update existing tests to match new calculation logic

* fix(usage): correct cache token billing and add Codex format auto-detection

- Avoid double-billing cache tokens by subtracting from input before calculation
- Add smart Codex parser that auto-detects OpenAI vs Codex API format
- Extract model name from Codex responses for accurate tracking

* fix(proxy): improve takeover detection with live config check

- Add live config takeover detection for hot-switch decision
- Rebuild takeover when backup is missing or placeholder remains
- Make detect_takeover_in_live_config_for_app public
- Fix is_takeover_active to use actual takeover status

* refactor(usage): simplify model pricing lookup by removing suffix fallback

Replace complex suffix-stripping fallback with direct prefix/suffix cleanup.
Model IDs are now cleaned by removing vendor prefix (before /) and colon
suffix (after :), then matched exactly against pricing table.

* feat(database): add Chinese AI model pricing data

Add pricing for domestic AI models (CNY/1M tokens):
- Doubao-Seed-Code (ByteDance)
- DeepSeek V3/V3.1/V3.2
- Kimi K2/K2-Thinking/K2-Turbo (Moonshot)
- MiniMax M2/M2.1/M2.1-Lightning
- GLM-4.6/4.7 (Zhipu)
- Mimo V2 Flash (Xiaomi)

Also fix test case to use correct model ID and remove invalid currency column.

* refactor(proxy): improve header forwarding with blacklist approach

Change from whitelist to blacklist mode for request header forwarding.
Only skip headers that will be overridden (auth, host, content-length).
This preserves client's original headers and improves compatibility.

* fix(proxy): bypass timeout and retry configs when failover is disabled

When auto_failover_enabled is false, timeout and retry configurations
should not affect normal request flow. This change ensures:

- create_forwarder: passes 0 for all timeout/retry params when failover
  is disabled, effectively bypassing these checks
- streaming_timeout_config: returns 0 for both first_byte_timeout and
  idle_timeout when failover is disabled

This prevents unnecessary timeout errors and retry attempts when users
have explicitly disabled the failover feature.

* fix(proxy): handle zero value input in failover config fields

* refactor(proxy): remove retry logic and add enabled check for failover

* refactor(proxy): distinguish circuit-open from no-provider errors

* Align usage stats to sliding windows

* feat(proxy): add body and header filtering for upstream requests

* feat(proxy): enable transparent passthrough for headers

- Passthrough anthropic-beta header as-is from client
- Passthrough anthropic-version header from client
- Passthrough client IP headers (x-forwarded-for, x-real-ip) by default
- Filter private params (underscore-prefixed fields) from request body
- No database changes required

* feat(proxy): extract session ID from client requests for logging

- Add SessionIdExtractor to parse session ID from Claude/Codex requests
- Support extraction from metadata.user_id, headers, previous_response_id
- Pass session_id through RequestContext to usage logger
- Enable request correlation by session in proxy_request_logs
This commit is contained in:
Dex Miller
2025-12-31 22:57:00 +08:00
committed by GitHub
parent d0431b66ae
commit 5376ea042b
31 changed files with 1888 additions and 583 deletions
+34 -18
View File
@@ -62,6 +62,28 @@ export function RequestLogTable() {
});
};
// 将 Unix 时间戳转换为本地时间的 datetime-local 格式
const timestampToLocalDatetime = (timestamp: number): string => {
const date = new Date(timestamp * 1000);
const year = date.getFullYear();
const month = String(date.getMonth() + 1).padStart(2, "0");
const day = String(date.getDate()).padStart(2, "0");
const hours = String(date.getHours()).padStart(2, "0");
const minutes = String(date.getMinutes()).padStart(2, "0");
return `${year}-${month}-${day}T${hours}:${minutes}`;
};
// 将 datetime-local 格式转换为 Unix 时间戳
const localDatetimeToTimestamp = (datetime: string): number | undefined => {
if (!datetime) return undefined;
// 验证格式是否完整 (YYYY-MM-DDTHH:mm)
if (datetime.length < 16) return undefined;
const timestamp = new Date(datetime).getTime();
// 验证是否为有效日期
if (isNaN(timestamp)) return undefined;
return Math.floor(timestamp / 1000);
};
const dateLocale =
i18n.language === "zh"
? "zh-CN"
@@ -153,19 +175,16 @@ export function RequestLogTable() {
className="h-8 w-[200px] bg-background"
value={
tempFilters.startDate
? new Date(tempFilters.startDate * 1000)
.toISOString()
.slice(0, 16)
? timestampToLocalDatetime(tempFilters.startDate)
: ""
}
onChange={(e) =>
onChange={(e) => {
const timestamp = localDatetimeToTimestamp(e.target.value);
setTempFilters({
...tempFilters,
startDate: e.target.value
? Math.floor(new Date(e.target.value).getTime() / 1000)
: undefined,
})
}
startDate: timestamp,
});
}}
/>
<span>-</span>
<Input
@@ -173,19 +192,16 @@ export function RequestLogTable() {
className="h-8 w-[200px] bg-background"
value={
tempFilters.endDate
? new Date(tempFilters.endDate * 1000)
.toISOString()
.slice(0, 16)
? timestampToLocalDatetime(tempFilters.endDate)
: ""
}
onChange={(e) =>
onChange={(e) => {
const timestamp = localDatetimeToTimestamp(e.target.value);
setTempFilters({
...tempFilters,
endDate: e.target.value
? Math.floor(new Date(e.target.value).getTime() / 1000)
: undefined,
})
}
endDate: timestamp,
});
}}
/>
</div>