4.5 Model Test
Overview
The model test feature sends real API requests to verify that a provider's configured model is available. It checks:
- Whether the model exists
- Whether the API Key is valid
- Whether the endpoint responds normally
- Whether the response latency is acceptable
Open Configuration
Settings > Advanced > Model Test Config
Test Model Configuration
Configure the model used for testing per application:
| Application | Setting | Default | Notes |
|---|---|---|---|
| Claude | Claude Model | System default | Haiku series recommended (low cost, fast) |
| Codex | Codex Model | System default | Mini series recommended |
| Gemini | Gemini Model | System default | Flash series recommended |
Model Selection Tips
When choosing a test model, consider:
- Cost: Choose lower-priced models (e.g., Haiku, Mini, Flash)
- Speed: Choose fast-responding models
- Availability: Choose models supported by the provider
Test Parameter Configuration
Timeout
| Parameter | Description | Default | Range |
|---|---|---|---|
| Timeout | Single request timeout | 45 seconds | 10-120 seconds |
A timeout that is too short can mark a working provider as unavailable; one that is too long delays fault detection.
Retries
| Parameter | Description | Default | Range |
|---|---|---|---|
| Max Retries | Retries after failure | 2 times | 0-5 times |
Increase retries when the network is unstable.
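The retry setting can be pictured with a minimal sketch. It assumes that "Max Retries" counts extra attempts after the first failure, so a value of 2 allows up to 3 requests in total; the manual does not spell this out. The function name `run_with_retries` and the `attempt` callback are hypothetical and do not come from the application.

```python
from typing import Callable


def run_with_retries(attempt: Callable[[], bool], max_retries: int = 2) -> tuple[bool, int]:
    """Call `attempt` until it succeeds or the retry budget is spent.

    Returns (succeeded, attempts_made). Assumes max_retries counts the
    additional tries after the first failed request.
    """
    attempts = 0
    for _ in range(max_retries + 1):  # first try plus the allowed retries
        attempts += 1
        if attempt():
            return True, attempts
    return False, attempts
```

With the default of 2, a request that fails twice and then succeeds is reported as a success on the third attempt; with 0, a single failure ends the test.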
Degradation Threshold
| Parameter | Description | Default | Range |
|---|---|---|---|
| Degradation Threshold | Responses exceeding this time are marked as degraded | 6000 ms | 1000-30000 ms |
Providers whose response time exceeds the threshold are marked as "degraded" but remain usable.
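As a rough illustration, the three parameters and their documented ranges could be modeled as follows. The defaults and ranges are taken from the tables above; the class name `ModelTestConfig` and its field names are hypothetical and are not the application's actual settings schema.

```python
from dataclasses import dataclass


def _check(name: str, value: int, low: int, high: int) -> None:
    """Raise ValueError when value falls outside the inclusive range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass(frozen=True)
class ModelTestConfig:
    # Hypothetical field names; defaults copied from the tables above.
    timeout_secs: int = 45             # allowed: 10-120 seconds
    max_retries: int = 2               # allowed: 0-5
    degraded_threshold_ms: int = 6000  # allowed: 1000-30000 ms

    def __post_init__(self) -> None:
        _check("timeout_secs", self.timeout_secs, 10, 120)
        _check("max_retries", self.max_retries, 0, 5)
        _check("degraded_threshold_ms", self.degraded_threshold_ms, 1000, 30000)
```

Constructing `ModelTestConfig()` yields the documented defaults, while an out-of-range value such as `timeout_secs=5` raises `ValueError`.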
Execute Model Test
Manual Test
Click the "Test" button on the provider card. The application then:
- Sends a test request to the configured endpoint
- Uses the configured test model
- Waits for response or timeout
- Displays the test result
Test Content
The test request:
- Sends a short prompt (e.g., "Hi")
- Limits maximum output tokens (typically 10-50)
- Uses a streaming response to measure time to first byte (TTFB)
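To show why streaming matters here, the sketch below times the first non-empty chunk of a stream separately from the full response. It is not the application's implementation: `measure_stream` is a hypothetical helper, and `fake_stream` stands in for a provider's streaming reply so that the example runs without network access.

```python
import time
from typing import Iterable, Iterator, Optional


def measure_stream(chunks: Iterable[bytes]) -> tuple[Optional[float], float]:
    """Return (ttfb_ms, total_ms) for a streamed response.

    ttfb_ms is None when the stream ends without yielding any data.
    """
    start = time.perf_counter()
    ttfb_ms: Optional[float] = None
    for chunk in chunks:
        if ttfb_ms is None and chunk:
            # The first non-empty chunk marks the time to first byte.
            ttfb_ms = (time.perf_counter() - start) * 1000
    total_ms = (time.perf_counter() - start) * 1000
    return ttfb_ms, total_ms


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a provider's streaming reply (illustration only)."""
    time.sleep(0.02)
    yield b"Hi"
    time.sleep(0.02)
    yield b" there"
```

Calling `measure_stream(fake_stream())` returns a TTFB that is smaller than the total latency, which is the distinction a non-streaming request cannot make.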
Test Results
Health Status
| Status | Icon | Description |
|---|---|---|
| Healthy | Green | Normal response, latency within threshold |
| Degraded | Yellow | Normal response, but latency exceeds threshold |
| Unavailable | Red | Request failed or timed out |
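The table maps onto a small decision function. This is a sketch under stated assumptions: the function name `classify` is hypothetical, a failed or timed-out request is represented as `None`, and the manual does not say whether the threshold is compared against total latency or TTFB.

```python
from typing import Optional


def classify(latency_ms: Optional[float], degraded_threshold_ms: int = 6000) -> str:
    """Map a test outcome to one of the three health statuses.

    latency_ms is None when the request failed or timed out (an assumed
    convention for this sketch).
    """
    if latency_ms is None:
        return "unavailable"
    if latency_ms > degraded_threshold_ms:
        return "degraded"  # responded, but slower than the threshold
    return "healthy"       # responded within the threshold
```

With the default threshold, a 1200 ms response is classified as healthy and a 7500 ms response as degraded.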
Result Information
After the test completes, the result shows:
- Response latency (milliseconds)
- Time to first byte (TTFB)
- Error message (if failed)
Integration with Failover
Model testing works together with the failover feature.
Health Checks
After enabling the proxy service, the system periodically performs health checks on providers in the failover queue:
- Sends a request using the configured test model
- Updates health status based on the response
- Unhealthy providers are temporarily skipped
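The skipping behavior can be sketched as a walk over the failover queue. Two assumptions are made here that the manual does not confirm: only providers marked "unavailable" are skipped (degraded ones remain usable, as stated earlier), and a provider with no recorded status is treated as usable. The name `pick_provider` is hypothetical.

```python
from typing import Optional


def pick_provider(queue: list[str], status: dict[str, str]) -> Optional[str]:
    """Return the first provider in queue order that is not unavailable.

    Providers missing from `status` are assumed usable (not yet tested).
    """
    for name in queue:
        if status.get(name, "healthy") != "unavailable":
            return name
    return None  # every provider in the queue is currently skipped
```

For a queue of `["a", "b", "c"]` where `a` is unavailable and `b` is degraded, the function returns `"b"`.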
Circuit Breaker Recovery
When a provider is due to recover from a circuit-broken state:
- The system performs a model test to verify availability
- If the test passes, normal status is restored
- If the test fails, the circuit breaker remains active
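The recovery rule reduces to a single decision, sketched below. The function `try_recover` and the `run_model_test` callback are hypothetical; how the application decides when a recovery attempt is due is not described here and is left out.

```python
from typing import Callable


def try_recover(circuit_open: bool, run_model_test: Callable[[], bool]) -> bool:
    """Return the new circuit state: True means the breaker stays active.

    The model test runs only when the breaker is currently open.
    """
    if not circuit_open:
        return False  # nothing to recover from
    passed = run_model_test()
    return not passed  # pass: restore normal status; fail: stay open
```

A passing test closes the breaker (`try_recover(True, lambda: True)` returns `False`), while a failing test leaves it active.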
FAQ
Test Fails but the Provider Is Actually Available
Possible causes:
- The test model differs from the model actually in use
- The provider doesn't support the configured test model
Solutions:
- Change the test model to one supported by the provider
- Check the provider's model list
High Latency
Possible causes:
- Network latency
- High server load on the provider
- Slow model response
Solutions:
- Use a faster test model
- Adjust the degradation threshold
- Consider using mirror endpoints
Frequent Timeouts
Possible causes:
- Timeout set too short
- Unstable network
- Unstable provider service
Solutions:
- Increase the timeout
- Increase retry count
- Check network connection
Notes
- Model testing consumes a small amount of API quota
- Use low-cost models for testing
- Avoid testing too frequently, so that quota is not wasted
- Different providers may support different models