Commit Graph

26 Commits

Author SHA1 Message Date
hkfires
ecc850bfb7 feat(executor): apply payload rules using requested model 2026-01-23 16:38:41 +08:00
hkfires
0b06d637e7 refactor: improve thinking logic 2026-01-15 13:06:39 +08:00
Luis Pater
43652d044c refactor(config): replace nonstream-keepalive with nonstream-keepalive-interval
- Updated `SDKConfig` to use `nonstream-keepalive-interval` (seconds) instead of the boolean `nonstream-keepalive`.
- Refactored handlers and logic to incorporate the new interval-based configuration.
- Updated config diff, tests, and example YAML to reflect the changes.
2026-01-13 03:14:38 +08:00
Luis Pater
b1b379ea18 feat(api): add non-streaming keep-alive support for idle timeout prevention
- Introduced `StartNonStreamingKeepAlive` to emit periodic blank lines during non-streaming responses.
- Added `nonstream-keepalive` configuration option in `SDKConfig`.
- Updated handlers to utilize `StartNonStreamingKeepAlive` and ensure proper cleanup.
- Extended config diff and tests to include `nonstream-keepalive` changes.
2026-01-13 02:36:07 +08:00
hkfires
a95428f204 fix(handlers): preserve upstream response logs before duplicate detection 2025-12-28 22:35:36 +08:00
hkfires
3ca5fb1046 fix(handlers): match raw error text before JSON body for duplicate detection 2025-12-28 19:35:36 +08:00
hkfires
a091d12f4e fix(logging): improve request/response capture 2025-12-28 19:04:31 +08:00
hkfires
09455f9e85 fix(config): make streaming keepalive and retries ints 2025-12-27 20:56:47 +08:00
hkfires
e76ba0ede9 feat(logging): implement request ID tracking and propagation 2025-12-24 08:32:17 +08:00
gwizz
5bf89dd757 fix: keep streaming defaults legacy-safe 2025-12-23 00:53:18 +11:00
gwizz
4442574e53 fix: stop streaming loop on context cancel 2025-12-23 00:37:55 +11:00
gwizz
71a6dffbb6 fix: improve streaming bootstrap and forwarding 2025-12-22 23:34:23 +11:00
Luis Pater
52b6306388 feat(config): add support for model prefixes and prefix normalization
Refactor model management to include an optional `prefix` field for model credentials, enabling better namespace handling. Update affected configuration files, APIs, and handlers to support prefix normalization and routing. Remove unused OpenAI compatibility provider logic to simplify processing.
2025-12-17 01:07:26 +08:00
hkfires
3bc489254b fix(api): prevent double logging for error responses
The WriteErrorResponse function now caches the error response body in the gin context.

The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.
2025-12-15 16:36:01 +08:00
hkfires
4c07ea41c3 feat(api): return structured JSON error responses
The API error handling is updated to return a structured JSON payload
instead of a plain text message. This provides more context and allows
clients to programmatically handle different error types.

The new error response has the following structure:
{
  "error": {
    "message": "...",
    "type": "..."
  }
}

The `type` field is determined by the HTTP status code, such as
`authentication_error`, `rate_limit_error`, or `server_error`.

If the underlying error message from an upstream service is already a
valid JSON string, it will be preserved and returned directly.

BREAKING CHANGE: API error responses are now in a structured JSON
format instead of plain text. Clients expecting plain text error
messages will need to be updated to parse the new JSON body.
2025-12-15 16:19:52 +08:00
Luis Pater
6e2306a5f2 refactor(handlers): improve request logging and payload handling 2025-12-12 08:52:52 +08:00
Luis Pater
423ce97665 feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
2025-12-11 03:10:50 +08:00
Luis Pater
506f1117dd **fix(handlers): refactor API response capture to append data safely**
- Introduced `appendAPIResponse` helper to preserve and append data to existing API responses.
- Ensured newline inclusion when appending, if necessary.
- Improved `nil` and data type checks for response handling.
- Updated middleware to skip request logging for `GET` requests.
2025-11-25 11:37:02 +08:00
Ben Vargas
8193392bfe Add AMP fallback proxy and shared Gemini normalization
- add fallback handler that forwards Amp provider requests to ampcode.com when the provider isn’t configured locally
- wrap AMP provider routes with the fallback so requests always have a handler
- share Gemini thinking model normalization helper between core handlers and AMP fallback
2025-11-19 18:23:17 -07:00
TUGOhost
92f4278039 feat: add auto model resolution and model creation timestamp tracking
- Add 'created' field to model registry for tracking model creation time
- Implement GetFirstAvailableModel() to find the first available model by newest creation timestamp
- Add ResolveAutoModel() utility function to resolve "auto" model name to actual available model
- Update request handler to resolve "auto" model before processing requests
- Ensures automatic model selection when "auto" is specified as model name

This enables dynamic model selection based on availability and creation time, improving the user experience when no specific model is requested.
2025-11-11 20:30:09 +08:00
tobwen
e5ed2cba4a Add support for dynamic model providers
Implements functionality to parse model names with provider information in the format "provider://model" This allows dynamic provider selection rather than relying only on predefined mappings.

The change affects all execution methods to properly handle these dynamic model specifications while maintaining compatibility with the existing approach for standard model names.
2025-10-28 01:41:54 +01:00
Luis Pater
d225558dae feat: improve error handling with added status codes and headers
- Updated Execute methods to include enhanced error handling via `StatusCode` and `Headers` extraction.
- Introduced structured error responses for cooling down scenarios, providing additional metadata and retry suggestions.
- Refined quota management, allowing for differentiation between cool-down, disabled, and other block reasons.
- Improved model filtering logic based on client availability and suspension criteria.
2025-10-22 09:01:11 +08:00
Luis Pater
ade279d1f2 Feature: #103
feat(gemini): add Gemini thinking configuration support and metadata normalization

- Introduced logic to parse and apply `thinkingBudget` and `include_thoughts` configurations from metadata.
- Enhanced request handling to include normalized Gemini model metadata, preserving the original model identifier.
- Updated Gemini and Gemini-CLI executors to apply thinking configuration based on metadata overrides.
- Refactored handlers to support metadata extraction and cloning during request preparation.
2025-10-16 11:31:18 +08:00
hkfires
c3f88126e6 refactor(provider): remove Gemini Web cookie-based support 2025-10-11 12:56:07 +08:00
hkfires
82187bffba feat(gemini-web): Add conversation affinity selector 2025-09-30 12:21:51 +08:00
Luis Pater
57c9ba49f4 refactor(config): migrate to SDKConfig and streamline proxy handling
- Replaced `config.Config` with `config.SDKConfig` across components for simpler configuration management.
- Updated proxy setup functions and handlers to align with `SDKConfig` improvements.
- Reorganized handler imports to match new SDK structure.
2025-09-27 04:50:23 +08:00