Commit Graph

1091 Commits

Author SHA1 Message Date
Luis Pater
52b6306388 feat(config): add support for model prefixes and prefix normalization
Refactor model management to include an optional `prefix` field for model credentials, enabling better namespace handling. Update affected configuration files, APIs, and handlers to support prefix normalization and routing. Remove unused OpenAI compatibility provider logic to simplify processing.
2025-12-17 01:07:26 +08:00
hkfires
521ec6f1b8 fix(watcher): simplify vertex apikey idKind to exclude base suffix 2025-12-16 22:55:38 +08:00
hkfires
b0c5d9640a refactor(diff): improve security and stability of config change detection
Introduce formatProxyURL helper to sanitize proxy addresses before
logging, stripping credentials and path components while preserving
host information. Rework model hash computation to sort and deduplicate
name/alias pairs with case normalization, ensuring consistent output
regardless of input ordering. Add signature-based identification for
anonymous OpenAI-compatible provider entries to maintain stable keys
across configuration reloads. Replace direct stdout prints with
structured logger calls for file change notifications.
2025-12-16 22:39:19 +08:00
hkfires
ef8e94e992 refactor(watcher): extract config diff helpers
Break out config diffing, hashing, and OpenAI compatibility utilities into a dedicated diff package, update watcher to consume them, and add comprehensive tests for diff logic and watcher behavior.
2025-12-16 21:45:33 +08:00
hkfires
9df96a4bb4 test(thinking): add effort to budget coverage 2025-12-16 18:34:43 +08:00
hkfires
28a428ae2f fix(thinking): align budget effort mapping across translators
Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.
2025-12-16 18:34:43 +08:00
hkfires
b326ec3641 feat(iflow): add thinking support for iFlow models 2025-12-16 18:34:43 +08:00
Luis Pater
fcecbc7d46 Merge pull request #562 from thomasvan/fix/openai-claude-message-start-order
fix(translator): emit message_start on first chunk regardless of role field
2025-12-16 16:54:58 +08:00
Thong Van
f4007f53ba fix(translator): emit message_start on first chunk regardless of role field
Some OpenAI-compatible providers (like GitHub Copilot) may send tool_calls
in the first streaming chunk without including the role field. The previous
implementation only emitted message_start when the first chunk contained
role="assistant", causing Anthropic protocol violations when tool calls
arrived first.

This fix ensures message_start is always emitted on the very first chunk,
preventing 'content_block_start before message_start' errors in clients
that strictly validate Anthropic SSE event ordering.
2025-12-16 13:01:09 +07:00
Luis Pater
5a812a1e93 feat(remote-management): add support for custom GitHub repository for panel updates
Introduce `panel-github-repository` in the configuration to allow specifying a custom repository for management panel assets. Update dependency versions and enhance asset URL resolution logic to support overrides.
v6.6.18
2025-12-16 13:09:26 +08:00
Chén Mù
5e624cc7b1 Merge pull request #558 from router-for-me/worker
chore: ignore .bmad directory
2025-12-16 09:24:32 +08:00
Luis Pater
3af24597ee docs: remove Amp CLI integration guides and update references v6.6.17 2025-12-15 23:50:56 +08:00
hkfires
e0be6c5786 chore: ignore .bmad directory 2025-12-15 20:53:43 +08:00
Luis Pater
88b101ebf5 Merge pull request #549 from router-for-me/log
Improve Request Logging Efficiency and Standardize Error Responses
2025-12-15 20:43:12 +08:00
Luis Pater
d9a65745df fix(translator): handle empty item type and string content in OpenAI response parser v6.6.16 2025-12-15 20:35:52 +08:00
hkfires
97ab623d42 fix(api): prevent double logging for streaming responses 2025-12-15 18:00:32 +08:00
hkfires
14aa6cc7e8 fix(api): ensure all response writes are captured for logging
The response writer wrapper has been refactored to more reliably capture response bodies for logging, fixing several edge cases.

- Implements `WriteString` to capture writes from `io.StringWriter`, which were previously missed by the `Write` method override.
- A new `shouldBufferResponseBody` helper centralizes the logic to ensure the body is buffered only when logging is active or for errors when `logOnErrorOnly` is enabled.
- Streaming detection is now more robust. It correctly handles non-streaming error responses (e.g., `application/json`) that are generated for a request that was intended to be streaming.

BREAKING CHANGE: The public methods `Status()`, `Size()`, and `Written()` have been removed from the `ResponseWriterWrapper` as they are no longer required by the new implementation.
2025-12-15 17:45:16 +08:00
hkfires
3bc489254b fix(api): prevent double logging for error responses
The WriteErrorResponse function now caches the error response body in the gin context.

The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.
2025-12-15 16:36:01 +08:00
hkfires
4c07ea41c3 feat(api): return structured JSON error responses
The API error handling is updated to return a structured JSON payload
instead of a plain text message. This provides more context and allows
clients to programmatically handle different error types.

The new error response has the following structure:
{
  "error": {
    "message": "...",
    "type": "..."
  }
}

The `type` field is determined by the HTTP status code, such as
`authentication_error`, `rate_limit_error`, or `server_error`.

If the underlying error message from an upstream service is already a
valid JSON string, it will be preserved and returned directly.

BREAKING CHANGE: API error responses are now in a structured JSON
format instead of plain text. Clients expecting plain text error
messages will need to be updated to parse the new JSON body.
2025-12-15 16:19:52 +08:00
Luis Pater
f6720f8dfa Merge pull request #547 from router-for-me/amp
feat(amp): require API key authentication for management routes
v6.6.15
2025-12-15 16:14:49 +08:00
Chén Mù
e19ab3a066 Merge pull request #543 from router-for-me/log
feat(auth): add proxy information to debug logs
2025-12-15 15:59:16 +08:00
hkfires
8f1dd69e72 feat(amp): require API key authentication for management routes
All Amp management endpoints (e.g., /api/user, /threads) are now protected by the standard API key authentication middleware. This ensures that all management operations require a valid API key, significantly improving security.

As a result of this change:
- The `restrict-management-to-localhost` setting now defaults to `false`. API key authentication provides a stronger and more flexible security control than IP-based restrictions, improving usability in containerized environments.
- The reverse proxy logic now strips the client's `Authorization` header after authenticating the initial request. It then injects the configured `upstream-api-key` for the request to the upstream Amp service.

BREAKING CHANGE: Amp management endpoints now require a valid API key for authentication. Requests without a valid API key in the `Authorization` header will be rejected with a 401 Unauthorized error.
2025-12-15 13:24:53 +08:00
hkfires
f26da24a2f feat(auth): add proxy information to debug logs 2025-12-15 13:14:55 +08:00
Luis Pater
8e4fbcaa7d Merge pull request #533 from router-for-me/think
refactor(thinking): centralize reasoning effort mapping and normalize budget values
v6.6.14
2025-12-15 10:34:41 +08:00
hkfires
09c339953d fix(openai): forward reasoning.effort value
Drop the hardcoded effort mapping in request conversion so
unknown values are preserved instead of being coerced to `auto
2025-12-15 09:16:15 +08:00
hkfires
367a05bdf6 refactor(thinking): export thinking helpers
Expose thinking/effort normalization helpers from the executor package
so conversion tests use production code and stay aligned with runtime
validation behavior.
2025-12-15 09:16:15 +08:00
hkfires
d20b71deb9 fix(thinking): normalize effort mapping
Route OpenAI reasoning effort through ThinkingEffortToBudget for Claude
translators, preserve "minimal" when translating OpenAI Responses, and
treat blank/unknown efforts as no-ops for Gemini thinking configs.

Also map budget -1 to "auto" and expand cross-protocol thinking tests.
2025-12-15 09:16:15 +08:00
hkfires
712ce9f781 fix(thinking): drop unsupported none effort
When budget 0 maps to "none" for models that use thinking levels
but don't support that effort level, strip thinking fields instead
of setting an invalid reasoning_effort value.
Tests now expect removal for this edge case.
2025-12-15 09:16:14 +08:00
hkfires
a4a3274a55 test(thinking): expand conversion edge case coverage 2025-12-15 09:16:14 +08:00
hkfires
716aa71f6e fix(thinking): centralize reasoning_effort mapping
Move OpenAI `reasoning_effort` -> Gemini `thinkingConfig` budget logic into
shared helpers used by Gemini, Gemini CLI, and antigravity translators.

Normalize Claude thinking handling by preferring positive budgets, applying
budget token normalization, and gating by model support.

Always convert Gemini `thinkingBudget` back to OpenAI `reasoning_effort` to
support allowCompat models, and update tests for normalization behavior.
2025-12-15 09:16:14 +08:00
hkfires
e8976f9898 fix(thinking): map budgets to effort for level models 2025-12-15 09:16:14 +08:00
hkfires
8496cc2444 test(thinking): cover openai-compat reasoning passthrough 2025-12-15 09:16:14 +08:00
hkfires
5ef2d59e05 fix(thinking): gate reasoning effort by model support
Only map OpenAI reasoning effort to Claude thinking for models that support
thinking and use budget tokens (not level-based thinking).

Also add "xhigh" effort mapping and adjust minimal/low budgets, with new
raw-payload conversion tests across protocols and models.
2025-12-15 09:16:14 +08:00
Chén Mù
07bb89ae80 Merge pull request #542 from router-for-me/aistudio v6.6.13 2025-12-15 09:13:25 +08:00
hkfires
27a5ad8ec2 Fixed: #534
fix(aistudio): correct JSON string boundary detection for backslash sequences
2025-12-15 09:00:14 +08:00
Luis Pater
707b07c5f5 Merge pull request #537 from sukakcoding/fix/function-response-fallback
fix: handle malformed json in function response parsing
2025-12-15 03:31:09 +08:00
sukakcoding
4a764afd76 refactor: extract parseFunctionResponse helper to reduce duplication 2025-12-15 01:05:36 +08:00
sukakcoding
ecf49d574b fix: handle malformed json in function response parsing 2025-12-15 00:59:46 +08:00
Luis Pater
5a75ef8ffd Merge pull request #536 from AoaoMH/feature/auth-model-check
feat: using Client Model Infos;
v6.6.12
2025-12-15 00:29:33 +08:00
Test
07279f8746 feat: using Client Model Infos; 2025-12-15 00:13:05 +08:00
Luis Pater
71f788b13a fix(registry): remove unused ThinkingSupport from DeepSeek-R1 model 2025-12-14 21:30:17 +08:00
Luis Pater
59c62dc580 fix(registry): correct DeepSeek-V3.2 experimental model ID 2025-12-14 21:27:43 +08:00
Luis Pater
d5310a3300 Merge pull request #531 from AoaoMH/feature/auth-model-check
feat: add API endpoint to query models for auth credentials
v6.6.11
2025-12-14 16:46:43 +08:00
Luis Pater
f0a3eb574e fix(registry): update DeepSeek model definitions with new IDs and descriptions v6.6.10 2025-12-14 16:17:11 +08:00
Test
bb15855443 feat: add API endpoint to query models for auth credentials 2025-12-14 15:16:26 +08:00
Luis Pater
14ce6aebd1 Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing
fix(gemini-cli): enhance 429 retry delay parsing
2025-12-14 14:04:14 +08:00
Luis Pater
2fe83723f2 Merge pull request #515 from teeverc/fix/response-rewriter-streaming-flush
fix(amp): flush response buffer after each streaming chunk write
2025-12-14 13:26:05 +08:00
teeverc
cd8c86c6fb refactor: only flush stream response on successful write 2025-12-13 13:32:54 -08:00
teeverc
52d5fd1a67 fix: streaming for amp cli 2025-12-13 13:17:53 -08:00
Luis Pater
b6ad243e9e Merge pull request #498 from teeverc/fix/claude-streaming-flush
fix(claude): flush Claude SSE chunks immediately
v6.6.9
2025-12-13 23:58:34 +08:00