Commit Graph

76 Commits

Author SHA1 Message Date
Luis Pater
63908869f6 Merge pull request #611 from soilSpoon/feature/antigravity
feat(antigravity): Improve Claude model compatibility
2025-12-21 16:27:29 +08:00
Luis Pater
6bd9a034f7 Merge pull request #602 from ben-vargas/fix-antigravity-propertynames
fix: remove propertyNames from JSON schema for Gemini compatibility
2025-12-20 23:32:51 +08:00
Luis Pater
ed5ec5b55c feat(amp): enhance model mapping and Gemini thinking configuration
This commit introduces several improvements to the AMP (Advanced Model Proxy) module:

- **Model Mapping Logic:** The `FallbackHandler` now uses a more robust approach for model mapping. It includes the extraction and preservation of dynamic "thinking suffixes" (e.g., `(xhigh)`) during mapping, ensuring that these configurations are correctly applied to the mapped model. A new `resolveMappedModel` function centralizes this logic for cleaner code.
- **ModelMapper Verification:** The `ModelMapper` in `model_mapping.go` now verifies that the target model of a mapping has available providers *after* normalizing it. This prevents mappings to non-existent or unresolvable models.
- **Gemini Thinking Configuration Cleanup:** In `gemini_thinking.go`, unnecessary `generationConfig.thinkingConfig.include_thoughts` and `generationConfig.thinkingConfig.thinkingBudget` fields are now deleted from the request body when applying Gemini thinking levels. This prevents potential conflicts or redundant configurations.
- **Testing:** A new test case `TestModelMapper_MapModel_TargetWithThinkingSuffix` has been added to `model_mapping_test.go` to specifically cover the preservation of thinking suffixes during model mapping.
2025-12-20 22:19:35 +08:00
hkfires
2039062845 fix(gemini): add optional skip for gemini3 thinking conversion 2025-12-19 22:07:43 +08:00
hkfires
13aa82f3f3 fix(util): disable default thinking for gemini 3 flash 2025-12-19 13:11:15 +08:00
이대희
3275494fde refactor: Use helper to extract wrapped "thinking" text
Improve robustness when handling "thinking" content by using a dedicated helper to extract the thinking text. This ensures wrapped or nested thinking objects are handled correctly instead of relying on a direct string extraction, reducing parsing errors for complex payloads.
2025-12-19 13:09:57 +09:00
이대희
c1f8211acb fix: Normalize Bash tool args and add signature caching support
Normalize Bash tool arguments by converting a "command" key into "cmd" using JSON-aware parsing, avoiding brittle string replacements that could corrupt values. Apply this conversion in both streaming and non-streaming response paths so bash-style tool calls are emitted with the expected "cmd" field.

Add support for accumulating thinking text and carrying session identifiers to enable signature caching/restore for unsigned thinking blocks, improving handling of thinking-state continuity across requests/responses.

Also perform small cleanups: import logging, tidy comments and test descriptions. These changes make tool-argument handling more robust and enable reliable signature restoration for thinking blocks.
2025-12-19 11:12:16 +09:00
이대희
e44167d7a4 refactor(util/schema): rename and extend Gemini schema cleaning for Antigravity and add empty-schema placeholders 2025-12-19 10:28:17 +09:00
이대희
1bfa75f780 feat(util): add helper to detect Claude thinking models 2025-12-19 10:28:15 +09:00
Ben Vargas
1b8cb7b77b fix: remove propertyNames from JSON schema for Gemini compatibility
Gemini API does not support the JSON Schema `propertyNames` keyword,
causing 400 errors when Claude tool schemas containing this field are
proxied through the Antigravity provider.

Add `propertyNames` to the list of unsupported keywords removed by
CleanJSONSchemaForGemini(), alongside existing removals like $ref,
definitions, and additionalProperties.
2025-12-18 12:50:51 -07:00
Ben Vargas
88798816f2 fix: require dot in gemini25Pattern regex for precise matching 2025-12-17 16:09:50 -07:00
Ben Vargas
598f0af19b fix: apply thinkingLevel from model suffix metadata for Gemini 3
The previous commit added thinkingLevel support but didn't apply it
when the reasoning effort came from model name suffix (e.g., model(minimal)).

This was because ResolveThinkingConfigFromMetadata returns nil for
level-based models, bypassing the metadata application.

Changes:
- Add ApplyGemini3ThinkingLevelFromMetadata for standard Gemini API
- Add ApplyGemini3ThinkingLevelFromMetadataCLI for CLI API format
- Update gemini_cli_executor to apply Gemini 3 thinkingLevel from metadata
- Update antigravity_executor to apply Gemini 3 thinkingLevel from metadata
- Update aistudio_executor to apply Gemini 3 thinkingLevel from metadata
- Add comprehensive test coverage for Gemini 3 thinkingLevel functions
2025-12-17 16:08:38 -07:00
Ben Vargas
a33f5d31fc feat: use thinkingLevel for Gemini 3 models per Google documentation
Per Google's official documentation, Gemini 3 models should use
thinkingLevel (string) instead of thinkingBudget (number) for
optimal performance.

From Google's Gemini Thinking docs:
> Use the thinkingLevel parameter with Gemini 3 models. While
> thinkingBudget is accepted for backwards compatibility, using
> it with Gemini 3 Pro may result in suboptimal performance.

Changes:
- Add model family detection functions (IsGemini3Model, IsGemini25Model,
  IsGemini3ProModel, IsGemini3FlashModel)
- Add ApplyGeminiThinkingLevel and ApplyGeminiCLIThinkingLevel functions
  for applying thinkingLevel config
- Add ValidateGemini3ThinkingLevel for model-specific level validation
- Add ThinkingBudgetToGemini3Level for backward compatibility conversion
- Update NormalizeGeminiThinkingBudget to convert budget to level for
  Gemini 3 models
- Update ApplyDefaultThinkingIfNeeded to not set a default level for
  Gemini 3 (lets API use its dynamic default "high")
- Update ConvertThinkingLevelToBudget to preserve thinkingLevel for
  Gemini 3 models
- Add Levels field to all Gemini 3 model definitions:
  - Gemini 3 Pro: ["low", "high"]
  - Gemini 3 Flash: ["minimal", "low", "medium", "high"]

Backward compatibility:
- Gemini 2.5 models continue to use thinkingBudget as before
- If thinkingBudget is provided for Gemini 3, it's converted to the
  appropriate thinkingLevel
- Existing configurations continue to work
2025-12-17 15:28:20 -07:00
Luis Pater
47885e3710 test(gemini): add test cases and improve compatibility for complex schema cases in CleanJSONSchemaForGemini function 2025-12-17 17:38:53 +08:00
이대희
aea337cfe2 feature: Improves schema flattening and tool use handling
Updates schema flattening logic to handle multiple non-null types, providing a more descriptive "Accepts" hint.

Removes redundant tracking of the current tool name in `Params` as it's no longer needed for streaming limits, simplifying the structure.
2025-12-17 17:30:23 +09:00
이대희
27734a23b1 Update internal/util/translator.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-17 17:15:11 +09:00
이대희
1b8e538a77 feature: Improves Gemini JSON schema compatibility
Enhances compatibility with the Gemini API by implementing a schema cleaning process.

This includes:
- Centralizing schema cleaning logic for Gemini in a dedicated utility function.
- Converting unsupported schema keywords to hints within the description field.
- Flattening complex schema structures like `anyOf`, `oneOf`, and type arrays to simplify the schema.
- Handling streaming responses with empty tool names, which can occur in subsequent chunks after the initial tool use.
2025-12-17 17:10:53 +09:00
hkfires
28a428ae2f fix(thinking): align budget effort mapping across translators
Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.
2025-12-16 18:34:43 +08:00
hkfires
d20b71deb9 fix(thinking): normalize effort mapping
Route OpenAI reasoning effort through ThinkingEffortToBudget for Claude
translators, preserve "minimal" when translating OpenAI Responses, and
treat blank/unknown efforts as no-ops for Gemini thinking configs.

Also map budget -1 to "auto" and expand cross-protocol thinking tests.
2025-12-15 09:16:15 +08:00
hkfires
716aa71f6e fix(thinking): centralize reasoning_effort mapping
Move OpenAI `reasoning_effort` -> Gemini `thinkingConfig` budget logic into
shared helpers used by Gemini, Gemini CLI, and antigravity translators.

Normalize Claude thinking handling by preferring positive budgets, applying
budget token normalization, and gating by model support.

Always convert Gemini `thinkingBudget` back to OpenAI `reasoning_effort` to
support allowCompat models, and update tests for normalization behavior.
2025-12-15 09:16:14 +08:00
Luis Pater
f3f0f1717d Merge branch 'dev' into think 2025-12-12 22:16:44 +08:00
Luis Pater
9f511f0024 fix(executor): improve model compatibility handling for OpenAI-compatibility
Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.
2025-12-12 21:57:25 +08:00
hkfires
374faa2640 fix(thinking): map budgets to effort levels
Ensure thinking settings translate correctly across providers:
- Only apply reasoning_effort to level-based models and derive it from numeric
  budget suffixes when present
- Strip effort string fields for budget-based models and skip Claude/Gemini
  budget resolution for level-based or unsupported models
- Default Gemini include_thoughts when a nonzero budget override is set
- Add cross-protocol conversion and budget range tests
2025-12-12 21:33:20 +08:00
hkfires
e79f65fd8e refactor(thinking): use parentheses for metadata suffix 2025-12-11 18:39:07 +08:00
hkfires
facfe7c518 refactor(thinking): use bracket tags for thinking meta
Align thinking suffix handling on a single bracket-style marker.

NormalizeThinkingModel strips a terminal `[value]` segment from
model identifiers and turns it into either a thinking budget (for
numeric values) or a reasoning effort hint (for strings). Emission
of `ThinkingIncludeThoughtsMetadataKey` is removed.

Executor helpers and the example config are updated so their
comments reference the new `[value]` suffix format instead of the
legacy dash variants.

BREAKING CHANGE: dash-based thinking suffixes (`-thinking`,
`-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed
for thinking metadata; only `[value]` annotations are recognized.
2025-12-11 18:17:28 +08:00
hkfires
6285459c08 fix(runtime): unify claude thinking config resolution 2025-12-11 17:20:44 +08:00
hkfires
007572b58e fix(util): do not strip thinking suffix on registered models
NormalizeThinkingModel now checks ModelSupportsThinking before removing
"-thinking" or "-thinking-<ver>", avoiding accidental parsing of model
names where the suffix is part of the official id (e.g., kimi-k2-thinking,
qwen3-235b-a22b-thinking-2507).

The registry adds ThinkingSupport metadata for several models and
propagates it via ModelInfo (e.g., kimi-k2-thinking, deepseek-r1,
qwen3-235b-a22b-thinking-2507, minimax-m2), enabling accurate detection
of thinking-capable models and correcting base model inference.
2025-12-11 15:52:14 +08:00
hkfires
3a81ab22fd fix(runtime): unify reasoning effort metadata overrides 2025-12-11 14:35:05 +08:00
hkfires
519da2e042 fix(runtime): validate reasoning effort levels 2025-12-11 12:36:54 +08:00
hkfires
169f4295d0 fix(util): align reasoning effort handling with registry 2025-12-11 12:20:12 +08:00
hkfires
d06d0eab2f fix(util): centralize reasoning effort normalization 2025-12-11 12:14:51 +08:00
hkfires
3ffd120ae9 feat(runtime): add thinking config normalization 2025-12-11 11:51:33 +08:00
Luis Pater
423ce97665 feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
2025-12-11 03:10:50 +08:00
hkfires
9b202b6c1c fix(executor): centralize default thinking config 2025-12-09 21:05:06 +08:00
hkfires
5ec9b5e5a9 feat(executor): normalize thinking budget across all Gemini executors 2025-12-09 21:05:06 +08:00
Luis Pater
d6352dd4d4 **feat(util): add DeleteKey function and update antigravity executor for Claude model compatibility** 2025-12-05 01:55:45 +08:00
auroraflux
32d3809f8c **feat(util): add -reasoning suffix support for Gemini models**
Adds support for the `-reasoning` model name suffix which enables
thinking/reasoning mode with dynamic budget. This allows clients to
request reasoning-enabled inference using model names like
`gemini-2.5-flash-reasoning` without explicit configuration.

The suffix is normalized to the base model (e.g., gemini-2.5-flash)
with thinkingBudget=-1 (dynamic) and include_thoughts=true.

Follows the existing pattern established by -nothinking and
-thinking-N suffixes.
2025-11-30 01:18:57 -08:00
Luis Pater
cbcfeb92cc Fixed: #291
**feat(executor): add thinking level to budget conversion utility**

- Introduced `ConvertThinkingLevelToBudget` to map thinking level ("high"/"low") to corresponding budget values.
- Applied the utility in `aistudio_executor.go` before stripping unsupported configs.
- Updated dependencies to include `tidwall/gjson` for JSON parsing.
2025-11-21 00:48:12 +08:00
Ben Vargas
8193392bfe Add AMP fallback proxy and shared Gemini normalization
- add fallback handler that forwards Amp provider requests to ampcode.com when the provider isn’t configured locally
- wrap AMP provider routes with the fallback so requests always have a handler
- share Gemini thinking model normalization helper between core handlers and AMP fallback
2025-11-19 18:23:17 -07:00
TUGOhost
92f4278039 feat: add auto model resolution and model creation timestamp tracking
- Add 'created' field to model registry for tracking model creation time
- Implement GetFirstAvailableModel() to find the first available model by newest creation timestamp
- Add ResolveAutoModel() utility function to resolve "auto" model name to actual available model
- Update request handler to resolve "auto" model before processing requests
- Ensures automatic model selection when "auto" is specified as model name

This enables dynamic model selection based on availability and creation time, improving the user experience when no specific model is requested.
2025-11-11 20:30:09 +08:00
hkfires
cfb9cb8951 feat(config): support HTTP headers across providers 2025-11-08 20:52:05 +08:00
hkfires
7c1c4ee60b feat(gemini): add Gemini API key endpoints 2025-10-31 11:09:28 +08:00
hkfires
7dd93a4a25 fix(executor): only apply thinking config to supported models 2025-10-29 19:19:17 +08:00
hkfires
41577bce07 feat(claude): map Anthropic 'thinking' to Gemini thinkingBudget 2025-10-29 19:19:17 +08:00
hkfires
359b8de44e feat(ws): add WebSocket auth 2025-10-26 07:46:04 +08:00
hkfires
d16599fa1d feat: prefer util.WritablePath() for logs and local storage 2025-10-19 10:19:55 +08:00
hkfires
9f45806106 feat(logging): centralize sensitive header masking 2025-10-18 17:16:00 +08:00
Luis Pater
4477c729a4 Fixed: #129 #123 #102 #97
feat: add all protocols request and response translation for Gemini and Gemini CLI compatibility
2025-10-17 02:11:29 +08:00
Luis Pater
ade279d1f2 Feature: #103
feat(gemini): add Gemini thinking configuration support and metadata normalization

- Introduced logic to parse and apply `thinkingBudget` and `include_thoughts` configurations from metadata.
- Enhanced request handling to include normalized Gemini model metadata, preserving the original model identifier.
- Updated Gemini and Gemini-CLI executors to apply thinking configuration based on metadata overrides.
- Refactored handlers to support metadata extraction and cloning during request preparation.
2025-10-16 11:31:18 +08:00
Luis Pater
20787cd107 feat(registry, executor, util): add support for gemini-2.5-flash-image-preview and improve aspect ratio handling
- Introduced `gemini-2.5-flash-image-preview` model to the registry with updated definitions.
- Enhanced Gemini CLI and API executors to handle image aspect ratio adjustments for the new model.
- Added utility function to create base64 white image placeholders based on aspect ratio configurations.
2025-10-10 01:49:58 +08:00