Add fallback parsing for quota reset delay when RetryInfo is not present:
- Try ErrorInfo.metadata.quotaResetDelay (e.g., "373.801628ms")
- Parse from error.message "Your quota will reset after Xs."
This ensures proper cooldown timing for rate-limited requests.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
Add new REST API endpoints under /v0/management/ampcode for managing
ampcode configuration including upstream URL, API key, localhost
restriction, model mappings, and force model mappings settings.
- Move force-model-mappings from config_basic to config_lists
- Add GET/PUT/PATCH/DELETE endpoints for all ampcode settings
- Support model mapping CRUD with upsert (PATCH) capability
- Add comprehensive test coverage for all ampcode endpoints
Add a configuration option to control whether model mappings take
precedence over local API keys for Amp CLI requests.
- Add PrioritizeModelMappings field to AmpCode config struct
- When false (default): Local API keys take precedence (original behavior)
- When true: Model mappings take precedence over local API keys
- Add management API endpoints GET/PUT /prioritize-model-mappings
This allows users who want mapping priority to enable it explicitly
while preserving backward compatibility.
Config example:
ampcode:
model-mappings:
- from: claude-opus-4-5-20251101
to: gemini-claude-opus-4-5-thinking
prioritize-model-mappings: true
Skip text content blocks that are empty or contain only whitespace
when translating Claude messages to OpenAI format. This fixes GLM-4.6
and other strict OpenAI-compatible providers that reject empty text
with error 'text cannot be empty'.
Simplifies routing logic by delegating all provider/mapping/proxy
decisions to FallbackHandler. Previously, the route checked for
provider/mapping availability before calling the handler, then
FallbackHandler performed the same checks again.
Changes:
- Remove model extraction and provider checking from route (lines 182-201)
- Route now only checks if request is POST with /models/ path
- FallbackHandler handles provider -> mapping -> proxy fallback
- Remove unused internal/util import
Benefits:
- Eliminates duplicate checks (addresses PR review feedback #2)
- Centralizes all provider/mapping logic in FallbackHandler
- Reduces routing code by ~20 lines
- Aligns with how other /api/provider routes work
Performance: No impact (checks still happen once in FallbackHandler)