feat(api): add ManagementTokenRequester interface for management token request endpoints

Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict
Fix: disable thinking when tool_choice forces tool use
2026-02-04 05:20:52 +08:00 · 2025-12-29 02:42:29 +08:00 · 2025-12-28 14:04:30 +08:00 · 2025-12-27 16:31:37 -07:00 · 2025-12-28 04:40:32 +08:00 · 2025-12-28 03:06:51 +08:00
49 changed files with 1886 additions and 280 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -13,8 +13,6 @@ Dockerfile
 docs/*
 README.md
 README_CN.md
-MANAGEMENT_API.md
-MANAGEMENT_API_CN.md
 LICENSE

 # Runtime data folders (should be mounted as volumes)
@@ -32,3 +30,4 @@ bin/*
 .agent/*
 .bmad/*
 _bmad/*
+_bmad-output/*
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -7,6 +7,13 @@ assignees: ''

 ---

+**Is it a request payload issue?**
+[  ] Yes, this is a request payload issue. I am using a client/cURL to send a request payload, but I received an unexpected error.
+[  ] No, it's another issue.
+
+**If it's a request payload issue, you MUST know**
+Our team doesn't have any GODs or ORACLEs or MIND READERs. Please make sure to attach the request log or curl payload.
+
 **Describe the bug**
 A clear and concise description of what the bug is.

--- a/.gitignore
+++ b/.gitignore
@@ -11,11 +11,15 @@ bin/*
 logs/*
 conv/*
 temp/*
+refs/*
+
+# Storage backends
 pgstore/*
 gitstore/*
 objectstore/*
+
+# Static assets
 static/*
-refs/*

 # Authentication data
 auths/*
@@ -35,6 +39,7 @@ GEMINI.md
 .agent/*
 .bmad/*
 _bmad/*
+_bmad-output/*

 # macOS
 .DS_Store
--- a/README.md
+++ b/README.md
@@ -10,11 +10,11 @@ So you can use local or multi-account CLI access with OpenAI(include Responses)/

 ## Sponsor

-[![z.ai](https://assets.router-for.me/english.png)](https://z.ai/subscribe?ic=8JVLJQFSKB)
+[![z.ai](https://assets.router-for.me/english-4.7.png)](https://z.ai/subscribe?ic=8JVLJQFSKB)

 This project is sponsored by Z.ai, supporting us with their GLM CODING PLAN.

-GLM CODING PLAN is a subscription service designed for AI coding, starting at just $3/month. It provides access to their flagship GLM-4.6 model across 10+ popular AI coding tools (Claude Code, Cline, Roo Code, etc.), offering developers top-tier, fast, and stable coding experiences.
+GLM CODING PLAN is a subscription service designed for AI coding, starting at just $3/month. It provides access to their flagship GLM-4.7 model across 10+ popular AI coding tools (Claude Code, Cline, Roo Code, etc.), offering developers top-tier, fast, and stable coding experiences.

 Get 10% OFF GLM CODING PLAN：https://z.ai/subscribe?ic=8JVLJQFSKB

@@ -26,6 +26,10 @@ Get 10% OFF GLM CODING PLAN：https://z.ai/subscribe?ic=8JVLJQFSKB
 <td width="180"><a href="https://www.packyapi.com/register?aff=cliproxyapi"><img src="./assets/packycode.png" alt="PackyCode" width="150"></a></td>
 <td>Thanks to PackyCode for sponsoring this project! PackyCode is a reliable and efficient API relay service provider, offering relay services for Claude Code, Codex, Gemini, and more. PackyCode provides special discounts for our software users: register using <a href="https://www.packyapi.com/register?aff=cliproxyapi">this link</a> and enter the "cliproxyapi" promo code during recharge to get 10% off.</td>
 </tr>
+<tr>
+<td width="180"><a href="https://cubence.com/signup?code=CLIPROXYAPI&source=cpa"><img src="./assets/cubence.png" alt="Cubence" width="150"></a></td>
+<td>Thanks to Cubence for sponsoring this project! Cubence is a reliable and efficient API relay service provider, offering relay services for Claude Code, Codex, Gemini, and more. Cubence provides special discounts for our software users: register using <a href="https://cubence.com/signup?code=CLIPROXYAPI&source=cpa">this link</a> and enter the "CLIPROXYAPI" promo code during recharge to get 10% off.</td>
+</tr>
 </tbody>
 </table>

@@ -110,6 +114,10 @@ CLI wrapper for instant switching between multiple Claude accounts and alternati

 Native macOS GUI for managing CLIProxyAPI: configure providers, model mappings, and endpoints via OAuth - no API keys needed.

+### [Quotio](https://github.com/nguyenphutrong/quotio)
+
+Native macOS menu bar app that unifies Claude, Gemini, OpenAI, Qwen, and Antigravity subscriptions with real-time quota tracking and smart auto-failover for AI coding tools like Claude Code, OpenCode, and Droid - no API keys needed.
+
 > [!NOTE]  
 > If you developed a project based on CLIProxyAPI, please open a PR to add it to this list.

--- a/README_CN.md
+++ b/README_CN.md
@@ -10,11 +10,11 @@

 ## 赞助商

-[![bigmodel.cn](https://assets.router-for.me/chinese.png)](https://www.bigmodel.cn/claude-code?ic=RRVJPB5SII)
+[![bigmodel.cn](https://assets.router-for.me/chinese-4.7.png)](https://www.bigmodel.cn/claude-code?ic=RRVJPB5SII)

 本项目由 Z智谱 提供赞助, 他们通过 GLM CODING PLAN 对本项目提供技术支持。

-GLM CODING PLAN 是专为AI编码打造的订阅套餐，每月最低仅需20元，即可在十余款主流AI编码工具如 Claude Code、Cline、Roo Code 中畅享智谱旗舰模型GLM-4.6，为开发者提供顶尖的编码体验。
+GLM CODING PLAN 是专为AI编码打造的订阅套餐，每月最低仅需20元，即可在十余款主流AI编码工具如 Claude Code、Cline、Roo Code 中畅享智谱旗舰模型GLM-4.7，为开发者提供顶尖的编码体验。

 智谱AI为本软件提供了特别优惠，使用以下链接购买可以享受九折优惠：https://www.bigmodel.cn/claude-code?ic=RRVJPB5SII

@@ -26,9 +26,14 @@ GLM CODING PLAN 是专为AI编码打造的订阅套餐，每月最低仅需20元
 <td width="180"><a href="https://www.packyapi.com/register?aff=cliproxyapi"><img src="./assets/packycode.png" alt="PackyCode" width="150"></a></td>
 <td>感谢 PackyCode 对本项目的赞助！PackyCode 是一家可靠高效的 API 中转服务商，提供 Claude Code、Codex、Gemini 等多种服务的中转。PackyCode 为本软件用户提供了特别优惠：使用<a href="https://www.packyapi.com/register?aff=cliproxyapi">此链接</a>注册，并在充值时输入 "cliproxyapi" 优惠码即可享受九折优惠。</td>
 </tr>
+<tr>
+<td width="180"><a href="https://cubence.com/signup?code=CLIPROXYAPI&source=cpa"><img src="./assets/cubence.png" alt="Cubence" width="150"></a></td>
+<td>感谢 Cubence 对本项目的赞助！Cubence 是一家可靠高效的 API 中转服务商，提供 Claude Code、Codex、Gemini 等多种服务的中转。Cubence 为本软件用户提供了特别优惠：使用<a href="https://cubence.com/signup?code=CLIPROXYAPI&source=cpa">此链接</a>注册，并在充值时输入 "CLIPROXYAPI" 优惠码即可享受九折优惠。</td>
+</tr>
 </tbody>
 </table>

+
 ## 功能特性

 - 为 CLI 模型提供 OpenAI/Gemini/Claude/Codex 兼容的 API 端点
@@ -108,6 +113,10 @@ CLI 封装器，用于通过 CLIProxyAPI OAuth 即时切换多个 Claude 账户

 基于 macOS 平台的原生 CLIProxyAPI GUI：配置供应商、模型映射以及OAuth端点，无需 API 密钥。

+### [Quotio](https://github.com/nguyenphutrong/quotio)
+
+原生 macOS 菜单栏应用，统一管理 Claude、Gemini、OpenAI、Qwen 和 Antigravity 订阅，提供实时配额追踪和智能自动故障转移，支持 Claude Code、OpenCode 和 Droid 等 AI 编程工具，无需 API 密钥。
+
 > [!NOTE]  
 > 如果你开发了基于 CLIProxyAPI 的项目，请提交一个 PR（拉取请求）将其添加到此列表中。

--- a/assets/cubence.png
+++ b/assets/cubence.png
--- a/config.example.yaml
+++ b/config.example.yaml
@@ -39,6 +39,9 @@ api-keys:
 # Enable debug logging
 debug: false

+# When true, disable high-overhead HTTP middleware features to reduce per-request memory usage under high concurrency.
+commercial-mode: false
+
 # When true, write application logs to rotating files instead of stdout
 logging-to-file: false

@@ -101,6 +104,9 @@ ws-auth: false
 #     headers:
 #       X-Custom-Header: "custom-value"
 #     proxy-url: "socks5://proxy.example.com:1080" # optional: per-key proxy override
+#     models:
+#       - name: "gpt-5-codex" # upstream model name
+#         alias: "codex-latest" # client alias mapped to the upstream model
 #     excluded-models:
 #       - "gpt-5.1"         # exclude specific models (exact match)
 #       - "gpt-5-*"         # wildcard matching prefix (e.g. gpt-5-medium, gpt-5-codex)
--- a/internal/api/handlers/management/api_tools.go
+++ b/internal/api/handlers/management/api_tools.go
@@ -0,0 +1,538 @@
+package management
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"io"
+	"net"
+	"net/http"
+	"net/url"
+	"strings"
+	"time"
+
+	"github.com/gin-gonic/gin"
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/geminicli"
+	coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
+	log "github.com/sirupsen/logrus"
+	"golang.org/x/net/proxy"
+	"golang.org/x/oauth2"
+	"golang.org/x/oauth2/google"
+)
+
+const defaultAPICallTimeout = 60 * time.Second
+
+const (
+	geminiOAuthClientID     = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
+	geminiOAuthClientSecret = "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"
+)
+
+var geminiOAuthScopes = []string{
+	"https://www.googleapis.com/auth/cloud-platform",
+	"https://www.googleapis.com/auth/userinfo.email",
+	"https://www.googleapis.com/auth/userinfo.profile",
+}
+
+type apiCallRequest struct {
+	AuthIndexSnake  *string           `json:"auth_index"`
+	AuthIndexCamel  *string           `json:"authIndex"`
+	AuthIndexPascal *string           `json:"AuthIndex"`
+	Method          string            `json:"method"`
+	URL             string            `json:"url"`
+	Header          map[string]string `json:"header"`
+	Data            string            `json:"data"`
+}
+
+type apiCallResponse struct {
+	StatusCode int                 `json:"status_code"`
+	Header     map[string][]string `json:"header"`
+	Body       string              `json:"body"`
+}
+
+// APICall makes a generic HTTP request on behalf of the management API caller.
+// It is protected by the management middleware.
+//
+// Endpoint:
+//
+//	POST /v0/management/api-call
+//
+// Authentication:
+//
+//	Same as other management APIs (requires a management key and remote-management rules).
+//	You can provide the key via:
+//	- Authorization: Bearer <key>
+//	- X-Management-Key: <key>
+//
+// Request JSON:
+//   - auth_index / authIndex / AuthIndex (optional):
+//     The credential "auth_index" from GET /v0/management/auth-files (or other endpoints returning it).
+//     If omitted or not found, credential-specific proxy/token substitution is skipped.
+//   - method (required): HTTP method, e.g. GET, POST, PUT, PATCH, DELETE.
+//   - url (required): Absolute URL including scheme and host, e.g. "https://api.example.com/v1/ping".
+//   - header (optional): Request headers map.
+//     Supports magic variable "$TOKEN$" which is replaced using the selected credential:
+//     1) metadata.accessToken / metadata.access_token
+//     2) metadata.token / metadata.id_token / metadata.cookie
+//     3) attributes.api_key
+//     Example: {"Authorization":"Bearer $TOKEN$"}.
+//     Note: if you need to override the HTTP Host header, set header["Host"].
+//   - data (optional): Raw request body as string (useful for POST/PUT/PATCH).
+//
+// Proxy selection (highest priority first):
+//  1. Selected credential proxy_url
+//  2. Global config proxy-url
+//  3. Direct connect (environment proxies are not used)
+//
+// Response JSON (returned with HTTP 200 when the APICall itself succeeds):
+//   - status_code: Upstream HTTP status code.
+//   - header: Upstream response headers.
+//   - body: Upstream response body as string.
+//
+// Example:
+//
+//	curl -sS -X POST "http://127.0.0.1:8317/v0/management/api-call" \
+//	  -H "Authorization: Bearer <MANAGEMENT_KEY>" \
+//	  -H "Content-Type: application/json" \
+//	  -d '{"auth_index":"<AUTH_INDEX>","method":"GET","url":"https://api.example.com/v1/ping","header":{"Authorization":"Bearer $TOKEN$"}}'
+//
+//	curl -sS -X POST "http://127.0.0.1:8317/v0/management/api-call" \
+//	  -H "Authorization: Bearer <MANAGEMENT_KEY>" \
+//	  -H "Content-Type: application/json" \
+//	  -d '{"auth_index":"<AUTH_INDEX>","method":"POST","url":"https://api.example.com/v1/fetchAvailableModels","header":{"Authorization":"Bearer $TOKEN$","Content-Type":"application/json","User-Agent":"cliproxyapi"},"data":"{}"}'
+func (h *Handler) APICall(c *gin.Context) {
+	var body apiCallRequest
+	if errBindJSON := c.ShouldBindJSON(&body); errBindJSON != nil {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid body"})
+		return
+	}
+
+	method := strings.ToUpper(strings.TrimSpace(body.Method))
+	if method == "" {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "missing method"})
+		return
+	}
+
+	urlStr := strings.TrimSpace(body.URL)
+	if urlStr == "" {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "missing url"})
+		return
+	}
+	parsedURL, errParseURL := url.Parse(urlStr)
+	if errParseURL != nil || parsedURL.Scheme == "" || parsedURL.Host == "" {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid url"})
+		return
+	}
+
+	authIndex := firstNonEmptyString(body.AuthIndexSnake, body.AuthIndexCamel, body.AuthIndexPascal)
+	auth := h.authByIndex(authIndex)
+
+	reqHeaders := body.Header
+	if reqHeaders == nil {
+		reqHeaders = map[string]string{}
+	}
+
+	var hostOverride string
+	var token string
+	var tokenResolved bool
+	var tokenErr error
+	for key, value := range reqHeaders {
+		if !strings.Contains(value, "$TOKEN$") {
+			continue
+		}
+		if !tokenResolved {
+			token, tokenErr = h.resolveTokenForAuth(c.Request.Context(), auth)
+			tokenResolved = true
+		}
+		if auth != nil && token == "" {
+			if tokenErr != nil {
+				c.JSON(http.StatusBadRequest, gin.H{"error": "auth token refresh failed"})
+				return
+			}
+			c.JSON(http.StatusBadRequest, gin.H{"error": "auth token not found"})
+			return
+		}
+		if token == "" {
+			continue
+		}
+		reqHeaders[key] = strings.ReplaceAll(value, "$TOKEN$", token)
+	}
+
+	var requestBody io.Reader
+	if body.Data != "" {
+		requestBody = strings.NewReader(body.Data)
+	}
+
+	req, errNewRequest := http.NewRequestWithContext(c.Request.Context(), method, urlStr, requestBody)
+	if errNewRequest != nil {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "failed to build request"})
+		return
+	}
+
+	for key, value := range reqHeaders {
+		if strings.EqualFold(key, "host") {
+			hostOverride = strings.TrimSpace(value)
+			continue
+		}
+		req.Header.Set(key, value)
+	}
+	if hostOverride != "" {
+		req.Host = hostOverride
+	}
+
+	httpClient := &http.Client{
+		Timeout: defaultAPICallTimeout,
+	}
+	httpClient.Transport = h.apiCallTransport(auth)
+
+	resp, errDo := httpClient.Do(req)
+	if errDo != nil {
+		log.WithError(errDo).Debug("management APICall request failed")
+		c.JSON(http.StatusBadGateway, gin.H{"error": "request failed"})
+		return
+	}
+	defer func() {
+		if errClose := resp.Body.Close(); errClose != nil {
+			log.Errorf("response body close error: %v", errClose)
+		}
+	}()
+
+	respBody, errReadAll := io.ReadAll(resp.Body)
+	if errReadAll != nil {
+		c.JSON(http.StatusBadGateway, gin.H{"error": "failed to read response"})
+		return
+	}
+
+	c.JSON(http.StatusOK, apiCallResponse{
+		StatusCode: resp.StatusCode,
+		Header:     resp.Header,
+		Body:       string(respBody),
+	})
+}
+
+func firstNonEmptyString(values ...*string) string {
+	for _, v := range values {
+		if v == nil {
+			continue
+		}
+		if out := strings.TrimSpace(*v); out != "" {
+			return out
+		}
+	}
+	return ""
+}
+
+func tokenValueForAuth(auth *coreauth.Auth) string {
+	if auth == nil {
+		return ""
+	}
+	if v := tokenValueFromMetadata(auth.Metadata); v != "" {
+		return v
+	}
+	if auth.Attributes != nil {
+		if v := strings.TrimSpace(auth.Attributes["api_key"]); v != "" {
+			return v
+		}
+	}
+	if shared := geminicli.ResolveSharedCredential(auth.Runtime); shared != nil {
+		if v := tokenValueFromMetadata(shared.MetadataSnapshot()); v != "" {
+			return v
+		}
+	}
+	return ""
+}
+
+func (h *Handler) resolveTokenForAuth(ctx context.Context, auth *coreauth.Auth) (string, error) {
+	if auth == nil {
+		return "", nil
+	}
+
+	provider := strings.ToLower(strings.TrimSpace(auth.Provider))
+	if provider == "gemini-cli" {
+		token, errToken := h.refreshGeminiOAuthAccessToken(ctx, auth)
+		return token, errToken
+	}
+
+	return tokenValueForAuth(auth), nil
+}
+
+func (h *Handler) refreshGeminiOAuthAccessToken(ctx context.Context, auth *coreauth.Auth) (string, error) {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+	if auth == nil {
+		return "", nil
+	}
+
+	metadata, updater := geminiOAuthMetadata(auth)
+	if len(metadata) == 0 {
+		return "", fmt.Errorf("gemini oauth metadata missing")
+	}
+
+	base := make(map[string]any)
+	if tokenRaw, ok := metadata["token"].(map[string]any); ok && tokenRaw != nil {
+		base = cloneMap(tokenRaw)
+	}
+
+	var token oauth2.Token
+	if len(base) > 0 {
+		if raw, errMarshal := json.Marshal(base); errMarshal == nil {
+			_ = json.Unmarshal(raw, &token)
+		}
+	}
+
+	if token.AccessToken == "" {
+		token.AccessToken = stringValue(metadata, "access_token")
+	}
+	if token.RefreshToken == "" {
+		token.RefreshToken = stringValue(metadata, "refresh_token")
+	}
+	if token.TokenType == "" {
+		token.TokenType = stringValue(metadata, "token_type")
+	}
+	if token.Expiry.IsZero() {
+		if expiry := stringValue(metadata, "expiry"); expiry != "" {
+			if ts, errParseTime := time.Parse(time.RFC3339, expiry); errParseTime == nil {
+				token.Expiry = ts
+			}
+		}
+	}
+
+	conf := &oauth2.Config{
+		ClientID:     geminiOAuthClientID,
+		ClientSecret: geminiOAuthClientSecret,
+		Scopes:       geminiOAuthScopes,
+		Endpoint:     google.Endpoint,
+	}
+
+	ctxToken := ctx
+	httpClient := &http.Client{
+		Timeout:   defaultAPICallTimeout,
+		Transport: h.apiCallTransport(auth),
+	}
+	ctxToken = context.WithValue(ctxToken, oauth2.HTTPClient, httpClient)
+
+	src := conf.TokenSource(ctxToken, &token)
+	currentToken, errToken := src.Token()
+	if errToken != nil {
+		return "", errToken
+	}
+
+	merged := buildOAuthTokenMap(base, currentToken)
+	fields := buildOAuthTokenFields(currentToken, merged)
+	if updater != nil {
+		updater(fields)
+	}
+	return strings.TrimSpace(currentToken.AccessToken), nil
+}
+
+func geminiOAuthMetadata(auth *coreauth.Auth) (map[string]any, func(map[string]any)) {
+	if auth == nil {
+		return nil, nil
+	}
+	if shared := geminicli.ResolveSharedCredential(auth.Runtime); shared != nil {
+		snapshot := shared.MetadataSnapshot()
+		return snapshot, func(fields map[string]any) { shared.MergeMetadata(fields) }
+	}
+	return auth.Metadata, func(fields map[string]any) {
+		if auth.Metadata == nil {
+			auth.Metadata = make(map[string]any)
+		}
+		for k, v := range fields {
+			auth.Metadata[k] = v
+		}
+	}
+}
+
+func stringValue(metadata map[string]any, key string) string {
+	if len(metadata) == 0 || key == "" {
+		return ""
+	}
+	if v, ok := metadata[key].(string); ok {
+		return strings.TrimSpace(v)
+	}
+	return ""
+}
+
+func cloneMap(in map[string]any) map[string]any {
+	if len(in) == 0 {
+		return nil
+	}
+	out := make(map[string]any, len(in))
+	for k, v := range in {
+		out[k] = v
+	}
+	return out
+}
+
+func buildOAuthTokenMap(base map[string]any, tok *oauth2.Token) map[string]any {
+	merged := cloneMap(base)
+	if merged == nil {
+		merged = make(map[string]any)
+	}
+	if tok == nil {
+		return merged
+	}
+	if raw, errMarshal := json.Marshal(tok); errMarshal == nil {
+		var tokenMap map[string]any
+		if errUnmarshal := json.Unmarshal(raw, &tokenMap); errUnmarshal == nil {
+			for k, v := range tokenMap {
+				merged[k] = v
+			}
+		}
+	}
+	return merged
+}
+
+func buildOAuthTokenFields(tok *oauth2.Token, merged map[string]any) map[string]any {
+	fields := make(map[string]any, 5)
+	if tok != nil && tok.AccessToken != "" {
+		fields["access_token"] = tok.AccessToken
+	}
+	if tok != nil && tok.TokenType != "" {
+		fields["token_type"] = tok.TokenType
+	}
+	if tok != nil && tok.RefreshToken != "" {
+		fields["refresh_token"] = tok.RefreshToken
+	}
+	if tok != nil && !tok.Expiry.IsZero() {
+		fields["expiry"] = tok.Expiry.Format(time.RFC3339)
+	}
+	if len(merged) > 0 {
+		fields["token"] = cloneMap(merged)
+	}
+	return fields
+}
+
+func tokenValueFromMetadata(metadata map[string]any) string {
+	if len(metadata) == 0 {
+		return ""
+	}
+	if v, ok := metadata["accessToken"].(string); ok && strings.TrimSpace(v) != "" {
+		return strings.TrimSpace(v)
+	}
+	if v, ok := metadata["access_token"].(string); ok && strings.TrimSpace(v) != "" {
+		return strings.TrimSpace(v)
+	}
+	if tokenRaw, ok := metadata["token"]; ok && tokenRaw != nil {
+		switch typed := tokenRaw.(type) {
+		case string:
+			if v := strings.TrimSpace(typed); v != "" {
+				return v
+			}
+		case map[string]any:
+			if v, ok := typed["access_token"].(string); ok && strings.TrimSpace(v) != "" {
+				return strings.TrimSpace(v)
+			}
+			if v, ok := typed["accessToken"].(string); ok && strings.TrimSpace(v) != "" {
+				return strings.TrimSpace(v)
+			}
+		case map[string]string:
+			if v := strings.TrimSpace(typed["access_token"]); v != "" {
+				return v
+			}
+			if v := strings.TrimSpace(typed["accessToken"]); v != "" {
+				return v
+			}
+		}
+	}
+	if v, ok := metadata["token"].(string); ok && strings.TrimSpace(v) != "" {
+		return strings.TrimSpace(v)
+	}
+	if v, ok := metadata["id_token"].(string); ok && strings.TrimSpace(v) != "" {
+		return strings.TrimSpace(v)
+	}
+	if v, ok := metadata["cookie"].(string); ok && strings.TrimSpace(v) != "" {
+		return strings.TrimSpace(v)
+	}
+	return ""
+}
+
+func (h *Handler) authByIndex(authIndex string) *coreauth.Auth {
+	authIndex = strings.TrimSpace(authIndex)
+	if authIndex == "" || h == nil || h.authManager == nil {
+		return nil
+	}
+	auths := h.authManager.List()
+	for _, auth := range auths {
+		if auth == nil {
+			continue
+		}
+		auth.EnsureIndex()
+		if auth.Index == authIndex {
+			return auth
+		}
+	}
+	return nil
+}
+
+func (h *Handler) apiCallTransport(auth *coreauth.Auth) http.RoundTripper {
+	var proxyCandidates []string
+	if auth != nil {
+		if proxyStr := strings.TrimSpace(auth.ProxyURL); proxyStr != "" {
+			proxyCandidates = append(proxyCandidates, proxyStr)
+		}
+	}
+	if h != nil && h.cfg != nil {
+		if proxyStr := strings.TrimSpace(h.cfg.ProxyURL); proxyStr != "" {
+			proxyCandidates = append(proxyCandidates, proxyStr)
+		}
+	}
+
+	for _, proxyStr := range proxyCandidates {
+		if transport := buildProxyTransport(proxyStr); transport != nil {
+			return transport
+		}
+	}
+
+	transport, ok := http.DefaultTransport.(*http.Transport)
+	if !ok || transport == nil {
+		return &http.Transport{Proxy: nil}
+	}
+	clone := transport.Clone()
+	clone.Proxy = nil
+	return clone
+}
+
+func buildProxyTransport(proxyStr string) *http.Transport {
+	proxyStr = strings.TrimSpace(proxyStr)
+	if proxyStr == "" {
+		return nil
+	}
+
+	proxyURL, errParse := url.Parse(proxyStr)
+	if errParse != nil {
+		log.WithError(errParse).Debug("parse proxy URL failed")
+		return nil
+	}
+	if proxyURL.Scheme == "" || proxyURL.Host == "" {
+		log.Debug("proxy URL missing scheme/host")
+		return nil
+	}
+
+	if proxyURL.Scheme == "socks5" {
+		var proxyAuth *proxy.Auth
+		if proxyURL.User != nil {
+			username := proxyURL.User.Username()
+			password, _ := proxyURL.User.Password()
+			proxyAuth = &proxy.Auth{User: username, Password: password}
+		}
+		dialer, errSOCKS5 := proxy.SOCKS5("tcp", proxyURL.Host, proxyAuth, proxy.Direct)
+		if errSOCKS5 != nil {
+			log.WithError(errSOCKS5).Debug("create SOCKS5 dialer failed")
+			return nil
+		}
+		return &http.Transport{
+			Proxy: nil,
+			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
+				return dialer.Dial(network, addr)
+			},
+		}
+	}
+
+	if proxyURL.Scheme == "http" || proxyURL.Scheme == "https" {
+		return &http.Transport{Proxy: http.ProxyURL(proxyURL)}
+	}
+
+	log.Debugf("unsupported proxy scheme: %s", proxyURL.Scheme)
+	return nil
+}
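Reviewer note: the request and response shapes documented on `APICall` can be exercised from a client with a sketch like the one below. It is only an illustration under stated assumptions: the type and function names (`apiCallRequest`, `buildAPICallBody`, `decodeAPICallResponse`) are local to this example and are not part of the patch, and the example only builds and parses JSON rather than contacting a server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiCallRequest mirrors the request fields documented on APICall.
// Only the snake_case "auth_index" spelling is used here.
type apiCallRequest struct {
	AuthIndex string            `json:"auth_index,omitempty"`
	Method    string            `json:"method"`
	URL       string            `json:"url"`
	Header    map[string]string `json:"header,omitempty"`
	Data      string            `json:"data,omitempty"`
}

// apiCallResponse mirrors the envelope the handler returns with HTTP 200.
type apiCallResponse struct {
	StatusCode int                 `json:"status_code"`
	Header     map[string][]string `json:"header"`
	Body       string              `json:"body"`
}

// buildAPICallBody (hypothetical helper) encodes the JSON body for
// POST /v0/management/api-call.
func buildAPICallBody(authIndex, method, url string, header map[string]string, data string) ([]byte, error) {
	return json.Marshal(apiCallRequest{
		AuthIndex: authIndex,
		Method:    method,
		URL:       url,
		Header:    header,
		Data:      data,
	})
}

// decodeAPICallResponse (hypothetical helper) parses the envelope. The
// upstream status lives in StatusCode, not in the HTTP status of the
// management call itself.
func decodeAPICallResponse(raw []byte) (apiCallResponse, error) {
	var out apiCallResponse
	err := json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	body, err := buildAPICallBody("example-index", "GET", "https://api.example.com/v1/ping",
		map[string]string{"Authorization": "Bearer $TOKEN$"}, "")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// A made-up envelope, as the handler might return for an upstream 404.
	resp, err := decodeAPICallResponse([]byte(`{"status_code":404,"header":{},"body":"not found"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.StatusCode, resp.Body)
}
```

Because the handler answers 200 whenever the outbound call completes, a client should branch on `status_code` inside the envelope rather than on the HTTP status of the management request.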
--- a/internal/api/handlers/management/config_lists.go
+++ b/internal/api/handlers/management/config_lists.go
@@ -597,11 +597,7 @@ func (h *Handler) PutCodexKeys(c *gin.Context) {
 	filtered := make([]config.CodexKey, 0, len(arr))
 	for i := range arr {
 		entry := arr[i]
-		entry.APIKey = strings.TrimSpace(entry.APIKey)
-		entry.BaseURL = strings.TrimSpace(entry.BaseURL)
-		entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
-		entry.Headers = config.NormalizeHeaders(entry.Headers)
-		entry.ExcludedModels = config.NormalizeExcludedModels(entry.ExcludedModels)
+		normalizeCodexKey(&entry)
 		if entry.BaseURL == "" {
 			continue
 		}
@@ -613,12 +609,13 @@ func (h *Handler) PutCodexKeys(c *gin.Context) {
 }
 func (h *Handler) PatchCodexKey(c *gin.Context) {
 	type codexKeyPatch struct {
-		APIKey         *string            `json:"api-key"`
-		Prefix         *string            `json:"prefix"`
-		BaseURL        *string            `json:"base-url"`
-		ProxyURL       *string            `json:"proxy-url"`
-		Headers        *map[string]string `json:"headers"`
-		ExcludedModels *[]string          `json:"excluded-models"`
+		APIKey         *string              `json:"api-key"`
+		Prefix         *string              `json:"prefix"`
+		BaseURL        *string              `json:"base-url"`
+		ProxyURL       *string              `json:"proxy-url"`
+		Models         *[]config.CodexModel `json:"models"`
+		Headers        *map[string]string   `json:"headers"`
+		ExcludedModels *[]string            `json:"excluded-models"`
 	}
 	var body struct {
 		Index *int           `json:"index"`
@@ -667,12 +664,16 @@ func (h *Handler) PatchCodexKey(c *gin.Context) {
 	if body.Value.ProxyURL != nil {
 		entry.ProxyURL = strings.TrimSpace(*body.Value.ProxyURL)
 	}
+	if body.Value.Models != nil {
+		entry.Models = append([]config.CodexModel(nil), (*body.Value.Models)...)
+	}
 	if body.Value.Headers != nil {
 		entry.Headers = config.NormalizeHeaders(*body.Value.Headers)
 	}
 	if body.Value.ExcludedModels != nil {
 		entry.ExcludedModels = config.NormalizeExcludedModels(*body.Value.ExcludedModels)
 	}
+	normalizeCodexKey(&entry)
 	h.cfg.CodexKey[targetIndex] = entry
 	h.cfg.SanitizeCodexKeys()
 	h.persist(c)
@@ -762,6 +763,32 @@ func normalizeClaudeKey(entry *config.ClaudeKey) {
 	entry.Models = normalized
 }

+func normalizeCodexKey(entry *config.CodexKey) {
+	if entry == nil {
+		return
+	}
+	entry.APIKey = strings.TrimSpace(entry.APIKey)
+	entry.Prefix = strings.TrimSpace(entry.Prefix)
+	entry.BaseURL = strings.TrimSpace(entry.BaseURL)
+	entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
+	entry.Headers = config.NormalizeHeaders(entry.Headers)
+	entry.ExcludedModels = config.NormalizeExcludedModels(entry.ExcludedModels)
+	if len(entry.Models) == 0 {
+		return
+	}
+	normalized := make([]config.CodexModel, 0, len(entry.Models))
+	for i := range entry.Models {
+		model := entry.Models[i]
+		model.Name = strings.TrimSpace(model.Name)
+		model.Alias = strings.TrimSpace(model.Alias)
+		if model.Name == "" && model.Alias == "" {
+			continue
+		}
+		normalized = append(normalized, model)
+	}
+	entry.Models = normalized
+}
+
 // GetAmpCode returns the complete ampcode configuration.
 func (h *Handler) GetAmpCode(c *gin.Context) {
 	if h == nil || h.cfg == nil {
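Reviewer note: the model clean-up performed by `normalizeCodexKey` can be read in isolation as the sketch below. `codexModel` and `normalizeModels` are hypothetical stand-ins written for this note (the real type is `config.CodexModel`, whose full definition is not shown in this diff), so treat it as an approximation of the loop, not the patch itself.

```go
package main

import (
	"fmt"
	"strings"
)

// codexModel is a hypothetical stand-in for config.CodexModel with only the
// two fields the normalization loop touches.
type codexModel struct {
	Name  string // upstream model name
	Alias string // client-facing alias
}

// normalizeModels trims both fields and drops entries where both are empty.
// An entry with only a name or only an alias is kept, as in the patch.
// Unlike the patch, this sketch returns an empty slice for empty input
// instead of returning early.
func normalizeModels(models []codexModel) []codexModel {
	normalized := make([]codexModel, 0, len(models))
	for _, m := range models {
		m.Name = strings.TrimSpace(m.Name)
		m.Alias = strings.TrimSpace(m.Alias)
		if m.Name == "" && m.Alias == "" {
			continue
		}
		normalized = append(normalized, m)
	}
	return normalized
}

func main() {
	out := normalizeModels([]codexModel{
		{Name: " gpt-5-codex ", Alias: " codex-latest "},
		{Name: "   ", Alias: ""},
		{Name: "gpt-5", Alias: ""},
	})
	fmt.Printf("%+v\n", out)
}
```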
--- a/internal/api/handlers/management/handler.go
+++ b/internal/api/handlers/management/handler.go
@@ -59,6 +59,11 @@ func NewHandler(cfg *config.Config, configFilePath string, manager *coreauth.Man
 	}
 }

+// NewHandlerWithoutConfigFilePath creates a new management handler instance without a config file path.
+func NewHandlerWithoutConfigFilePath(cfg *config.Config, manager *coreauth.Manager) *Handler {
+	return NewHandler(cfg, "", manager)
+}
+
 // SetConfig updates the in-memory config reference when the server hot-reloads.
 func (h *Handler) SetConfig(cfg *config.Config) { h.cfg = cfg }

--- a/internal/api/handlers/management/logs.go
+++ b/internal/api/handlers/management/logs.go
@@ -209,6 +209,94 @@ func (h *Handler) GetRequestErrorLogs(c *gin.Context) {
 	c.JSON(http.StatusOK, gin.H{"files": files})
 }

+// GetRequestLogByID finds and downloads a request log file by its request ID.
+// The ID is matched against the suffix of log file names (format: *-{requestID}.log).
+func (h *Handler) GetRequestLogByID(c *gin.Context) {
+	if h == nil {
+		c.JSON(http.StatusInternalServerError, gin.H{"error": "handler unavailable"})
+		return
+	}
+	if h.cfg == nil {
+		c.JSON(http.StatusServiceUnavailable, gin.H{"error": "configuration unavailable"})
+		return
+	}
+
+	dir := h.logDirectory()
+	if strings.TrimSpace(dir) == "" {
+		c.JSON(http.StatusInternalServerError, gin.H{"error": "log directory not configured"})
+		return
+	}
+
+	requestID := strings.TrimSpace(c.Param("id"))
+	if requestID == "" {
+		requestID = strings.TrimSpace(c.Query("id"))
+	}
+	if requestID == "" {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "missing request ID"})
+		return
+	}
+	if strings.ContainsAny(requestID, "/\\") {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request ID"})
+		return
+	}
+
+	entries, err := os.ReadDir(dir)
+	if err != nil {
+		if os.IsNotExist(err) {
+			c.JSON(http.StatusNotFound, gin.H{"error": "log directory not found"})
+			return
+		}
+		c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("failed to list log directory: %v", err)})
+		return
+	}
+
+	suffix := "-" + requestID + ".log"
+	var matchedFile string
+	for _, entry := range entries {
+		if entry.IsDir() {
+			continue
+		}
+		name := entry.Name()
+		if strings.HasSuffix(name, suffix) {
+			matchedFile = name
+			break
+		}
+	}
+
+	if matchedFile == "" {
+		c.JSON(http.StatusNotFound, gin.H{"error": "log file not found for the given request ID"})
+		return
+	}
+
+	dirAbs, errAbs := filepath.Abs(dir)
+	if errAbs != nil {
+		c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("failed to resolve log directory: %v", errAbs)})
+		return
+	}
+	fullPath := filepath.Clean(filepath.Join(dirAbs, matchedFile))
+	prefix := dirAbs + string(os.PathSeparator)
+	if !strings.HasPrefix(fullPath, prefix) {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid log file path"})
+		return
+	}
+
+	info, errStat := os.Stat(fullPath)
+	if errStat != nil {
+		if os.IsNotExist(errStat) {
+			c.JSON(http.StatusNotFound, gin.H{"error": "log file not found"})
+			return
+		}
+		c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("failed to read log file: %v", errStat)})
+		return
+	}
+	if info.IsDir() {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid log file"})
+		return
+	}
+
+	c.FileAttachment(fullPath, matchedFile)
+}
+
 // DownloadRequestErrorLog downloads a specific error request log file by name.
 func (h *Handler) DownloadRequestErrorLog(c *gin.Context) {
 	if h == nil {
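Reviewer note: the two safety checks in `GetRequestLogByID` (suffix matching on `-{requestID}.log` and the directory containment test) can be sketched as pure functions. `matchLogName` and `insideDir` are hypothetical names used only in this note, and the file names in `main` are made up; the sketch assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// matchLogName returns the first name ending in "-<requestID>.log".
// IDs containing a path separator are rejected before any matching.
func matchLogName(names []string, requestID string) (string, bool) {
	requestID = strings.TrimSpace(requestID)
	if requestID == "" || strings.ContainsAny(requestID, "/\\") {
		return "", false
	}
	suffix := "-" + requestID + ".log"
	for _, name := range names {
		if strings.HasSuffix(name, suffix) {
			return name, true
		}
	}
	return "", false
}

// insideDir reports whether dirAbs/name, once cleaned, still lies under
// dirAbs. dirAbs is assumed to be an absolute, already-cleaned path.
func insideDir(dirAbs, name string) bool {
	fullPath := filepath.Clean(filepath.Join(dirAbs, name))
	return strings.HasPrefix(fullPath, dirAbs+string(filepath.Separator))
}

func main() {
	names := []string{"main.log", "v1-chat-completions-abc123.log"}
	name, ok := matchLogName(names, "abc123")
	fmt.Println(name, ok)
	fmt.Println(insideDir("/var/log/app", name))
	fmt.Println(insideDir("/var/log/app", "../secret.log"))
}
```

The containment check is defensive: since names come from `os.ReadDir` and the ID may not contain separators, it should not normally fail, but it keeps the download path safe if either assumption changes.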
--- a/internal/api/handlers/management/usage.go
+++ b/internal/api/handlers/management/usage.go
@@ -1,12 +1,25 @@
 package management

 import (
+	"encoding/json"
 	"net/http"
+	"time"

 	"github.com/gin-gonic/gin"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/usage"
 )

+type usageExportPayload struct {
+	Version    int                      `json:"version"`
+	ExportedAt time.Time                `json:"exported_at"`
+	Usage      usage.StatisticsSnapshot `json:"usage"`
+}
+
+type usageImportPayload struct {
+	Version int                      `json:"version"`
+	Usage   usage.StatisticsSnapshot `json:"usage"`
+}
+
 // GetUsageStatistics returns the in-memory request statistics snapshot.
 func (h *Handler) GetUsageStatistics(c *gin.Context) {
 	var snapshot usage.StatisticsSnapshot
@@ -18,3 +31,49 @@ func (h *Handler) GetUsageStatistics(c *gin.Context) {
 		"failed_requests": snapshot.FailureCount,
 	})
 }
+
+// ExportUsageStatistics returns a complete usage snapshot for backup/migration.
+func (h *Handler) ExportUsageStatistics(c *gin.Context) {
+	var snapshot usage.StatisticsSnapshot
+	if h != nil && h.usageStats != nil {
+		snapshot = h.usageStats.Snapshot()
+	}
+	c.JSON(http.StatusOK, usageExportPayload{
+		Version:    1,
+		ExportedAt: time.Now().UTC(),
+		Usage:      snapshot,
+	})
+}
+
+// ImportUsageStatistics merges a previously exported usage snapshot into memory.
+func (h *Handler) ImportUsageStatistics(c *gin.Context) {
+	if h == nil || h.usageStats == nil {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "usage statistics unavailable"})
+		return
+	}
+
+	data, err := c.GetRawData()
+	if err != nil {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "failed to read request body"})
+		return
+	}
+
+	var payload usageImportPayload
+	if err := json.Unmarshal(data, &payload); err != nil {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid json"})
+		return
+	}
+	if payload.Version != 0 && payload.Version != 1 {
+		c.JSON(http.StatusBadRequest, gin.H{"error": "unsupported version"})
+		return
+	}
+
+	result := h.usageStats.MergeSnapshot(payload.Usage)
+	snapshot := h.usageStats.Snapshot()
+	c.JSON(http.StatusOK, gin.H{
+		"added":           result.Added,
+		"skipped":         result.Skipped,
+		"total_requests":  snapshot.TotalRequests,
+		"failed_requests": snapshot.FailureCount,
+	})
+}
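
Note on the import version gate: the handler accepts a payload whose `version` is either absent (decodes to 0) or 1, and rejects anything else. A minimal standalone sketch of that check follows. The `snapshot` type and its `total_requests` JSON field are hypothetical stand-ins, since the real `usage.StatisticsSnapshot` layout is not shown in this diff.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// snapshot is a hypothetical stand-in for usage.StatisticsSnapshot;
// the real type and its JSON field names are not shown in this diff.
type snapshot struct {
	TotalRequests int64 `json:"total_requests"`
}

// importPayload mirrors the shape of usageImportPayload.
type importPayload struct {
	Version int      `json:"version"`
	Usage   snapshot `json:"usage"`
}

var errUnsupportedVersion = errors.New("unsupported version")

// decodeImport parses an import body and applies the same version gate as
// the handler: a missing version decodes to 0 and is accepted alongside 1.
func decodeImport(data []byte) (importPayload, error) {
	var p importPayload
	if err := json.Unmarshal(data, &p); err != nil {
		return importPayload{}, fmt.Errorf("invalid json: %w", err)
	}
	if p.Version != 0 && p.Version != 1 {
		return importPayload{}, errUnsupportedVersion
	}
	return p, nil
}

func main() {
	bodies := []string{
		`{"version":1,"usage":{"total_requests":3}}`,
		`{"usage":{"total_requests":5}}`,
		`{"version":2,"usage":{}}`,
		`{not json`,
	}
	for _, body := range bodies {
		p, err := decodeImport([]byte(body))
		fmt.Printf("version=%d total=%d err=%v\n", p.Version, p.Usage.TotalRequests, err)
	}
}
```

Accepting version 0 presumably lets a bare snapshot without the export envelope's `version` field be imported; that reading is an inference from the check, not something the diff states.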
--- a/internal/api/middleware/request_logging.go
+++ b/internal/api/middleware/request_logging.go
@@ -98,10 +98,11 @@ func captureRequestInfo(c *gin.Context) (*RequestInfo, error) {
 	}

 	return &RequestInfo{
-		URL:     url,
-		Method:  method,
-		Headers: headers,
-		Body:    body,
+		URL:       url,
+		Method:    method,
+		Headers:   headers,
+		Body:      body,
+		RequestID: logging.GetGinRequestID(c),
 	}, nil
 }

--- a/internal/api/middleware/response_writer.go
+++ b/internal/api/middleware/response_writer.go
@@ -15,10 +15,11 @@ import (

 // RequestInfo holds essential details of an incoming HTTP request for logging purposes.
 type RequestInfo struct {
-	URL     string              // URL is the request URL.
-	Method  string              // Method is the HTTP method (e.g., GET, POST).
-	Headers map[string][]string // Headers contains the request headers.
-	Body    []byte              // Body is the raw request body.
+	URL       string              // URL is the request URL.
+	Method    string              // Method is the HTTP method (e.g., GET, POST).
+	Headers   map[string][]string // Headers contains the request headers.
+	Body      []byte              // Body is the raw request body.
+	RequestID string              // RequestID is the unique identifier for the request.
 }

 // ResponseWriterWrapper wraps the standard gin.ResponseWriter to intercept and log response data.
@@ -149,6 +150,7 @@ func (w *ResponseWriterWrapper) WriteHeader(statusCode int) {
 			w.requestInfo.Method,
 			w.requestInfo.Headers,
 			w.requestInfo.Body,
+			w.requestInfo.RequestID,
 		)
 		if err == nil {
 			w.streamWriter = streamWriter
@@ -346,7 +348,7 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
 	}

 	if loggerWithOptions, ok := w.logger.(interface {
-		LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool) error
+		LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool, string) error
 	}); ok {
 		return loggerWithOptions.LogRequestWithOptions(
 			w.requestInfo.URL,
@@ -360,6 +362,7 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
 			apiResponseBody,
 			apiResponseErrors,
 			forceLog,
+			w.requestInfo.RequestID,
 		)
 	}

@@ -374,5 +377,6 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
 		apiRequestBody,
 		apiResponseBody,
 		apiResponseErrors,
+		w.requestInfo.RequestID,
 	)
 }
--- a/internal/api/modules/amp/amp.go
+++ b/internal/api/modules/amp/amp.go
@@ -279,16 +279,23 @@ func (m *AmpModule) hasModelMappingsChanged(old *config.AmpCode, new *config.Amp
 		return true
 	}

-	// Build map for efficient comparison
-	oldMap := make(map[string]string, len(old.ModelMappings))
+	// Build map for efficient and robust comparison
+	type mappingInfo struct {
+		to    string
+		regex bool
+	}
+	oldMap := make(map[string]mappingInfo, len(old.ModelMappings))
 	for _, mapping := range old.ModelMappings {
-		oldMap[strings.TrimSpace(mapping.From)] = strings.TrimSpace(mapping.To)
+		oldMap[strings.TrimSpace(mapping.From)] = mappingInfo{
+			to:    strings.TrimSpace(mapping.To),
+			regex: mapping.Regex,
+		}
 	}

 	for _, mapping := range new.ModelMappings {
 		from := strings.TrimSpace(mapping.From)
 		to := strings.TrimSpace(mapping.To)
-		if oldTo, exists := oldMap[from]; !exists || oldTo != to {
+		if oldVal, exists := oldMap[from]; !exists || oldVal.to != to || oldVal.regex != mapping.Regex {
 			return true
 		}
 	}
--- a/internal/api/modules/amp/model_mapping.go
+++ b/internal/api/modules/amp/model_mapping.go
@@ -3,6 +3,7 @@
 package amp

 import (
+	"regexp"
 	"strings"
 	"sync"

@@ -26,13 +27,15 @@ type ModelMapper interface {
 // DefaultModelMapper implements ModelMapper with thread-safe mapping storage.
 type DefaultModelMapper struct {
 	mu       sync.RWMutex
-	mappings map[string]string // from -> to (normalized lowercase keys)
+	mappings map[string]string // exact: from -> to (normalized lowercase keys)
+	regexps  []regexMapping    // regex rules evaluated in order
 }

 // NewModelMapper creates a new model mapper with the given initial mappings.
 func NewModelMapper(mappings []config.AmpModelMapping) *DefaultModelMapper {
 	m := &DefaultModelMapper{
 		mappings: make(map[string]string),
+		regexps:  nil,
 	}
 	m.UpdateMappings(mappings)
 	return m
@@ -55,7 +58,18 @@ func (m *DefaultModelMapper) MapModel(requestedModel string) string {
 	// Check for direct mapping
 	targetModel, exists := m.mappings[normalizedRequest]
 	if !exists {
-		return ""
+		// Try regex mappings in order
+		base, _ := util.NormalizeThinkingModel(requestedModel)
+		for _, rm := range m.regexps {
+			if rm.re.MatchString(requestedModel) || (base != "" && rm.re.MatchString(base)) {
+				targetModel = rm.to
+				exists = true
+				break
+			}
+		}
+		if !exists {
+			return ""
+		}
 	}

 	// Verify target model has available providers
@@ -78,6 +92,7 @@ func (m *DefaultModelMapper) UpdateMappings(mappings []config.AmpModelMapping) {

 	// Clear and rebuild mappings
 	m.mappings = make(map[string]string, len(mappings))
+	m.regexps = make([]regexMapping, 0, len(mappings))

 	for _, mapping := range mappings {
 		from := strings.TrimSpace(mapping.From)
@@ -88,16 +103,30 @@ func (m *DefaultModelMapper) UpdateMappings(mappings []config.AmpModelMapping) {
 			continue
 		}

-		// Store with normalized lowercase key for case-insensitive lookup
-		normalizedFrom := strings.ToLower(from)
-		m.mappings[normalizedFrom] = to
-
-		log.Debugf("amp model mapping registered: %s -> %s", from, to)
+		if mapping.Regex {
+			// Compile case-insensitive regex; wrap with (?i) to match behavior of exact lookups
+			pattern := "(?i)" + from
+			re, err := regexp.Compile(pattern)
+			if err != nil {
+				log.Warnf("amp model mapping: invalid regex %q: %v", from, err)
+				continue
+			}
+			m.regexps = append(m.regexps, regexMapping{re: re, to: to})
+			log.Debugf("amp model regex mapping registered: /%s/ -> %s", from, to)
+		} else {
+			// Store with normalized lowercase key for case-insensitive lookup
+			normalizedFrom := strings.ToLower(from)
+			m.mappings[normalizedFrom] = to
+			log.Debugf("amp model mapping registered: %s -> %s", from, to)
+		}
 	}

 	if len(m.mappings) > 0 {
 		log.Infof("amp model mapping: loaded %d mapping(s)", len(m.mappings))
 	}
+	if n := len(m.regexps); n > 0 {
+		log.Infof("amp model mapping: loaded %d regex mapping(s)", n)
+	}
 }

 // GetMappings returns a copy of current mappings (for debugging/status).
@@ -111,3 +140,8 @@ func (m *DefaultModelMapper) GetMappings() map[string]string {
 	}
 	return result
 }
+
+type regexMapping struct {
+	re *regexp.Regexp
+	to string
+}
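
Note on lookup order: exact mappings are consulted first, then regex rules in configuration order, each compiled with a `(?i)` prefix; patterns that fail to compile are skipped. The sketch below reproduces only that ordering under simplifying assumptions. `baseModel` is a hypothetical stand-in for `util.NormalizeThinkingModel` (whose behavior is not shown here), the skip on empty `from`/`to` is assumed, and the provider availability check performed by the real `MapModel` is omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule mirrors the From/To/Regex fields of config.AmpModelMapping.
type rule struct {
	From, To string
	Regex    bool
}

type regexRule struct {
	re *regexp.Regexp
	to string
}

type mapper struct {
	exact   map[string]string // lowercase from -> to
	regexps []regexRule       // evaluated in configuration order
}

func newMapper(rules []rule) *mapper {
	m := &mapper{exact: make(map[string]string)}
	for _, r := range rules {
		from, to := strings.TrimSpace(r.From), strings.TrimSpace(r.To)
		if from == "" || to == "" {
			continue // assumed: incomplete rules are ignored
		}
		if !r.Regex {
			m.exact[strings.ToLower(from)] = to
			continue
		}
		re, err := regexp.Compile("(?i)" + from)
		if err != nil {
			continue // invalid patterns are skipped, not fatal
		}
		m.regexps = append(m.regexps, regexRule{re: re, to: to})
	}
	return m
}

// baseModel is a hypothetical stand-in for util.NormalizeThinkingModel:
// it strips a trailing "(...)" suffix and returns "" when there is none.
func baseModel(model string) string {
	if i := strings.Index(model, "("); i > 0 && strings.HasSuffix(model, ")") {
		return model[:i]
	}
	return ""
}

// lookup returns the mapped target, or "" when nothing matches.
func (m *mapper) lookup(model string) string {
	if to, ok := m.exact[strings.ToLower(strings.TrimSpace(model))]; ok {
		return to // exact match wins over any regex rule
	}
	base := baseModel(model)
	for _, r := range m.regexps {
		if r.re.MatchString(model) || (base != "" && r.re.MatchString(base)) {
			return r.to
		}
	}
	return ""
}

func main() {
	m := newMapper([]rule{
		{From: "gpt-5", To: "claude-sonnet-4"},
		{From: "^gpt-5.*$", To: "gemini-2.5-pro", Regex: true},
		{From: "(", To: "target", Regex: true}, // invalid, skipped
	})
	for _, name := range []string{"gpt-5", "GPT-5-codex", "unknown"} {
		fmt.Printf("%q -> %q\n", name, m.lookup(name))
	}
}
```

Because regex rules are tried in order and the first match wins, more specific patterns would need to be listed before broader ones.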
--- a/internal/api/modules/amp/model_mapping_test.go
+++ b/internal/api/modules/amp/model_mapping_test.go
@@ -203,3 +203,81 @@ func TestModelMapper_GetMappings_ReturnsCopy(t *testing.T) {
 		t.Error("Original map was modified")
 	}
 }
+
+func TestModelMapper_Regex_MatchBaseWithoutParens(t *testing.T) {
+	reg := registry.GetGlobalRegistry()
+	reg.RegisterClient("test-client-regex-1", "gemini", []*registry.ModelInfo{
+		{ID: "gemini-2.5-pro", OwnedBy: "google", Type: "gemini"},
+	})
+	defer reg.UnregisterClient("test-client-regex-1")
+
+	mappings := []config.AmpModelMapping{
+		{From: "^gpt-5$", To: "gemini-2.5-pro", Regex: true},
+	}
+
+	mapper := NewModelMapper(mappings)
+
+	// Incoming model has reasoning suffix but should match base via regex
+	result := mapper.MapModel("gpt-5(high)")
+	if result != "gemini-2.5-pro" {
+		t.Errorf("Expected gemini-2.5-pro, got %s", result)
+	}
+}
+
+func TestModelMapper_Regex_ExactPrecedence(t *testing.T) {
+	reg := registry.GetGlobalRegistry()
+	reg.RegisterClient("test-client-regex-2", "claude", []*registry.ModelInfo{
+		{ID: "claude-sonnet-4", OwnedBy: "anthropic", Type: "claude"},
+	})
+	reg.RegisterClient("test-client-regex-3", "gemini", []*registry.ModelInfo{
+		{ID: "gemini-2.5-pro", OwnedBy: "google", Type: "gemini"},
+	})
+	defer reg.UnregisterClient("test-client-regex-2")
+	defer reg.UnregisterClient("test-client-regex-3")
+
+	mappings := []config.AmpModelMapping{
+		{From: "gpt-5", To: "claude-sonnet-4"},                 // exact
+		{From: "^gpt-5.*$", To: "gemini-2.5-pro", Regex: true}, // regex
+	}
+
+	mapper := NewModelMapper(mappings)
+
+	// Exact match should win over regex
+	result := mapper.MapModel("gpt-5")
+	if result != "claude-sonnet-4" {
+		t.Errorf("Expected claude-sonnet-4, got %s", result)
+	}
+}
+
+func TestModelMapper_Regex_InvalidPattern_Skipped(t *testing.T) {
+	// Invalid regex should be skipped and not cause panic
+	mappings := []config.AmpModelMapping{
+		{From: "(", To: "target", Regex: true},
+	}
+
+	mapper := NewModelMapper(mappings)
+
+	result := mapper.MapModel("anything")
+	if result != "" {
+		t.Errorf("Expected empty result due to invalid regex, got %s", result)
+	}
+}
+
+func TestModelMapper_Regex_CaseInsensitive(t *testing.T) {
+	reg := registry.GetGlobalRegistry()
+	reg.RegisterClient("test-client-regex-4", "claude", []*registry.ModelInfo{
+		{ID: "claude-sonnet-4", OwnedBy: "anthropic", Type: "claude"},
+	})
+	defer reg.UnregisterClient("test-client-regex-4")
+
+	mappings := []config.AmpModelMapping{
+		{From: "^CLAUDE-OPUS-.*$", To: "claude-sonnet-4", Regex: true},
+	}
+
+	mapper := NewModelMapper(mappings)
+
+	result := mapper.MapModel("claude-opus-4.5")
+	if result != "claude-sonnet-4" {
+		t.Errorf("Expected claude-sonnet-4, got %s", result)
+	}
+}
--- a/internal/api/server.go
+++ b/internal/api/server.go
@@ -209,13 +209,15 @@ func NewServer(cfg *config.Config, authManager *auth.Manager, accessManager *sdk
 	// Resolve logs directory relative to the configuration file directory.
 	var requestLogger logging.RequestLogger
 	var toggle func(bool)
-	if optionState.requestLoggerFactory != nil {
-		requestLogger = optionState.requestLoggerFactory(cfg, configFilePath)
-	}
-	if requestLogger != nil {
-		engine.Use(middleware.RequestLoggingMiddleware(requestLogger))
-		if setter, ok := requestLogger.(interface{ SetEnabled(bool) }); ok {
-			toggle = setter.SetEnabled
+	if !cfg.CommercialMode {
+		if optionState.requestLoggerFactory != nil {
+			requestLogger = optionState.requestLoggerFactory(cfg, configFilePath)
+		}
+		if requestLogger != nil {
+			engine.Use(middleware.RequestLoggingMiddleware(requestLogger))
+			if setter, ok := requestLogger.(interface{ SetEnabled(bool) }); ok {
+				toggle = setter.SetEnabled
+			}
 		}
 	}

@@ -474,6 +476,8 @@ func (s *Server) registerManagementRoutes() {
 	mgmt.Use(s.managementAvailabilityMiddleware(), s.mgmt.Middleware())
 	{
 		mgmt.GET("/usage", s.mgmt.GetUsageStatistics)
+		mgmt.GET("/usage/export", s.mgmt.ExportUsageStatistics)
+		mgmt.POST("/usage/import", s.mgmt.ImportUsageStatistics)
 		mgmt.GET("/config", s.mgmt.GetConfig)
 		mgmt.GET("/config.yaml", s.mgmt.GetConfigYAML)
 		mgmt.PUT("/config.yaml", s.mgmt.PutConfigYAML)
@@ -496,6 +500,8 @@ func (s *Server) registerManagementRoutes() {
 		mgmt.PATCH("/proxy-url", s.mgmt.PutProxyURL)
 		mgmt.DELETE("/proxy-url", s.mgmt.DeleteProxyURL)

+		mgmt.POST("/api-call", s.mgmt.APICall)
+
 		mgmt.GET("/quota-exceeded/switch-project", s.mgmt.GetSwitchProject)
 		mgmt.PUT("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
 		mgmt.PATCH("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
@@ -518,6 +524,7 @@ func (s *Server) registerManagementRoutes() {
 		mgmt.DELETE("/logs", s.mgmt.DeleteLogs)
 		mgmt.GET("/request-error-logs", s.mgmt.GetRequestErrorLogs)
 		mgmt.GET("/request-error-logs/:name", s.mgmt.DownloadRequestErrorLog)
+		mgmt.GET("/request-log-by-id/:id", s.mgmt.GetRequestLogByID)
 		mgmt.GET("/request-log", s.mgmt.GetRequestLog)
 		mgmt.PUT("/request-log", s.mgmt.PutRequestLog)
 		mgmt.PATCH("/request-log", s.mgmt.PutRequestLog)
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -39,6 +39,9 @@ type Config struct {
 	// Debug enables or disables debug-level logging and other debug features.
 	Debug bool `yaml:"debug" json:"debug"`

+	// CommercialMode disables high-overhead HTTP middleware features to minimize per-request memory usage.
+	CommercialMode bool `yaml:"commercial-mode" json:"commercial-mode"`
+
 	// LoggingToFile controls whether application logs are written to rotating files or stdout.
 	LoggingToFile bool `yaml:"logging-to-file" json:"logging-to-file"`

@@ -144,6 +147,11 @@ type AmpModelMapping struct {
 	// To is the target model name to route to (e.g., "claude-sonnet-4").
 	// The target model must have available providers in the registry.
 	To string `yaml:"to" json:"to"`
+
+	// Regex indicates whether the 'from' field should be interpreted as a regular
+	// expression for matching model names. When true, this mapping is evaluated
+	// after exact matches and in the order provided. Defaults to false (exact match).
+	Regex bool `yaml:"regex,omitempty" json:"regex,omitempty"`
 }

 // AmpCode groups Amp CLI integration settings including upstream routing,
@@ -245,6 +253,9 @@ type CodexKey struct {
 	// ProxyURL overrides the global proxy setting for this API key if provided.
 	ProxyURL string `yaml:"proxy-url" json:"proxy-url"`

+	// Models defines upstream model names and aliases for request routing.
+	Models []CodexModel `yaml:"models" json:"models"`
+
 	// Headers optionally adds extra HTTP headers for requests sent with this key.
 	Headers map[string]string `yaml:"headers,omitempty" json:"headers,omitempty"`

@@ -252,6 +263,15 @@ type CodexKey struct {
 	ExcludedModels []string `yaml:"excluded-models,omitempty" json:"excluded-models,omitempty"`
 }

+// CodexModel describes a mapping between an alias and the actual upstream model name.
+type CodexModel struct {
+	// Name is the upstream model identifier used when issuing requests.
+	Name string `yaml:"name" json:"name"`
+
+	// Alias is the client-facing model name that maps to Name.
+	Alias string `yaml:"alias" json:"alias"`
+}
+
 // GeminiKey represents the configuration for a Gemini API key,
 // including optional overrides for upstream base URL, proxy routing, and headers.
 type GeminiKey struct {
@@ -809,8 +829,8 @@ func getOrCreateMapValue(mapNode *yaml.Node, key string) *yaml.Node {
 }

 // mergeMappingPreserve merges keys from src into dst mapping node while preserving
-// key order and comments of existing keys in dst. Unknown keys from src are appended
-// to dst at the end, copying their node structure from src.
+// key order and comments of existing keys in dst. New keys are only added if their
+// value is non-zero to avoid polluting the config with defaults.
 func mergeMappingPreserve(dst, src *yaml.Node) {
 	if dst == nil || src == nil {
 		return
@@ -821,20 +841,19 @@ func mergeMappingPreserve(dst, src *yaml.Node) {
 		copyNodeShallow(dst, src)
 		return
 	}
-	// Build a lookup of existing keys in dst
 	for i := 0; i+1 < len(src.Content); i += 2 {
 		sk := src.Content[i]
 		sv := src.Content[i+1]
 		idx := findMapKeyIndex(dst, sk.Value)
 		if idx >= 0 {
-			// Merge into existing value node
+			// Merge into existing value node (always update, even to zero values)
 			dv := dst.Content[idx+1]
 			mergeNodePreserve(dv, sv)
 		} else {
-			if shouldSkipEmptyCollectionOnPersist(sk.Value, sv) {
+			// New key: only add if value is non-zero to avoid polluting config with defaults
+			if isZeroValueNode(sv) {
 				continue
 			}
-			// Append new key/value pair by deep-copying from src
 			dst.Content = append(dst.Content, deepCopyNode(sk), deepCopyNode(sv))
 		}
 	}
@@ -917,32 +936,49 @@ func findMapKeyIndex(mapNode *yaml.Node, key string) int {
 	return -1
 }

-func shouldSkipEmptyCollectionOnPersist(key string, node *yaml.Node) bool {
-	switch key {
-	case "generative-language-api-key",
-		"gemini-api-key",
-		"vertex-api-key",
-		"claude-api-key",
-		"codex-api-key",
-		"openai-compatibility":
-		return isEmptyCollectionNode(node)
-	default:
-		return false
-	}
-}
-
-func isEmptyCollectionNode(node *yaml.Node) bool {
+// isZeroValueNode returns true if the YAML node represents a zero/default value
+// that should not be written as a new key to preserve config cleanliness.
+// For mappings and sequences, recursively checks if all children are zero values.
+func isZeroValueNode(node *yaml.Node) bool {
 	if node == nil {
 		return true
 	}
 	switch node.Kind {
-	case yaml.SequenceNode:
-		return len(node.Content) == 0
 	case yaml.ScalarNode:
-		return node.Tag == "!!null"
-	default:
-		return false
+		switch node.Tag {
+		case "!!bool":
+			return node.Value == "false"
+		case "!!int", "!!float":
+			return node.Value == "0" || node.Value == "0.0"
+		case "!!str":
+			return node.Value == ""
+		case "!!null":
+			return true
+		}
+	case yaml.SequenceNode:
+		if len(node.Content) == 0 {
+			return true
+		}
+		// Check if all elements are zero values
+		for _, child := range node.Content {
+			if !isZeroValueNode(child) {
+				return false
+			}
+		}
+		return true
+	case yaml.MappingNode:
+		if len(node.Content) == 0 {
+			return true
+		}
+		// Check if all values are zero values (values are at odd indices)
+		for i := 1; i < len(node.Content); i += 2 {
+			if !isZeroValueNode(node.Content[i]) {
+				return false
+			}
+		}
+		return true
 	}
+	return false
 }

 // deepCopyNode creates a deep copy of a yaml.Node graph.
--- a/internal/config/sdk_config.go
+++ b/internal/config/sdk_config.go
@@ -30,13 +30,13 @@ type SDKConfig struct {
 // StreamingConfig holds server streaming behavior configuration.
 type StreamingConfig struct {
 	// KeepAliveSeconds controls how often the server emits SSE heartbeats (": keep-alive\n\n").
-	// nil means default (15 seconds). <= 0 disables keep-alives.
-	KeepAliveSeconds *int `yaml:"keepalive-seconds,omitempty" json:"keepalive-seconds,omitempty"`
+	// <= 0 disables keep-alives. Default is 0.
+	KeepAliveSeconds int `yaml:"keepalive-seconds,omitempty" json:"keepalive-seconds,omitempty"`

 	// BootstrapRetries controls how many times the server may retry a streaming request before any bytes are sent,
 	// to allow auth rotation / transient recovery.
-	// nil means default (2). 0 disables bootstrap retries.
-	BootstrapRetries *int `yaml:"bootstrap-retries,omitempty" json:"bootstrap-retries,omitempty"`
+	// <= 0 disables bootstrap retries. Default is 0.
+	BootstrapRetries int `yaml:"bootstrap-retries,omitempty" json:"bootstrap-retries,omitempty"`
 }

 // AccessConfig groups request authentication providers.
--- a/internal/logging/gin_logger.go
+++ b/internal/logging/gin_logger.go
@@ -7,6 +7,7 @@ import (
 	"fmt"
 	"net/http"
 	"runtime/debug"
+	"strings"
 	"time"

 	"github.com/gin-gonic/gin"
@@ -14,11 +15,24 @@ import (
 	log "github.com/sirupsen/logrus"
 )

+// aiAPIPrefixes defines path prefixes for AI API requests that should have request ID tracking.
+var aiAPIPrefixes = []string{
+	"/v1/chat/completions",
+	"/v1/completions",
+	"/v1/messages",
+	"/v1/responses",
+	"/v1beta/models/",
+	"/api/provider/",
+}
+
 const skipGinLogKey = "__gin_skip_request_logging__"

 // GinLogrusLogger returns a Gin middleware handler that logs HTTP requests and responses
 // using logrus. It captures request details including method, path, status code, latency,
-// client IP, and any error messages, formatting them in a Gin-style log format.
+// client IP, and any error messages. Request ID is only added for AI API requests.
+//
+// Output format (AI API): [2025-12-23 20:14:10] [a1b2c3d4] [info ] 200 |       23.559s | ...
+// Output format (others): [2025-12-23 20:14:10] [--------] [info ] 200 |       23.559s | ...
 //
 // Returns:
 //   - gin.HandlerFunc: A middleware handler for request logging
@@ -28,6 +42,15 @@ func GinLogrusLogger() gin.HandlerFunc {
 		path := c.Request.URL.Path
 		raw := util.MaskSensitiveQuery(c.Request.URL.RawQuery)

+		// Only generate request ID for AI API paths
+		var requestID string
+		if isAIAPIPath(path) {
+			requestID = GenerateRequestID()
+			SetGinRequestID(c, requestID)
+			ctx := WithRequestID(c.Request.Context(), requestID)
+			c.Request = c.Request.WithContext(ctx)
+		}
+
 		c.Next()

 		if shouldSkipGinRequestLogging(c) {
@@ -49,23 +72,38 @@ func GinLogrusLogger() gin.HandlerFunc {
 		clientIP := c.ClientIP()
 		method := c.Request.Method
 		errorMessage := c.Errors.ByType(gin.ErrorTypePrivate).String()
-		timestamp := time.Now().Format("2006/01/02 - 15:04:05")
-		logLine := fmt.Sprintf("[GIN] %s | %3d | %13v | %15s | %-7s \"%s\"", timestamp, statusCode, latency, clientIP, method, path)
+
+		if requestID == "" {
+			requestID = "--------"
+		}
+		logLine := fmt.Sprintf("%3d | %13v | %15s | %-7s \"%s\"", statusCode, latency, clientIP, method, path)
 		if errorMessage != "" {
 			logLine = logLine + " | " + errorMessage
 		}

+		entry := log.WithField("request_id", requestID)
+
 		switch {
 		case statusCode >= http.StatusInternalServerError:
-			log.Error(logLine)
+			entry.Error(logLine)
 		case statusCode >= http.StatusBadRequest:
-			log.Warn(logLine)
+			entry.Warn(logLine)
 		default:
-			log.Info(logLine)
+			entry.Info(logLine)
 		}
 	}
 }

+// isAIAPIPath checks if the given path is an AI API endpoint that should have request ID tracking.
+func isAIAPIPath(path string) bool {
+	for _, prefix := range aiAPIPrefixes {
+		if strings.HasPrefix(path, prefix) {
+			return true
+		}
+	}
+	return false
+}
+
 // GinLogrusRecovery returns a Gin middleware handler that recovers from panics and logs
 // them using logrus. When a panic occurs, it captures the panic value, stack trace,
 // and request path, then returns a 500 Internal Server Error response to the client.
--- a/internal/logging/global_logger.go
+++ b/internal/logging/global_logger.go
@@ -24,7 +24,8 @@ var (
 )

 // LogFormatter defines a custom log format for logrus.
-// This formatter adds timestamp, level, and source location to each log entry.
+// This formatter adds timestamp, request ID, level, and source location to each log entry.
+// Format: [2025-12-23 20:14:04] [a1b2c3d4] [debug] [manager.go:524] Use API key sk-9...0RHO for model gpt-5.2
 type LogFormatter struct{}

 // Format renders a single log entry with custom formatting.
@@ -39,11 +40,22 @@ func (m *LogFormatter) Format(entry *log.Entry) ([]byte, error) {
 	timestamp := entry.Time.Format("2006-01-02 15:04:05")
 	message := strings.TrimRight(entry.Message, "\r\n")

+	reqID := "--------"
+	if id, ok := entry.Data["request_id"].(string); ok && id != "" {
+		reqID = id
+	}
+
+	level := entry.Level.String()
+	if level == "warning" {
+		level = "warn"
+	}
+	levelStr := fmt.Sprintf("%-5s", level)
+
 	var formatted string
 	if entry.Caller != nil {
-		formatted = fmt.Sprintf("[%s] [%s] [%s:%d] %s\n", timestamp, entry.Level, filepath.Base(entry.Caller.File), entry.Caller.Line, message)
+		formatted = fmt.Sprintf("[%s] [%s] [%s] [%s:%d] %s\n", timestamp, reqID, levelStr, filepath.Base(entry.Caller.File), entry.Caller.Line, message)
 	} else {
-		formatted = fmt.Sprintf("[%s] [%s] %s\n", timestamp, entry.Level, message)
+		formatted = fmt.Sprintf("[%s] [%s] [%s] %s\n", timestamp, reqID, levelStr, message)
 	}
 	buffer.WriteString(formatted)

--- a/internal/logging/request_logger.go
+++ b/internal/logging/request_logger.go
@@ -43,10 +43,11 @@ type RequestLogger interface {
 	//   - response: The raw response data
 	//   - apiRequest: The API request data
 	//   - apiResponse: The API response data
+	//   - requestID: Optional request ID for log file naming
 	//
 	// Returns:
 	//   - error: An error if logging fails, nil otherwise
-	LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage) error
+	LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string) error

 	// LogStreamingRequest initiates logging for a streaming request and returns a writer for chunks.
 	//
@@ -55,11 +56,12 @@ type RequestLogger interface {
 	//   - method: The HTTP method
 	//   - headers: The request headers
 	//   - body: The request body
+	//   - requestID: Optional request ID for log file naming
 	//
 	// Returns:
 	//   - StreamingLogWriter: A writer for streaming response chunks
 	//   - error: An error if logging initialization fails, nil otherwise
-	LogStreamingRequest(url, method string, headers map[string][]string, body []byte) (StreamingLogWriter, error)
+	LogStreamingRequest(url, method string, headers map[string][]string, body []byte, requestID string) (StreamingLogWriter, error)

 	// IsEnabled returns whether request logging is currently enabled.
 	//
@@ -177,20 +179,21 @@ func (l *FileRequestLogger) SetEnabled(enabled bool) {
 //   - response: The raw response data
 //   - apiRequest: The API request data
 //   - apiResponse: The API response data
+//   - requestID: Optional request ID for log file naming
 //
 // Returns:
 //   - error: An error if logging fails, nil otherwise
-func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage) error {
-	return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, false)
+func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string) error {
+	return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, false, requestID)
 }

 // LogRequestWithOptions logs a request with optional forced logging behavior.
 // The force flag allows writing error logs even when regular request logging is disabled.
-func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool) error {
-	return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, force)
+func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string) error {
+	return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, force, requestID)
 }

-func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool) error {
+func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string) error {
 	if !l.enabled && !force {
 		return nil
 	}
@@ -200,10 +203,10 @@ func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[st
 		return fmt.Errorf("failed to create logs directory: %w", errEnsure)
 	}

-	// Generate filename
-	filename := l.generateFilename(url)
+	// Generate filename with request ID
+	filename := l.generateFilename(url, requestID)
 	if force && !l.enabled {
-		filename = l.generateErrorFilename(url)
+		filename = l.generateErrorFilename(url, requestID)
 	}
 	filePath := filepath.Join(l.logsDir, filename)

@@ -271,11 +274,12 @@ func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[st
 //   - method: The HTTP method
 //   - headers: The request headers
 //   - body: The request body
+//   - requestID: Optional request ID for log file naming
 //
 // Returns:
 //   - StreamingLogWriter: A writer for streaming response chunks
 //   - error: An error if logging initialization fails, nil otherwise
-func (l *FileRequestLogger) LogStreamingRequest(url, method string, headers map[string][]string, body []byte) (StreamingLogWriter, error) {
+func (l *FileRequestLogger) LogStreamingRequest(url, method string, headers map[string][]string, body []byte, requestID string) (StreamingLogWriter, error) {
 	if !l.enabled {
 		return &NoOpStreamingLogWriter{}, nil
 	}
@@ -285,8 +289,8 @@ func (l *FileRequestLogger) LogStreamingRequest(url, method string, headers map[
 		return nil, fmt.Errorf("failed to create logs directory: %w", err)
 	}

-	// Generate filename
-	filename := l.generateFilename(url)
+	// Generate filename with request ID
+	filename := l.generateFilename(url, requestID)
 	filePath := filepath.Join(l.logsDir, filename)

 	requestHeaders := make(map[string][]string, len(headers))
@@ -330,8 +334,8 @@ func (l *FileRequestLogger) LogStreamingRequest(url, method string, headers map[
 }

 // generateErrorFilename creates a filename with an error prefix to differentiate forced error logs.
-func (l *FileRequestLogger) generateErrorFilename(url string) string {
-	return fmt.Sprintf("error-%s", l.generateFilename(url))
+func (l *FileRequestLogger) generateErrorFilename(url string, requestID ...string) string {
+	return fmt.Sprintf("error-%s", l.generateFilename(url, requestID...))
 }

 // ensureLogsDir creates the logs directory if it doesn't exist.
@@ -346,13 +350,15 @@ func (l *FileRequestLogger) ensureLogsDir() error {
 }

 // generateFilename creates a sanitized filename from the URL path and current timestamp.
+// Format: v1-responses-2025-12-23T195811-a1b2c3d4.log
 //
 // Parameters:
 //   - url: The request URL
+//   - requestID: Optional request ID to include in filename
 //
 // Returns:
 //   - string: A sanitized filename for the log file
-func (l *FileRequestLogger) generateFilename(url string) string {
+func (l *FileRequestLogger) generateFilename(url string, requestID ...string) string {
 	// Extract path from URL
 	path := url
 	if strings.Contains(url, "?") {
@@ -368,12 +374,18 @@ func (l *FileRequestLogger) generateFilename(url string) string {
 	sanitized := l.sanitizeForFilename(path)

 	// Add timestamp
-	timestamp := time.Now().Format("2006-01-02T150405-.000000000")
-	timestamp = strings.Replace(timestamp, ".", "", -1)
+	timestamp := time.Now().Format("2006-01-02T150405")

-	id := requestLogID.Add(1)
+	// Use request ID if provided, otherwise use sequential ID
+	var idPart string
+	if len(requestID) > 0 && requestID[0] != "" {
+		idPart = requestID[0]
+	} else {
+		id := requestLogID.Add(1)
+		idPart = fmt.Sprintf("%d", id)
+	}

-	return fmt.Sprintf("%s-%s-%d.log", sanitized, timestamp, id)
+	return fmt.Sprintf("%s-%s-%s.log", sanitized, timestamp, idPart)
 }

 // sanitizeForFilename replaces characters that are not safe for filenames.
--- a/internal/logging/requestid.go
+++ b/internal/logging/requestid.go
@@ -0,0 +1,61 @@
+package logging
+
+import (
+	"context"
+	"crypto/rand"
+	"encoding/hex"
+
+	"github.com/gin-gonic/gin"
+)
+
+// requestIDKey is the context key for storing/retrieving request IDs.
+type requestIDKey struct{}
+
+// ginRequestIDKey is the Gin context key for request IDs.
+const ginRequestIDKey = "__request_id__"
+
+// GenerateRequestID creates a new 8-character hex request ID.
+func GenerateRequestID() string {
+	b := make([]byte, 4)
+	if _, err := rand.Read(b); err != nil {
+		return "00000000"
+	}
+	return hex.EncodeToString(b)
+}
+
+// WithRequestID returns a new context with the request ID attached.
+func WithRequestID(ctx context.Context, requestID string) context.Context {
+	return context.WithValue(ctx, requestIDKey{}, requestID)
+}
+
+// GetRequestID retrieves the request ID from the context.
+// Returns empty string if not found.
+func GetRequestID(ctx context.Context) string {
+	if ctx == nil {
+		return ""
+	}
+	if id, ok := ctx.Value(requestIDKey{}).(string); ok {
+		return id
+	}
+	return ""
+}
+
+// SetGinRequestID stores the request ID in the Gin context.
+func SetGinRequestID(c *gin.Context, requestID string) {
+	if c != nil {
+		c.Set(ginRequestIDKey, requestID)
+	}
+}
+
+// GetGinRequestID retrieves the request ID from the Gin context.
+func GetGinRequestID(c *gin.Context) string {
+	if c == nil {
+		return ""
+	}
+	if id, exists := c.Get(ginRequestIDKey); exists {
+		if s, ok := id.(string); ok {
+			return s
+		}
+	}
+	return ""
+}
--- a/internal/registry/model_definitions.go
+++ b/internal/registry/model_definitions.go
@@ -727,6 +727,7 @@ func GetIFlowModels() []*ModelInfo {
 		{ID: "qwen3-max-preview", DisplayName: "Qwen3-Max-Preview", Description: "Qwen3 Max preview build", Created: 1757030400},
 		{ID: "kimi-k2-0905", DisplayName: "Kimi-K2-Instruct-0905", Description: "Moonshot Kimi K2 instruct 0905", Created: 1757030400},
 		{ID: "glm-4.6", DisplayName: "GLM-4.6", Description: "Zhipu GLM 4.6 general model", Created: 1759190400, Thinking: iFlowThinkingSupport},
+		{ID: "glm-4.7", DisplayName: "GLM-4.7", Description: "Zhipu GLM 4.7 general model", Created: 1766448000, Thinking: iFlowThinkingSupport},
 		{ID: "kimi-k2", DisplayName: "Kimi-K2", Description: "Moonshot Kimi K2 general model", Created: 1752192000},
 		{ID: "kimi-k2-thinking", DisplayName: "Kimi-K2-Thinking", Description: "Moonshot Kimi K2 thinking model", Created: 1762387200},
 		{ID: "deepseek-v3.2-chat", DisplayName: "DeepSeek-V3.2", Description: "DeepSeek V3.2 Chat", Created: 1764576000},
@@ -740,6 +741,7 @@ func GetIFlowModels() []*ModelInfo {
 		{ID: "qwen3-235b-a22b-instruct", DisplayName: "Qwen3-235B-A22B-Instruct", Description: "Qwen3 235B A22B Instruct", Created: 1753401600},
 		{ID: "qwen3-235b", DisplayName: "Qwen3-235B-A22B", Description: "Qwen3 235B A22B", Created: 1753401600},
 		{ID: "minimax-m2", DisplayName: "MiniMax-M2", Description: "MiniMax M2", Created: 1758672000},
+		{ID: "minimax-m2.1", DisplayName: "MiniMax-M2.1", Description: "MiniMax M2.1", Created: 1766448000, Thinking: iFlowThinkingSupport},
 	}
 	models := make([]*ModelInfo, 0, len(entries))
 	for _, entry := range entries {
--- a/internal/runtime/executor/claude_executor.go
+++ b/internal/runtime/executor/claude_executor.go
@@ -74,6 +74,9 @@ func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
 	}
 	body = applyPayloadConfig(e.cfg, req.Model, body)

+	// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
+	body = disableThinkingIfToolChoiceForced(body)
+
 	// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
 	body = ensureMaxTokensForThinking(req.Model, body)

@@ -185,6 +188,9 @@ func (e *ClaudeExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
 	body = checkSystemInstructions(body)
 	body = applyPayloadConfig(e.cfg, req.Model, body)

+	// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
+	body = disableThinkingIfToolChoiceForced(body)
+
 	// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
 	body = ensureMaxTokensForThinking(req.Model, body)

@@ -461,6 +467,19 @@ func (e *ClaudeExecutor) injectThinkingConfig(modelName string, metadata map[str
 	return util.ApplyClaudeThinkingConfig(body, budget)
 }

+// disableThinkingIfToolChoiceForced checks if tool_choice forces tool use and disables thinking.
+// Anthropic API does not allow thinking when tool_choice is set to "any" or a specific tool.
+// See: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#important-considerations
+func disableThinkingIfToolChoiceForced(body []byte) []byte {
+	toolChoiceType := gjson.GetBytes(body, "tool_choice.type").String()
+	// "auto" is allowed with thinking, but "any" or "tool" (specific tool) are not
+	if toolChoiceType == "any" || toolChoiceType == "tool" {
+		// Remove thinking configuration entirely to avoid API error
+		body, _ = sjson.DeleteBytes(body, "thinking")
+	}
+	return body
+}
+
 // ensureMaxTokensForThinking ensures max_tokens > thinking.budget_tokens when thinking is enabled.
 // Anthropic API requires this constraint; violating it returns a 400 error.
 // This function should be called after all thinking configuration is finalized.
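The `disableThinkingIfToolChoiceForced` rule can be illustrated without the `gjson`/`sjson` dependencies. The sketch below swaps in the standard library's `encoding/json`, so it is an approximation of the behaviour rather than the code in this PR; `dropThinkingIfToolForced` is a hypothetical name, and unlike `sjson` this version re-serializes the payload with sorted top-level keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dropThinkingIfToolForced removes the top-level "thinking" key when
// tool_choice.type is "any" or "tool". Bodies that are not JSON objects
// are returned unchanged.
func dropThinkingIfToolForced(body []byte) []byte {
	var payload map[string]json.RawMessage
	if err := json.Unmarshal(body, &payload); err != nil {
		return body
	}
	var choice struct {
		Type string `json:"type"`
	}
	if raw, ok := payload["tool_choice"]; ok {
		_ = json.Unmarshal(raw, &choice) // a non-object tool_choice leaves Type empty
	}
	// "auto" (or no tool_choice at all) is compatible with thinking.
	if choice.Type != "any" && choice.Type != "tool" {
		return body
	}
	delete(payload, "thinking")
	out, err := json.Marshal(payload)
	if err != nil {
		return body
	}
	return out
}

func main() {
	forced := []byte(`{"model":"m","thinking":{"type":"enabled","budget_tokens":1024},"tool_choice":{"type":"any"}}`)
	auto := []byte(`{"model":"m","thinking":{"type":"enabled","budget_tokens":1024},"tool_choice":{"type":"auto"}}`)
	fmt.Println(string(dropThinkingIfToolForced(forced))) // thinking removed
	fmt.Println(string(dropThinkingIfToolForced(auto)))   // unchanged
}
```

Running this step before `ensureMaxTokensForThinking`, as both executor hunks do, means the budget check never sees a thinking block that is about to be dropped.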
--- a/internal/runtime/executor/codex_executor.go
+++ b/internal/runtime/executor/codex_executor.go
@@ -50,6 +50,16 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
 	defer reporter.trackFailure(ctx, &err)

 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
+	if upstreamModel == "" {
+		upstreamModel = req.Model
+	}
+	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
+		upstreamModel = modelOverride
+	} else if !strings.EqualFold(upstreamModel, req.Model) {
+		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
+			upstreamModel = modelOverride
+		}
+	}

 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
@@ -147,6 +157,16 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
 	defer reporter.trackFailure(ctx, &err)

 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
+	if upstreamModel == "" {
+		upstreamModel = req.Model
+	}
+	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
+		upstreamModel = modelOverride
+	} else if !strings.EqualFold(upstreamModel, req.Model) {
+		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
+			upstreamModel = modelOverride
+		}
+	}

 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
@@ -247,12 +267,22 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au

 func (e *CodexExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
+	if upstreamModel == "" {
+		upstreamModel = req.Model
+	}
+	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
+		upstreamModel = modelOverride
+	} else if !strings.EqualFold(upstreamModel, req.Model) {
+		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
+			upstreamModel = modelOverride
+		}
+	}

 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
 	body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)

-	modelForCounting := req.Model
+	modelForCounting := upstreamModel

 	body = ApplyReasoningEffortMetadata(body, req.Metadata, req.Model, "reasoning.effort", false)
 	body, _ = sjson.SetBytes(body, "model", upstreamModel)
@@ -520,3 +550,87 @@ func codexCreds(a *cliproxyauth.Auth) (apiKey, baseURL string) {
 	}
 	return
 }
+
+func (e *CodexExecutor) resolveUpstreamModel(alias string, auth *cliproxyauth.Auth) string {
+	trimmed := strings.TrimSpace(alias)
+	if trimmed == "" {
+		return ""
+	}
+
+	entry := e.resolveCodexConfig(auth)
+	if entry == nil {
+		return ""
+	}
+
+	normalizedModel, metadata := util.NormalizeThinkingModel(trimmed)
+
+	// Candidate names to match against configured aliases/names.
+	candidates := []string{strings.TrimSpace(normalizedModel)}
+	if !strings.EqualFold(normalizedModel, trimmed) {
+		candidates = append(candidates, trimmed)
+	}
+	if original := util.ResolveOriginalModel(normalizedModel, metadata); original != "" && !strings.EqualFold(original, normalizedModel) {
+		candidates = append(candidates, original)
+	}
+
+	for i := range entry.Models {
+		model := entry.Models[i]
+		name := strings.TrimSpace(model.Name)
+		modelAlias := strings.TrimSpace(model.Alias)
+
+		for _, candidate := range candidates {
+			if candidate == "" {
+				continue
+			}
+			if modelAlias != "" && strings.EqualFold(modelAlias, candidate) {
+				if name != "" {
+					return name
+				}
+				return candidate
+			}
+			if name != "" && strings.EqualFold(name, candidate) {
+				return name
+			}
+		}
+	}
+	return ""
+}
+
+func (e *CodexExecutor) resolveCodexConfig(auth *cliproxyauth.Auth) *config.CodexKey {
+	if auth == nil || e.cfg == nil {
+		return nil
+	}
+	var attrKey, attrBase string
+	if auth.Attributes != nil {
+		attrKey = strings.TrimSpace(auth.Attributes["api_key"])
+		attrBase = strings.TrimSpace(auth.Attributes["base_url"])
+	}
+	for i := range e.cfg.CodexKey {
+		entry := &e.cfg.CodexKey[i]
+		cfgKey := strings.TrimSpace(entry.APIKey)
+		cfgBase := strings.TrimSpace(entry.BaseURL)
+		if attrKey != "" && attrBase != "" {
+			if strings.EqualFold(cfgKey, attrKey) && strings.EqualFold(cfgBase, attrBase) {
+				return entry
+			}
+			continue
+		}
+		if attrKey != "" && strings.EqualFold(cfgKey, attrKey) {
+			if cfgBase == "" || strings.EqualFold(cfgBase, attrBase) {
+				return entry
+			}
+		}
+		if attrKey == "" && attrBase != "" && strings.EqualFold(cfgBase, attrBase) {
+			return entry
+		}
+	}
+	if attrKey != "" {
+		for i := range e.cfg.CodexKey {
+			entry := &e.cfg.CodexKey[i]
+			if strings.EqualFold(strings.TrimSpace(entry.APIKey), attrKey) {
+				return entry
+			}
+		}
+	}
+	return nil
+}
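The matching loop in `resolveUpstreamModel` can be sketched on its own. This is a simplified illustration under stated assumptions: `modelEntry`, `resolveUpstream`, and the entries in `demoModels` are hypothetical, and the thinking-suffix normalization and per-auth config lookup from the diff are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// modelEntry mirrors only the two fields the matching loop reads.
type modelEntry struct {
	Name  string
	Alias string
}

// demoModels holds made-up entries for illustration only.
var demoModels = []modelEntry{
	{Name: "upstream-model-a", Alias: "friendly-a"},
	{Name: "", Alias: "passthrough"},
}

// resolveUpstream returns the configured upstream name for the first entry
// whose alias or name matches any candidate, ignoring case. An alias with an
// empty name resolves to the candidate itself. No match returns "".
func resolveUpstream(models []modelEntry, candidates ...string) string {
	for _, m := range models {
		name := strings.TrimSpace(m.Name)
		alias := strings.TrimSpace(m.Alias)
		for _, c := range candidates {
			c = strings.TrimSpace(c)
			if c == "" {
				continue
			}
			if alias != "" && strings.EqualFold(alias, c) {
				if name != "" {
					return name
				}
				return c
			}
			if name != "" && strings.EqualFold(name, c) {
				return name
			}
		}
	}
	return ""
}

func main() {
	fmt.Println(resolveUpstream(demoModels, "FRIENDLY-A"))  // upstream-model-a
	fmt.Println(resolveUpstream(demoModels, "passthrough")) // passthrough
	fmt.Println(resolveUpstream(demoModels, "unknown") == "")
}
```

Note that entry order takes precedence over candidate order: the outer loop walks the configured models, so an earlier entry matching a later candidate wins.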
--- a/internal/runtime/executor/iflow_executor.go
+++ b/internal/runtime/executor/iflow_executor.go
@@ -67,6 +67,7 @@ func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
 		return resp, errValidate
 	}
 	body = applyIFlowThinkingConfig(body)
+	body = preserveReasoningContentInMessages(body)
 	body = applyPayloadConfig(e.cfg, req.Model, body)

 	endpoint := strings.TrimSuffix(baseURL, "/") + iflowDefaultEndpoint
@@ -159,6 +160,7 @@ func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
 		return nil, errValidate
 	}
 	body = applyIFlowThinkingConfig(body)
+	body = preserveReasoningContentInMessages(body)
 	// Ensure tools array exists to avoid provider quirks similar to Qwen's behaviour.
 	toolsResult := gjson.GetBytes(body, "tools")
 	if toolsResult.Exists() && toolsResult.IsArray() && len(toolsResult.Array()) == 0 {
@@ -445,20 +447,98 @@ func ensureToolsArray(body []byte) []byte {
 	return updated
 }

-// applyIFlowThinkingConfig converts normalized reasoning_effort to iFlow chat_template_kwargs.enable_thinking.
-// This should be called after NormalizeThinkingConfig has processed the payload.
-// iFlow only supports boolean enable_thinking, so any non-"none" effort enables thinking.
-func applyIFlowThinkingConfig(body []byte) []byte {
-	effort := gjson.GetBytes(body, "reasoning_effort")
-	if !effort.Exists() {
+// preserveReasoningContentInMessages checks whether assistant messages in the conversation
+// history carry reasoning_content when the request targets an iFlow thinking model.
+// It does not modify the payload: it only logs when reasoning_content is present, on the
+// assumption that the client has already preserved its reasoning across turns and tool calls.
+//
+// For GLM-4.7 and MiniMax-M2.1, the full assistant response (including reasoning) must be
+// appended back into message history before the next call.
+func preserveReasoningContentInMessages(body []byte) []byte {
+	model := strings.ToLower(gjson.GetBytes(body, "model").String())
+
+	// Only apply to models that support thinking with history preservation
+	needsPreservation := strings.HasPrefix(model, "glm-4.7") ||
+		strings.HasPrefix(model, "glm-4-7") ||
+		strings.HasPrefix(model, "minimax-m2.1") ||
+		strings.HasPrefix(model, "minimax-m2-1")
+
+	if !needsPreservation {
 		return body
 	}

-	val := strings.ToLower(strings.TrimSpace(effort.String()))
-	enableThinking := val != "none" && val != ""
+	messages := gjson.GetBytes(body, "messages")
+	if !messages.Exists() || !messages.IsArray() {
+		return body
+	}

-	body, _ = sjson.DeleteBytes(body, "reasoning_effort")
-	body, _ = sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
+	// Check if any assistant message already has reasoning_content preserved
+	hasReasoningContent := false
+	messages.ForEach(func(_, msg gjson.Result) bool {
+		role := msg.Get("role").String()
+		if role == "assistant" {
+			rc := msg.Get("reasoning_content")
+			if rc.Exists() && rc.String() != "" {
+				hasReasoningContent = true
+				return false // stop iteration
+			}
+		}
+		return true
+	})
+
+	// If reasoning content is already present, the messages are properly formatted
+	// No need to modify - the client has correctly preserved reasoning in history
+	if hasReasoningContent {
+		log.Debugf("iflow executor: reasoning_content found in message history for %s", model)
+	}
+
+	return body
+}
+
+// applyIFlowThinkingConfig converts normalized reasoning_effort to model-specific thinking configurations.
+// This should be called after NormalizeThinkingConfig has processed the payload.
+//
+// Model-specific handling:
+//   - GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false}
+//   - MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation
+//   - Other iFlow models: Uses chat_template_kwargs.enable_thinking (boolean)
+func applyIFlowThinkingConfig(body []byte) []byte {
+	effort := gjson.GetBytes(body, "reasoning_effort")
+	model := strings.ToLower(gjson.GetBytes(body, "model").String())
+
+	// Check if thinking should be enabled
+	val := ""
+	if effort.Exists() {
+		val = strings.ToLower(strings.TrimSpace(effort.String()))
+	}
+	enableThinking := effort.Exists() && val != "none" && val != ""
+
+	// Remove reasoning_effort as we'll convert to model-specific format
+	if effort.Exists() {
+		body, _ = sjson.DeleteBytes(body, "reasoning_effort")
+	}
+
+	// GLM-4.7: Use extra_body with thinking config and clear_thinking: false
+	if strings.HasPrefix(model, "glm-4.7") || strings.HasPrefix(model, "glm-4-7") {
+		if enableThinking {
+			body, _ = sjson.SetBytes(body, "extra_body.thinking.type", "enabled")
+			body, _ = sjson.SetBytes(body, "extra_body.clear_thinking", false)
+		}
+		return body
+	}
+
+	// MiniMax-M2.1: Use reasoning_split=true for interleaved thinking
+	if strings.HasPrefix(model, "minimax-m2.1") || strings.HasPrefix(model, "minimax-m2-1") {
+		if enableThinking {
+			body, _ = sjson.SetBytes(body, "reasoning_split", true)
+		}
+		return body
+	}
+
+	// Other iFlow models (including GLM-4.6): Use chat_template_kwargs.enable_thinking
+	if effort.Exists() {
+		body, _ = sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
+	}

 	return body
 }
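The per-model branching in the new `applyIFlowThinkingConfig` can be approximated with the standard library alone. This is a sketch, not the PR's implementation: `applyThinking` and `hasAnyPrefix` are hypothetical names, `encoding/json` replaces `gjson`/`sjson`, and, unlike `sjson`, it overwrites any existing `extra_body` or `chat_template_kwargs` object instead of merging into it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

func hasAnyPrefix(s string, prefixes ...string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(s, p) {
			return true
		}
	}
	return false
}

// applyThinking converts reasoning_effort into the model-specific field
// described in the diff. Non-object bodies are returned unchanged.
func applyThinking(body []byte) []byte {
	var payload map[string]json.RawMessage
	if err := json.Unmarshal(body, &payload); err != nil || payload == nil {
		return body
	}
	var effort, model string
	rawEffort, hasEffort := payload["reasoning_effort"]
	if hasEffort {
		_ = json.Unmarshal(rawEffort, &effort) // non-string values count as empty here
	}
	_ = json.Unmarshal(payload["model"], &model) // a missing model stays ""
	effort = strings.ToLower(strings.TrimSpace(effort))
	model = strings.ToLower(model)
	enable := hasEffort && effort != "" && effort != "none"
	delete(payload, "reasoning_effort")

	switch {
	case hasAnyPrefix(model, "glm-4.7", "glm-4-7"):
		if enable {
			payload["extra_body"] = json.RawMessage(`{"thinking":{"type":"enabled"},"clear_thinking":false}`)
		}
	case hasAnyPrefix(model, "minimax-m2.1", "minimax-m2-1"):
		if enable {
			payload["reasoning_split"] = json.RawMessage(`true`)
		}
	default:
		// Other models get an explicit boolean whenever an effort was supplied.
		if hasEffort {
			payload["chat_template_kwargs"] = json.RawMessage(fmt.Sprintf(`{"enable_thinking":%t}`, enable))
		}
	}
	out, err := json.Marshal(payload)
	if err != nil {
		return body
	}
	return out
}

func main() {
	for _, in := range []string{
		`{"model":"glm-4.7","reasoning_effort":"high"}`,
		`{"model":"minimax-m2.1","reasoning_effort":"low"}`,
		`{"model":"glm-4.6","reasoning_effort":"none"}`,
	} {
		fmt.Println(string(applyThinking([]byte(in))))
	}
}
```

One asymmetry worth noticing in the diff: for GLM-4.7 and MiniMax-M2.1 an effort of `none` simply removes `reasoning_effort` and sets nothing, whereas other models receive an explicit `enable_thinking: false`.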
--- a/internal/runtime/executor/usage_helpers.go
+++ b/internal/runtime/executor/usage_helpers.go
@@ -19,7 +19,7 @@ type usageReporter struct {
 	provider    string
 	model       string
 	authID      string
-	authIndex   uint64
+	authIndex   string
 	apiKey      string
 	source      string
 	requestedAt time.Time
@@ -275,6 +275,20 @@ func parseClaudeStreamUsage(line []byte) (usage.Detail, bool) {
 	return detail, true
 }

+func parseGeminiFamilyUsageDetail(node gjson.Result) usage.Detail {
+	detail := usage.Detail{
+		InputTokens:     node.Get("promptTokenCount").Int(),
+		OutputTokens:    node.Get("candidatesTokenCount").Int(),
+		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
+		TotalTokens:     node.Get("totalTokenCount").Int(),
+		CachedTokens:    node.Get("cachedContentTokenCount").Int(),
+	}
+	if detail.TotalTokens == 0 {
+		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
+	}
+	return detail
+}
+
 func parseGeminiCLIUsage(data []byte) usage.Detail {
 	usageNode := gjson.ParseBytes(data)
 	node := usageNode.Get("response.usageMetadata")
@@ -284,16 +298,7 @@ func parseGeminiCLIUsage(data []byte) usage.Detail {
 	if !node.Exists() {
 		return usage.Detail{}
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail
+	return parseGeminiFamilyUsageDetail(node)
 }

 func parseGeminiUsage(data []byte) usage.Detail {
@@ -305,16 +310,7 @@ func parseGeminiUsage(data []byte) usage.Detail {
 	if !node.Exists() {
 		return usage.Detail{}
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail
+	return parseGeminiFamilyUsageDetail(node)
 }

 func parseGeminiStreamUsage(line []byte) (usage.Detail, bool) {
@@ -329,16 +325,7 @@ func parseGeminiStreamUsage(line []byte) (usage.Detail, bool) {
 	if !node.Exists() {
 		return usage.Detail{}, false
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail, true
+	return parseGeminiFamilyUsageDetail(node), true
 }

 func parseGeminiCLIStreamUsage(line []byte) (usage.Detail, bool) {
@@ -353,16 +340,7 @@ func parseGeminiCLIStreamUsage(line []byte) (usage.Detail, bool) {
 	if !node.Exists() {
 		return usage.Detail{}, false
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail, true
+	return parseGeminiFamilyUsageDetail(node), true
 }

 func parseAntigravityUsage(data []byte) usage.Detail {
@@ -377,16 +355,7 @@ func parseAntigravityUsage(data []byte) usage.Detail {
 	if !node.Exists() {
 		return usage.Detail{}
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail
+	return parseGeminiFamilyUsageDetail(node)
 }

 func parseAntigravityStreamUsage(line []byte) (usage.Detail, bool) {
@@ -404,16 +373,7 @@ func parseAntigravityStreamUsage(line []byte) (usage.Detail, bool) {
 	if !node.Exists() {
 		return usage.Detail{}, false
 	}
-	detail := usage.Detail{
-		InputTokens:     node.Get("promptTokenCount").Int(),
-		OutputTokens:    node.Get("candidatesTokenCount").Int(),
-		ReasoningTokens: node.Get("thoughtsTokenCount").Int(),
-		TotalTokens:     node.Get("totalTokenCount").Int(),
-	}
-	if detail.TotalTokens == 0 {
-		detail.TotalTokens = detail.InputTokens + detail.OutputTokens + detail.ReasoningTokens
-	}
-	return detail, true
+	return parseGeminiFamilyUsageDetail(node), true
 }

 var stopChunkWithoutUsage sync.Map
@@ -522,12 +482,16 @@ func StripUsageMetadataFromJSON(rawJSON []byte) ([]byte, bool) {
 	cleaned := jsonBytes
 	var changed bool

-	if gjson.GetBytes(cleaned, "usageMetadata").Exists() {
+	if usageMetadata = gjson.GetBytes(cleaned, "usageMetadata"); usageMetadata.Exists() {
+		// Rename usageMetadata to cpaUsageMetadata so the Claude translator can read it for the message_start event
+		cleaned, _ = sjson.SetRawBytes(cleaned, "cpaUsageMetadata", []byte(usageMetadata.Raw))
 		cleaned, _ = sjson.DeleteBytes(cleaned, "usageMetadata")
 		changed = true
 	}

-	if gjson.GetBytes(cleaned, "response.usageMetadata").Exists() {
+	if usageMetadata = gjson.GetBytes(cleaned, "response.usageMetadata"); usageMetadata.Exists() {
+		// Rename response.usageMetadata to response.cpaUsageMetadata so the Claude translator can read it for the message_start event
+		cleaned, _ = sjson.SetRawBytes(cleaned, "response.cpaUsageMetadata", []byte(usageMetadata.Raw))
 		cleaned, _ = sjson.DeleteBytes(cleaned, "response.usageMetadata")
 		changed = true
 	}
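The shared helper `parseGeminiFamilyUsageDetail` boils down to reading five counters and deriving the total when the upstream omits it. A minimal sketch with `encoding/json` in place of `gjson` follows; `usageDetail` and `parseUsage` are hypothetical names standing in for `usage.Detail` and the helper in this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usageDetail is an assumed, reduced version of usage.Detail.
type usageDetail struct {
	Input, Output, Reasoning, Cached, Total int64
}

// parseUsage reads a usageMetadata object. Missing or malformed fields
// are treated as zero, and a zero total is recomputed from its parts.
func parseUsage(raw []byte) usageDetail {
	var node struct {
		Prompt     int64 `json:"promptTokenCount"`
		Candidates int64 `json:"candidatesTokenCount"`
		Thoughts   int64 `json:"thoughtsTokenCount"`
		Total      int64 `json:"totalTokenCount"`
		Cached     int64 `json:"cachedContentTokenCount"`
	}
	_ = json.Unmarshal(raw, &node)
	d := usageDetail{
		Input:     node.Prompt,
		Output:    node.Candidates,
		Reasoning: node.Thoughts,
		Cached:    node.Cached,
		Total:     node.Total,
	}
	if d.Total == 0 {
		d.Total = d.Input + d.Output + d.Reasoning
	}
	return d
}

func main() {
	d := parseUsage([]byte(`{"promptTokenCount":10,"candidatesTokenCount":5,"thoughtsTokenCount":2,"cachedContentTokenCount":4}`))
	fmt.Println(d.Total, d.Cached) // 17 4
}
```

Compared with the six copies it replaces, the only behavioural addition in the diff is the `cachedContentTokenCount` field; cached tokens are not added into the fallback total.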
--- a/internal/translator/antigravity/claude/antigravity_claude_response.go
+++ b/internal/translator/antigravity/claude/antigravity_claude_response.go
@@ -99,6 +99,14 @@ func ConvertAntigravityResponseToClaude(_ context.Context, _ string, originalReq
 		// This follows the Claude Code API specification for streaming message initialization
 		messageStartTemplate := `{"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-3-5-sonnet-20241022", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 0, "output_tokens": 0}}}`

+		// Use cpaUsageMetadata within the message_start event for Claude.
+		if promptTokenCount := gjson.GetBytes(rawJSON, "response.cpaUsageMetadata.promptTokenCount"); promptTokenCount.Exists() {
+			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.usage.input_tokens", promptTokenCount.Int())
+		}
+		if candidatesTokenCount := gjson.GetBytes(rawJSON, "response.cpaUsageMetadata.candidatesTokenCount"); candidatesTokenCount.Exists() {
+			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.usage.output_tokens", candidatesTokenCount.Int())
+		}
+
 		// Override default values with actual response metadata if available from the Gemini CLI response
 		if modelVersionResult := gjson.GetBytes(rawJSON, "response.modelVersion"); modelVersionResult.Exists() {
 			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.model", modelVersionResult.String())
@@ -271,11 +279,11 @@ func ConvertAntigravityResponseToClaude(_ context.Context, _ string, originalReq

 	if usageResult := gjson.GetBytes(rawJSON, "response.usageMetadata"); usageResult.Exists() {
 		params.HasUsageMetadata = true
-		params.PromptTokenCount = usageResult.Get("promptTokenCount").Int()
+		params.CachedTokenCount = usageResult.Get("cachedContentTokenCount").Int()
+		params.PromptTokenCount = usageResult.Get("promptTokenCount").Int() - params.CachedTokenCount
 		params.CandidatesTokenCount = usageResult.Get("candidatesTokenCount").Int()
 		params.ThoughtsTokenCount = usageResult.Get("thoughtsTokenCount").Int()
 		params.TotalTokenCount = usageResult.Get("totalTokenCount").Int()
-		params.CachedTokenCount = usageResult.Get("cachedContentTokenCount").Int()
 		if params.CandidatesTokenCount == 0 && params.TotalTokenCount > 0 {
 			params.CandidatesTokenCount = params.TotalTokenCount - params.PromptTokenCount - params.ThoughtsTokenCount
 			if params.CandidatesTokenCount < 0 {
--- a/internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go
+++ b/internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go
@@ -247,10 +247,30 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
 			} else if role == "assistant" {
 				node := []byte(`{"role":"model","parts":[]}`)
 				p := 0
-				if content.Type == gjson.String {
+				if content.Type == gjson.String && content.String() != "" {
 					node, _ = sjson.SetBytes(node, "parts.-1.text", content.String())
-					out, _ = sjson.SetRawBytes(out, "request.contents.-1", node)
 					p++
+				} else if content.IsArray() {
+					// Assistant multimodal content (e.g. text + image) -> single model content with parts
+					for _, item := range content.Array() {
+						switch item.Get("type").String() {
+						case "text":
+							p++
+						case "image_url":
+							// If the assistant returned an inline data URL, preserve it for history fidelity.
+							imageURL := item.Get("image_url.url").String()
+							if len(imageURL) > 5 { // expect data:...
+								pieces := strings.SplitN(imageURL[5:], ";", 2)
+								if len(pieces) == 2 && len(pieces[1]) > 7 {
+									mime := pieces[0]
+									data := pieces[1][7:]
+									node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".inlineData.mime_type", mime)
+									node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".inlineData.data", data)
+									p++
+								}
+							}
+						}
+					}
 				}

 				// Tool calls -> single model content with functionCall parts
@@ -305,6 +325,8 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
 					if pp > 0 {
 						out, _ = sjson.SetRawBytes(out, "request.contents.-1", toolNode)
 					}
+				} else {
+					out, _ = sjson.SetRawBytes(out, "request.contents.-1", node)
 				}
 			}
 		}
--- a/internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go
+++ b/internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go
@@ -87,15 +87,15 @@ func ConvertAntigravityResponseToOpenAI(_ context.Context, _ string, originalReq

 	// Extract and set usage metadata (token counts).
 	if usageResult := gjson.GetBytes(rawJSON, "response.usageMetadata"); usageResult.Exists() {
+		cachedTokenCount := usageResult.Get("cachedContentTokenCount").Int()
 		if candidatesTokenCountResult := usageResult.Get("candidatesTokenCount"); candidatesTokenCountResult.Exists() {
 			template, _ = sjson.Set(template, "usage.completion_tokens", candidatesTokenCountResult.Int())
 		}
 		if totalTokenCountResult := usageResult.Get("totalTokenCount"); totalTokenCountResult.Exists() {
 			template, _ = sjson.Set(template, "usage.total_tokens", totalTokenCountResult.Int())
 		}
-		promptTokenCount := usageResult.Get("promptTokenCount").Int()
+		promptTokenCount := usageResult.Get("promptTokenCount").Int() - cachedTokenCount
 		thoughtsTokenCount := usageResult.Get("thoughtsTokenCount").Int()
-		cachedTokenCount := usageResult.Get("cachedContentTokenCount").Int()
 		template, _ = sjson.Set(template, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
 		if thoughtsTokenCount > 0 {
 			template, _ = sjson.Set(template, "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)
@@ -181,12 +181,14 @@ func ConvertAntigravityResponseToOpenAI(_ context.Context, _ string, originalReq
 					mimeType = "image/png"
 				}
 				imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
-				imagePayload := `{"image_url":{"url":""},"type":"image_url"}`
-				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				imagesResult := gjson.Get(template, "choices.0.delta.images")
 				if !imagesResult.Exists() || !imagesResult.IsArray() {
 					template, _ = sjson.SetRaw(template, "choices.0.delta.images", `[]`)
 				}
+				imageIndex := len(gjson.Get(template, "choices.0.delta.images").Array())
+				imagePayload := `{"type":"image_url","image_url":{"url":""}}`
+				imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
+				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
 				template, _ = sjson.SetRaw(template, "choices.0.delta.images.-1", imagePayload)
 			}
--- a/internal/translator/claude/openai/chat-completions/claude_openai_response.go
+++ b/internal/translator/claude/openai/chat-completions/claude_openai_response.go
@@ -205,9 +205,12 @@ func ConvertClaudeResponseToOpenAI(_ context.Context, modelName string, original
 		if usage := root.Get("usage"); usage.Exists() {
 			inputTokens := usage.Get("input_tokens").Int()
 			outputTokens := usage.Get("output_tokens").Int()
-			template, _ = sjson.Set(template, "usage.prompt_tokens", inputTokens)
+			cacheReadInputTokens := usage.Get("cache_read_input_tokens").Int()
+			cacheCreationInputTokens := usage.Get("cache_creation_input_tokens").Int()
+			template, _ = sjson.Set(template, "usage.prompt_tokens", inputTokens+cacheCreationInputTokens)
 			template, _ = sjson.Set(template, "usage.completion_tokens", outputTokens)
 			template, _ = sjson.Set(template, "usage.total_tokens", inputTokens+outputTokens)
+			template, _ = sjson.Set(template, "usage.prompt_tokens_details.cached_tokens", cacheReadInputTokens)
 		}
 		return []string{template}

@@ -281,8 +284,6 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 	var messageID string
 	var model string
 	var createdAt int64
-	var inputTokens, outputTokens int64
-	var reasoningTokens int64
 	var stopReason string
 	var contentParts []string
 	var reasoningParts []string
@@ -299,9 +300,6 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 				messageID = message.Get("id").String()
 				model = message.Get("model").String()
 				createdAt = time.Now().Unix()
-				if usage := message.Get("usage"); usage.Exists() {
-					inputTokens = usage.Get("input_tokens").Int()
-				}
 			}

 		case "content_block_start":
@@ -364,11 +362,14 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 				}
 			}
 			if usage := root.Get("usage"); usage.Exists() {
-				outputTokens = usage.Get("output_tokens").Int()
-				// Estimate reasoning tokens from accumulated thinking content
-				if len(reasoningParts) > 0 {
-					reasoningTokens = int64(len(strings.Join(reasoningParts, "")) / 4) // Rough estimation
-				}
+				inputTokens := usage.Get("input_tokens").Int()
+				outputTokens := usage.Get("output_tokens").Int()
+				cacheReadInputTokens := usage.Get("cache_read_input_tokens").Int()
+				cacheCreationInputTokens := usage.Get("cache_creation_input_tokens").Int()
+				out, _ = sjson.Set(out, "usage.prompt_tokens", inputTokens+cacheCreationInputTokens)
+				out, _ = sjson.Set(out, "usage.completion_tokens", outputTokens)
+				out, _ = sjson.Set(out, "usage.total_tokens", inputTokens+outputTokens)
+				out, _ = sjson.Set(out, "usage.prompt_tokens_details.cached_tokens", cacheReadInputTokens)
 			}
 		}
 	}
@@ -427,16 +428,5 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 		out, _ = sjson.Set(out, "choices.0.finish_reason", mapAnthropicStopReasonToOpenAI(stopReason))
 	}

-	// Set usage information including prompt tokens, completion tokens, and total tokens
-	totalTokens := inputTokens + outputTokens
-	out, _ = sjson.Set(out, "usage.prompt_tokens", inputTokens)
-	out, _ = sjson.Set(out, "usage.completion_tokens", outputTokens)
-	out, _ = sjson.Set(out, "usage.total_tokens", totalTokens)
-
-	// Add reasoning tokens to usage details if any reasoning content was processed
-	if reasoningTokens > 0 {
-		out, _ = sjson.Set(out, "usage.completion_tokens_details.reasoning_tokens", reasoningTokens)
-	}
-
 	return out
 }
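The usage arithmetic added to both Claude-to-OpenAI converters can be checked with a small worked example. The sketch below reproduces the mapping as written in this diff, using `encoding/json` instead of `gjson`/`sjson`; `openAIUsage` and `mapClaudeUsage` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openAIUsage holds the four values the converters write into "usage".
type openAIUsage struct {
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
	CachedTokens     int64 // written to usage.prompt_tokens_details.cached_tokens
}

// mapClaudeUsage applies the same sums as the diff: cache creation tokens
// count toward prompt_tokens, cache reads are reported separately, and
// total_tokens is input_tokens + output_tokens.
func mapClaudeUsage(raw []byte) openAIUsage {
	var u struct {
		Input         int64 `json:"input_tokens"`
		Output        int64 `json:"output_tokens"`
		CacheRead     int64 `json:"cache_read_input_tokens"`
		CacheCreation int64 `json:"cache_creation_input_tokens"`
	}
	_ = json.Unmarshal(raw, &u) // missing fields stay zero
	return openAIUsage{
		PromptTokens:     u.Input + u.CacheCreation,
		CompletionTokens: u.Output,
		TotalTokens:      u.Input + u.Output,
		CachedTokens:     u.CacheRead,
	}
}

func main() {
	u := mapClaudeUsage([]byte(`{"input_tokens":100,"output_tokens":20,"cache_read_input_tokens":300,"cache_creation_input_tokens":50}`))
	fmt.Println(u.PromptTokens, u.CompletionTokens, u.TotalTokens, u.CachedTokens) // 150 20 120 300
}
```

With these numbers `total_tokens` (120) is lower than `prompt_tokens + completion_tokens` (170), because the total as written in the diff does not include cache creation tokens.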
--- a/internal/translator/claude/openai/responses/claude_openai-responses_request.go
+++ b/internal/translator/claude/openai/responses/claude_openai-responses_request.go
@@ -114,13 +114,16 @@ func ConvertOpenAIResponsesRequestToClaude(modelName string, inputRawJSON []byte
 					var builder strings.Builder
 					if parts := item.Get("content"); parts.Exists() && parts.IsArray() {
 						parts.ForEach(func(_, part gjson.Result) bool {
-							text := part.Get("text").String()
+							textResult := part.Get("text")
+							text := textResult.String()
 							if builder.Len() > 0 && text != "" {
 								builder.WriteByte('\n')
 							}
 							builder.WriteString(text)
 							return true
 						})
+					} else if parts.Type == gjson.String {
+						builder.WriteString(parts.String())
 					}
 					instructionsText = builder.String()
 					if instructionsText != "" {
@@ -207,6 +210,8 @@ func ConvertOpenAIResponsesRequestToClaude(modelName string, inputRawJSON []byte
 						}
 						return true
 					})
+				} else if parts.Type == gjson.String {
+					textAggregate.WriteString(parts.String())
 				}

 				// Fallback to given role if content types not decisive
--- a/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go
+++ b/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go
@@ -218,8 +218,29 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
 				if content.Type == gjson.String {
 					// Assistant text -> single model content
 					node, _ = sjson.SetBytes(node, "parts.-1.text", content.String())
-					out, _ = sjson.SetRawBytes(out, "request.contents.-1", node)
 					p++
+				} else if content.IsArray() {
+					// Assistant multimodal content (e.g. text + image) -> single model content with parts
+					for _, item := range content.Array() {
+						switch item.Get("type").String() {
+						case "text":
+							node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", item.Get("text").String())
+							p++
+						case "image_url":
+							// If the assistant returned an inline data URL, preserve it for history fidelity.
+							imageURL := item.Get("image_url.url").String()
+							if len(imageURL) > 5 { // expect data:...
+								pieces := strings.SplitN(imageURL[5:], ";", 2)
+								if len(pieces) == 2 && len(pieces[1]) > 7 {
+									mime := pieces[0]
+									data := pieces[1][7:]
+									node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".inlineData.mime_type", mime)
+									node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".inlineData.data", data)
+									p++
+								}
+							}
+						}
+					}
 				}

 				// Tool calls -> single model content with functionCall parts
@@ -260,6 +281,8 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
 					if pp > 0 {
 						out, _ = sjson.SetRawBytes(out, "request.contents.-1", toolNode)
 					}
+				} else {
+					out, _ = sjson.SetRawBytes(out, "request.contents.-1", node)
 				}
 			}
 		}
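Review note: the `image_url` branch above slices the URL by fixed offsets (`imageURL[5:]` for `data:` and `pieces[1][7:]` for `base64,`) without checking that those prefixes are actually present. A minimal sketch of the same split with explicit prefix checks follows. `splitDataURL` is a hypothetical helper name, not something in this repository, and it is deliberately stricter than the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDataURL is a hypothetical helper (not part of this diff). It splits
// "data:<mime>;base64,<payload>" into its MIME type and base64 payload and
// rejects anything that does not carry both literal prefixes.
func splitDataURL(u string) (mime, data string, ok bool) {
	const scheme = "data:"
	const marker = "base64,"
	if !strings.HasPrefix(u, scheme) {
		return "", "", false
	}
	// Same split as the diff: everything before the first ';' is the MIME type.
	pieces := strings.SplitN(u[len(scheme):], ";", 2)
	if len(pieces) != 2 || !strings.HasPrefix(pieces[1], marker) {
		return "", "", false
	}
	data = pieces[1][len(marker):]
	if data == "" {
		return "", "", false
	}
	return pieces[0], data, true
}

func main() {
	mime, data, ok := splitDataURL("data:image/png;base64,AAAA")
	fmt.Println(mime, data, ok) // image/png AAAA true

	_, _, ok = splitDataURL("https://example.com/a.png")
	fmt.Println(ok) // false
}
```

With the offset-based version, a plain `https://` URL that happens to contain a `;` would be sliced as if it were a data URL; the prefix checks avoid that case.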
--- a/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_response.go
+++ b/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_response.go
@@ -170,12 +170,14 @@ func ConvertCliResponseToOpenAI(_ context.Context, _ string, originalRequestRawJ
 					mimeType = "image/png"
 				}
 				imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
-				imagePayload := `{"image_url":{"url":""},"type":"image_url"}`
-				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				imagesResult := gjson.Get(template, "choices.0.delta.images")
 				if !imagesResult.Exists() || !imagesResult.IsArray() {
 					template, _ = sjson.SetRaw(template, "choices.0.delta.images", `[]`)
 				}
+				imageIndex := len(gjson.Get(template, "choices.0.delta.images").Array())
+				imagePayload := `{"type":"image_url","image_url":{"url":""}}`
+				imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
+				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
 				template, _ = sjson.SetRaw(template, "choices.0.delta.images.-1", imagePayload)
 			}
--- a/internal/translator/gemini/openai/chat-completions/gemini_openai_request.go
+++ b/internal/translator/gemini/openai/chat-completions/gemini_openai_request.go
@@ -233,18 +233,15 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
 			} else if role == "assistant" {
 				node := []byte(`{"role":"model","parts":[]}`)
 				p := 0
-
 				if content.Type == gjson.String {
 					// Assistant text -> single model content
 					node, _ = sjson.SetBytes(node, "parts.-1.text", content.String())
-					out, _ = sjson.SetRawBytes(out, "contents.-1", node)
 					p++
 				} else if content.IsArray() {
 					// Assistant multimodal content (e.g. text + image) -> single model content with parts
 					for _, item := range content.Array() {
 						switch item.Get("type").String() {
 						case "text":
-							node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", item.Get("text").String())
 							p++
 						case "image_url":
 							// If the assistant returned an inline data URL, preserve it for history fidelity.
@@ -261,7 +258,6 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
 							}
 						}
 					}
-					out, _ = sjson.SetRawBytes(out, "contents.-1", node)
 				}

 				// Tool calls -> single model content with functionCall parts
@@ -302,6 +298,8 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
 					if pp > 0 {
 						out, _ = sjson.SetRawBytes(out, "contents.-1", toolNode)
 					}
+				} else {
+					out, _ = sjson.SetRawBytes(out, "contents.-1", node)
 				}
 			}
 		}
--- a/internal/translator/gemini/openai/chat-completions/gemini_openai_response.go
+++ b/internal/translator/gemini/openai/chat-completions/gemini_openai_response.go
@@ -89,15 +89,15 @@ func ConvertGeminiResponseToOpenAI(_ context.Context, _ string, originalRequestR

 	// Extract and set usage metadata (token counts).
 	if usageResult := gjson.GetBytes(rawJSON, "usageMetadata"); usageResult.Exists() {
+		cachedTokenCount := usageResult.Get("cachedContentTokenCount").Int()
 		if candidatesTokenCountResult := usageResult.Get("candidatesTokenCount"); candidatesTokenCountResult.Exists() {
 			template, _ = sjson.Set(template, "usage.completion_tokens", candidatesTokenCountResult.Int())
 		}
 		if totalTokenCountResult := usageResult.Get("totalTokenCount"); totalTokenCountResult.Exists() {
 			template, _ = sjson.Set(template, "usage.total_tokens", totalTokenCountResult.Int())
 		}
-		promptTokenCount := usageResult.Get("promptTokenCount").Int()
+		promptTokenCount := usageResult.Get("promptTokenCount").Int() - cachedTokenCount
 		thoughtsTokenCount := usageResult.Get("thoughtsTokenCount").Int()
-		cachedTokenCount := usageResult.Get("cachedContentTokenCount").Int()
 		template, _ = sjson.Set(template, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
 		if thoughtsTokenCount > 0 {
 			template, _ = sjson.Set(template, "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)
@@ -182,12 +182,14 @@ func ConvertGeminiResponseToOpenAI(_ context.Context, _ string, originalRequestR
 					mimeType = "image/png"
 				}
 				imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
-				imagePayload := `{"image_url":{"url":""},"type":"image_url"}`
-				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				imagesResult := gjson.Get(template, "choices.0.delta.images")
 				if !imagesResult.Exists() || !imagesResult.IsArray() {
 					template, _ = sjson.SetRaw(template, "choices.0.delta.images", `[]`)
 				}
+				imageIndex := len(gjson.Get(template, "choices.0.delta.images").Array())
+				imagePayload := `{"type":"image_url","image_url":{"url":""}}`
+				imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
+				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
 				template, _ = sjson.SetRaw(template, "choices.0.delta.images.-1", imagePayload)
 			}
@@ -316,12 +318,14 @@ func ConvertGeminiResponseToOpenAINonStream(_ context.Context, _ string, origina
 					mimeType = "image/png"
 				}
 				imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
-				imagePayload := `{"image_url":{"url":""},"type":"image_url"}`
-				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				imagesResult := gjson.Get(template, "choices.0.message.images")
 				if !imagesResult.Exists() || !imagesResult.IsArray() {
 					template, _ = sjson.SetRaw(template, "choices.0.message.images", `[]`)
 				}
+				imageIndex := len(gjson.Get(template, "choices.0.message.images").Array())
+				imagePayload := `{"type":"image_url","image_url":{"url":""}}`
+				imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
+				imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
 				template, _ = sjson.Set(template, "choices.0.message.role", "assistant")
 				template, _ = sjson.SetRaw(template, "choices.0.message.images.-1", imagePayload)
 			}
--- a/internal/usage/logger_plugin.go
+++ b/internal/usage/logger_plugin.go
@@ -6,6 +6,7 @@ package usage
 import (
 	"context"
 	"fmt"
+	"strings"
 	"sync"
 	"sync/atomic"
 	"time"
@@ -90,7 +91,7 @@ type modelStats struct {
 type RequestDetail struct {
 	Timestamp time.Time  `json:"timestamp"`
 	Source    string     `json:"source"`
-	AuthIndex uint64     `json:"auth_index"`
+	AuthIndex string     `json:"auth_index"`
 	Tokens    TokenStats `json:"tokens"`
 	Failed    bool       `json:"failed"`
 }
@@ -281,6 +282,118 @@ func (s *RequestStatistics) Snapshot() StatisticsSnapshot {
 	return result
 }

+type MergeResult struct {
+	Added   int64 `json:"added"`
+	Skipped int64 `json:"skipped"`
+}
+
+// MergeSnapshot merges an exported statistics snapshot into the current store.
+// Existing data is preserved and duplicate request details are skipped.
+func (s *RequestStatistics) MergeSnapshot(snapshot StatisticsSnapshot) MergeResult {
+	result := MergeResult{}
+	if s == nil {
+		return result
+	}
+
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	seen := make(map[string]struct{})
+	for apiName, stats := range s.apis {
+		if stats == nil {
+			continue
+		}
+		for modelName, modelStatsValue := range stats.Models {
+			if modelStatsValue == nil {
+				continue
+			}
+			for _, detail := range modelStatsValue.Details {
+				seen[dedupKey(apiName, modelName, detail)] = struct{}{}
+			}
+		}
+	}
+
+	for apiName, apiSnapshot := range snapshot.APIs {
+		apiName = strings.TrimSpace(apiName)
+		if apiName == "" {
+			continue
+		}
+		stats, ok := s.apis[apiName]
+		if !ok || stats == nil {
+			stats = &apiStats{Models: make(map[string]*modelStats)}
+			s.apis[apiName] = stats
+		} else if stats.Models == nil {
+			stats.Models = make(map[string]*modelStats)
+		}
+		for modelName, modelSnapshot := range apiSnapshot.Models {
+			modelName = strings.TrimSpace(modelName)
+			if modelName == "" {
+				modelName = "unknown"
+			}
+			for _, detail := range modelSnapshot.Details {
+				detail.Tokens = normaliseTokenStats(detail.Tokens)
+				if detail.Timestamp.IsZero() {
+					detail.Timestamp = time.Now()
+				}
+				key := dedupKey(apiName, modelName, detail)
+				if _, exists := seen[key]; exists {
+					result.Skipped++
+					continue
+				}
+				seen[key] = struct{}{}
+				s.recordImported(apiName, modelName, stats, detail)
+				result.Added++
+			}
+		}
+	}
+
+	return result
+}
+
+func (s *RequestStatistics) recordImported(apiName, modelName string, stats *apiStats, detail RequestDetail) {
+	totalTokens := detail.Tokens.TotalTokens
+	if totalTokens < 0 {
+		totalTokens = 0
+	}
+
+	s.totalRequests++
+	if detail.Failed {
+		s.failureCount++
+	} else {
+		s.successCount++
+	}
+	s.totalTokens += totalTokens
+
+	s.updateAPIStats(stats, modelName, detail)
+
+	dayKey := detail.Timestamp.Format("2006-01-02")
+	hourKey := detail.Timestamp.Hour()
+
+	s.requestsByDay[dayKey]++
+	s.requestsByHour[hourKey]++
+	s.tokensByDay[dayKey] += totalTokens
+	s.tokensByHour[hourKey] += totalTokens
+}
+
+func dedupKey(apiName, modelName string, detail RequestDetail) string {
+	timestamp := detail.Timestamp.UTC().Format(time.RFC3339Nano)
+	tokens := normaliseTokenStats(detail.Tokens)
+	return fmt.Sprintf(
+		"%s|%s|%s|%s|%s|%t|%d|%d|%d|%d|%d",
+		apiName,
+		modelName,
+		timestamp,
+		detail.Source,
+		detail.AuthIndex,
+		detail.Failed,
+		tokens.InputTokens,
+		tokens.OutputTokens,
+		tokens.ReasoningTokens,
+		tokens.CachedTokens,
+		tokens.TotalTokens,
+	)
+}
+
 func resolveAPIIdentifier(ctx context.Context, record coreusage.Record) string {
 	if ctx != nil {
 		if ginCtx, ok := ctx.Value("gin").(*gin.Context); ok && ginCtx != nil {
@@ -340,6 +453,16 @@ func normaliseDetail(detail coreusage.Detail) TokenStats {
 	return tokens
 }

+func normaliseTokenStats(tokens TokenStats) TokenStats {
+	if tokens.TotalTokens == 0 {
+		tokens.TotalTokens = tokens.InputTokens + tokens.OutputTokens + tokens.ReasoningTokens
+	}
+	if tokens.TotalTokens == 0 {
+		tokens.TotalTokens = tokens.InputTokens + tokens.OutputTokens + tokens.ReasoningTokens + tokens.CachedTokens
+	}
+	return tokens
+}
+
 func formatHour(hour int) string {
 	if hour < 0 {
 		hour = 0
--- a/internal/watcher/diff/model_hash.go
+++ b/internal/watcher/diff/model_hash.go
@@ -56,6 +56,21 @@ func ComputeClaudeModelsHash(models []config.ClaudeModel) string {
 	return hashJoined(keys)
 }

+// ComputeCodexModelsHash returns a stable hash for Codex model aliases.
+func ComputeCodexModelsHash(models []config.CodexModel) string {
+	keys := normalizeModelPairs(func(out func(key string)) {
+		for _, model := range models {
+			name := strings.TrimSpace(model.Name)
+			alias := strings.TrimSpace(model.Alias)
+			if name == "" && alias == "" {
+				continue
+			}
+			out(strings.ToLower(name) + "|" + strings.ToLower(alias))
+		}
+	})
+	return hashJoined(keys)
+}
+
 // ComputeExcludedModelsHash returns a normalized hash for excluded model lists.
 func ComputeExcludedModelsHash(excluded []string) string {
 	if len(excluded) == 0 {
--- a/internal/watcher/diff/model_hash_test.go
+++ b/internal/watcher/diff/model_hash_test.go
@@ -81,6 +81,15 @@ func TestComputeClaudeModelsHash_Empty(t *testing.T) {
 	}
 }

+func TestComputeCodexModelsHash_Empty(t *testing.T) {
+	if got := ComputeCodexModelsHash(nil); got != "" {
+		t.Fatalf("expected empty hash for nil models, got %q", got)
+	}
+	if got := ComputeCodexModelsHash([]config.CodexModel{}); got != "" {
+		t.Fatalf("expected empty hash for empty slice, got %q", got)
+	}
+}
+
 func TestComputeClaudeModelsHash_IgnoresBlankAndDedup(t *testing.T) {
 	a := []config.ClaudeModel{
 		{Name: "m1", Alias: "a1"},
@@ -95,6 +104,20 @@ func TestComputeClaudeModelsHash_IgnoresBlankAndDedup(t *testing.T) {
 	}
 }

+func TestComputeCodexModelsHash_IgnoresBlankAndDedup(t *testing.T) {
+	a := []config.CodexModel{
+		{Name: "m1", Alias: "a1"},
+		{Name: " "},
+		{Name: "M1", Alias: "A1"},
+	}
+	b := []config.CodexModel{
+		{Name: "m1", Alias: "a1"},
+	}
+	if h1, h2 := ComputeCodexModelsHash(a), ComputeCodexModelsHash(b); h1 == "" || h1 != h2 {
+		t.Fatalf("expected same hash ignoring blanks/dupes, got %q / %q", h1, h2)
+	}
+}
+
 func TestComputeExcludedModelsHash_Normalizes(t *testing.T) {
 	hash1 := ComputeExcludedModelsHash([]string{" A ", "b", "a"})
 	hash2 := ComputeExcludedModelsHash([]string{"a", " b", "A"})
@@ -157,3 +180,15 @@ func TestComputeClaudeModelsHash_Deterministic(t *testing.T) {
 		t.Fatalf("expected different hash when models change, got %s", h3)
 	}
 }
+
+func TestComputeCodexModelsHash_Deterministic(t *testing.T) {
+	models := []config.CodexModel{{Name: "a", Alias: "A"}, {Name: "b"}}
+	h1 := ComputeCodexModelsHash(models)
+	h2 := ComputeCodexModelsHash(models)
+	if h1 == "" || h1 != h2 {
+		t.Fatalf("expected deterministic hash, got %s / %s", h1, h2)
+	}
+	if h3 := ComputeCodexModelsHash([]config.CodexModel{{Name: "a"}}); h3 == h1 {
+		t.Fatalf("expected different hash when models change, got %s", h3)
+	}
+}
--- a/internal/watcher/synthesizer/config.go
+++ b/internal/watcher/synthesizer/config.go
@@ -147,6 +147,9 @@ func (s *ConfigSynthesizer) synthesizeCodexKeys(ctx *SynthesisContext) []*coreau
 		if ck.BaseURL != "" {
 			attrs["base_url"] = ck.BaseURL
 		}
+		if hash := diff.ComputeCodexModelsHash(ck.Models); hash != "" {
+			attrs["models_hash"] = hash
+		}
 		addConfigHeadersToAttrs(ck.Headers, attrs)
 		proxyURL := strings.TrimSpace(ck.ProxyURL)
 		a := &coreauth.Auth{
--- a/sdk/api/handlers/handlers.go
+++ b/sdk/api/handlers/handlers.go
@@ -14,6 +14,7 @@ import (
 	"github.com/gin-gonic/gin"
 	"github.com/google/uuid"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
 	coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
 	coreexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
@@ -103,8 +104,8 @@ func BuildErrorResponseBody(status int, errText string) []byte {
 // Returning 0 disables keep-alives (default when unset).
 func StreamingKeepAliveInterval(cfg *config.SDKConfig) time.Duration {
 	seconds := defaultStreamingKeepAliveSeconds
-	if cfg != nil && cfg.Streaming.KeepAliveSeconds != nil {
-		seconds = *cfg.Streaming.KeepAliveSeconds
+	if cfg != nil {
+		seconds = cfg.Streaming.KeepAliveSeconds
 	}
 	if seconds <= 0 {
 		return 0
@@ -115,8 +116,8 @@ func StreamingKeepAliveInterval(cfg *config.SDKConfig) time.Duration {
 // StreamingBootstrapRetries returns how many times a streaming request may be retried before any bytes are sent.
 func StreamingBootstrapRetries(cfg *config.SDKConfig) int {
 	retries := defaultStreamingBootstrapRetries
-	if cfg != nil && cfg.Streaming.BootstrapRetries != nil {
-		retries = *cfg.Streaming.BootstrapRetries
+	if cfg != nil {
+		retries = cfg.Streaming.BootstrapRetries
 	}
 	if retries < 0 {
 		retries = 0
@@ -216,13 +217,39 @@ func (h *BaseAPIHandler) GetAlt(c *gin.Context) string {
 // Parameters:
 //   - handler: The API handler associated with the request.
 //   - c: The Gin context of the current request.
-//   - ctx: The parent context.
+//   - ctx: The parent context (caller values/deadlines are preserved; request context adds cancellation and request ID).
 //
 // Returns:
 //   - context.Context: The new context with cancellation and embedded values.
 //   - APIHandlerCancelFunc: A function to cancel the context and log the response.
 func (h *BaseAPIHandler) GetContextWithCancel(handler interfaces.APIHandler, c *gin.Context, ctx context.Context) (context.Context, APIHandlerCancelFunc) {
-	newCtx, cancel := context.WithCancel(ctx)
+	parentCtx := ctx
+	if parentCtx == nil {
+		parentCtx = context.Background()
+	}
+
+	var requestCtx context.Context
+	if c != nil && c.Request != nil {
+		requestCtx = c.Request.Context()
+	}
+
+	if requestCtx != nil && logging.GetRequestID(parentCtx) == "" {
+		if requestID := logging.GetRequestID(requestCtx); requestID != "" {
+			parentCtx = logging.WithRequestID(parentCtx, requestID)
+		} else if requestID := logging.GetGinRequestID(c); requestID != "" {
+			parentCtx = logging.WithRequestID(parentCtx, requestID)
+		}
+	}
+	newCtx, cancel := context.WithCancel(parentCtx)
+	if requestCtx != nil && requestCtx != parentCtx {
+		go func() {
+			select {
+			case <-requestCtx.Done():
+				cancel()
+			case <-newCtx.Done():
+			}
+		}()
+	}
 	newCtx = context.WithValue(newCtx, "gin", c)
 	newCtx = context.WithValue(newCtx, "handler", handler)
 	return newCtx, func(params ...interface{}) {
--- a/sdk/api/handlers/handlers_stream_bootstrap_test.go
+++ b/sdk/api/handlers/handlers_stream_bootstrap_test.go
@@ -94,12 +94,11 @@ func TestExecuteStreamWithAuthManager_RetriesBeforeFirstByte(t *testing.T) {
 		registry.GetGlobalRegistry().UnregisterClient(auth2.ID)
 	})

-	bootstrapRetries := 1
 	handler := NewBaseAPIHandlers(&sdkconfig.SDKConfig{
 		Streaming: sdkconfig.StreamingConfig{
-			BootstrapRetries: &bootstrapRetries,
+			BootstrapRetries: 1,
 		},
-	}, manager, nil)
+	}, manager)
 	dataChan, errChan := handler.ExecuteStreamWithAuthManager(context.Background(), "openai", "test-model", []byte(`{"model":"test-model"}`), "")
 	if dataChan == nil || errChan == nil {
 		t.Fatalf("expected non-nil channels")
--- a/sdk/api/management.go
+++ b/sdk/api/management.go
@@ -0,0 +1,62 @@
+// Package api exposes helpers for embedding CLIProxyAPI.
+//
+// It wraps internal management handler types so external projects can integrate
+// management endpoints without importing internal packages.
+package api
+
+import (
+	"github.com/gin-gonic/gin"
+	internalmanagement "github.com/router-for-me/CLIProxyAPI/v6/internal/api/handlers/management"
+	coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
+	"github.com/router-for-me/CLIProxyAPI/v6/sdk/config"
+)
+
+// ManagementTokenRequester exposes a limited subset of management endpoints for requesting tokens.
+type ManagementTokenRequester interface {
+	RequestAnthropicToken(*gin.Context)
+	RequestGeminiCLIToken(*gin.Context)
+	RequestCodexToken(*gin.Context)
+	RequestAntigravityToken(*gin.Context)
+	RequestQwenToken(*gin.Context)
+	RequestIFlowToken(*gin.Context)
+	RequestIFlowCookieToken(*gin.Context)
+}
+
+type managementTokenRequester struct {
+	handler *internalmanagement.Handler
+}
+
+// NewManagementTokenRequester creates a limited management handler exposing only token request endpoints.
+func NewManagementTokenRequester(cfg *config.Config, manager *coreauth.Manager) ManagementTokenRequester {
+	return &managementTokenRequester{
+		handler: internalmanagement.NewHandlerWithoutConfigFilePath(cfg, manager),
+	}
+}
+
+func (m *managementTokenRequester) RequestAnthropicToken(c *gin.Context) {
+	m.handler.RequestAnthropicToken(c)
+}
+
+func (m *managementTokenRequester) RequestGeminiCLIToken(c *gin.Context) {
+	m.handler.RequestGeminiCLIToken(c)
+}
+
+func (m *managementTokenRequester) RequestCodexToken(c *gin.Context) {
+	m.handler.RequestCodexToken(c)
+}
+
+func (m *managementTokenRequester) RequestAntigravityToken(c *gin.Context) {
+	m.handler.RequestAntigravityToken(c)
+}
+
+func (m *managementTokenRequester) RequestQwenToken(c *gin.Context) {
+	m.handler.RequestQwenToken(c)
+}
+
+func (m *managementTokenRequester) RequestIFlowToken(c *gin.Context) {
+	m.handler.RequestIFlowToken(c)
+}
+
+func (m *managementTokenRequester) RequestIFlowCookieToken(c *gin.Context) {
+	m.handler.RequestIFlowCookieToken(c)
+}
--- a/sdk/cliproxy/auth/conductor.go
+++ b/sdk/cliproxy/auth/conductor.go
@@ -12,6 +12,7 @@ import (
 	"time"

 	"github.com/google/uuid"
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
 	cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
@@ -202,10 +203,10 @@ func (m *Manager) Register(ctx context.Context, auth *Auth) (*Auth, error) {
 	if auth == nil {
 		return nil, nil
 	}
-	auth.EnsureIndex()
 	if auth.ID == "" {
 		auth.ID = uuid.NewString()
 	}
+	auth.EnsureIndex()
 	m.mu.Lock()
 	m.auths[auth.ID] = auth.Clone()
 	m.mu.Unlock()
@@ -220,7 +221,7 @@ func (m *Manager) Update(ctx context.Context, auth *Auth) (*Auth, error) {
 		return nil, nil
 	}
 	m.mu.Lock()
-	if existing, ok := m.auths[auth.ID]; ok && existing != nil && !auth.indexAssigned && auth.Index == 0 {
+	if existing, ok := m.auths[auth.ID]; ok && existing != nil && !auth.indexAssigned && auth.Index == "" {
 		auth.Index = existing.Index
 		auth.indexAssigned = existing.indexAssigned
 	}
@@ -262,7 +263,6 @@ func (m *Manager) Execute(ctx context.Context, providers []string, req cliproxye
 		return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
-	defer m.advanceProviderCursor(req.Model, normalized)

 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -301,7 +301,6 @@ func (m *Manager) ExecuteCount(ctx context.Context, providers []string, req clip
 		return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
-	defer m.advanceProviderCursor(req.Model, normalized)

 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -340,7 +339,6 @@ func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cli
 		return nil, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
-	defer m.advanceProviderCursor(req.Model, normalized)

 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -389,17 +387,18 @@ func (m *Manager) executeWithProvider(ctx context.Context, provider string, req

 		accountType, accountInfo := auth.AccountInfo()
 		proxyInfo := auth.ProxyInfo()
+		entry := logEntryWithRequestID(ctx)
 		if accountType == "api_key" {
 			if proxyInfo != "" {
-				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+				entry.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+				entry.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
 			}
 		} else if accountType == "oauth" {
 			if proxyInfo != "" {
-				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+				entry.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+				entry.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
 			}
 		}

@@ -449,17 +448,18 @@ func (m *Manager) executeCountWithProvider(ctx context.Context, provider string,

 		accountType, accountInfo := auth.AccountInfo()
 		proxyInfo := auth.ProxyInfo()
+		entry := logEntryWithRequestID(ctx)
 		if accountType == "api_key" {
 			if proxyInfo != "" {
-				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+				entry.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+				entry.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
 			}
 		} else if accountType == "oauth" {
 			if proxyInfo != "" {
-				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+				entry.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+				entry.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
 			}
 		}

@@ -509,17 +509,18 @@ func (m *Manager) executeStreamWithProvider(ctx context.Context, provider string

 		accountType, accountInfo := auth.AccountInfo()
 		proxyInfo := auth.ProxyInfo()
+		entry := logEntryWithRequestID(ctx)
 		if accountType == "api_key" {
 			if proxyInfo != "" {
-				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+				entry.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+				entry.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
 			}
 		} else if accountType == "oauth" {
 			if proxyInfo != "" {
-				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+				entry.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
 			} else {
-				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+				entry.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
 			}
 		}

@@ -636,13 +637,20 @@ func (m *Manager) normalizeProviders(providers []string) []string {
 	return result
 }

+// rotateProviders returns a rotated view of the providers list starting from the
+// current offset for the model, and atomically increments the offset for the next call.
+// This ensures concurrent requests get different starting providers.
 func (m *Manager) rotateProviders(model string, providers []string) []string {
 	if len(providers) == 0 {
 		return nil
 	}
-	m.mu.RLock()
+
+	// Atomic read-and-increment: get current offset and advance cursor in one lock
+	m.mu.Lock()
 	offset := m.providerOffsets[model]
-	m.mu.RUnlock()
+	m.providerOffsets[model] = (offset + 1) % len(providers)
+	m.mu.Unlock()
+
 	if len(providers) > 0 {
 		offset %= len(providers)
 	}
@@ -658,19 +666,6 @@ func (m *Manager) rotateProviders(model string, providers []string) []string {
 	return rotated
 }

-func (m *Manager) advanceProviderCursor(model string, providers []string) {
-	if len(providers) == 0 {
-		m.mu.Lock()
-		delete(m.providerOffsets, model)
-		m.mu.Unlock()
-		return
-	}
-	m.mu.Lock()
-	current := m.providerOffsets[model]
-	m.providerOffsets[model] = (current + 1) % len(providers)
-	m.mu.Unlock()
-}
-
 func (m *Manager) retrySettings() (int, time.Duration) {
 	if m == nil {
 		return 0, 0
@@ -1604,6 +1599,17 @@ type RequestPreparer interface {
 	PrepareRequest(req *http.Request, auth *Auth) error
 }

+// logEntryWithRequestID returns a logrus entry with request_id field if available in context.
+func logEntryWithRequestID(ctx context.Context) *log.Entry {
+	if ctx == nil {
+		return log.NewEntry(log.StandardLogger())
+	}
+	if reqID := logging.GetRequestID(ctx); reqID != "" {
+		return log.WithField("request_id", reqID)
+	}
+	return log.NewEntry(log.StandardLogger())
+}
+
 // InjectCredentials delegates per-provider HTTP request preparation when supported.
 // If the registered executor for the auth provider implements RequestPreparer,
 // it will be invoked to modify the request (e.g., add headers).
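Review note: the `rotateProviders` change replaces a read under `RLock` plus a deferred `advanceProviderCursor` with a single read-and-increment under one write lock, so two concurrent requests can no longer observe the same offset. A minimal standalone sketch of that pattern follows; the `rotator` type and its methods are invented for illustration and are not the `Manager` API.

```go
package main

import (
	"fmt"
	"sync"
)

// rotator is a simplified stand-in for the per-model cursor kept by Manager.
type rotator struct {
	mu      sync.Mutex
	offsets map[string]int
}

func newRotator() *rotator {
	return &rotator{offsets: make(map[string]int)}
}

// rotate returns providers starting at the current offset for model and
// advances the offset in the same critical section.
func (r *rotator) rotate(model string, providers []string) []string {
	if len(providers) == 0 {
		return nil
	}
	r.mu.Lock()
	// Clamp first in case the provider list shrank since the last call.
	offset := r.offsets[model] % len(providers)
	r.offsets[model] = (offset + 1) % len(providers)
	r.mu.Unlock()

	rotated := make([]string, 0, len(providers))
	rotated = append(rotated, providers[offset:]...)
	rotated = append(rotated, providers[:offset]...)
	return rotated
}

func main() {
	r := newRotator()
	providers := []string{"a", "b", "c"}
	fmt.Println(r.rotate("m", providers)) // [a b c]
	fmt.Println(r.rotate("m", providers)) // [b c a]
}
```

Note that the cursor now advances when a request starts rather than when it finishes, which is what lets concurrent requests fan out across providers.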
--- a/sdk/cliproxy/auth/types.go
+++ b/sdk/cliproxy/auth/types.go
@@ -1,11 +1,12 @@
 package auth

 import (
+	"crypto/sha256"
+	"encoding/hex"
 	"encoding/json"
 	"strconv"
 	"strings"
 	"sync"
-	"sync/atomic"
 	"time"

 	baseauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth"
@@ -15,8 +16,8 @@ import (
 type Auth struct {
 	// ID uniquely identifies the auth record across restarts.
 	ID string `json:"id"`
-	// Index is a monotonically increasing runtime identifier used for diagnostics.
-	Index uint64 `json:"-"`
+	// Index is a stable runtime identifier derived from auth metadata (not persisted).
+	Index string `json:"-"`
 	// Provider is the upstream provider key (e.g. "gemini", "claude").
 	Provider string `json:"provider"`
 	// Prefix optionally namespaces models for routing (e.g., "teamA/gemini-3-pro-preview").
@@ -94,12 +95,6 @@ type ModelState struct {
 	UpdatedAt time.Time `json:"updated_at"`
 }

-var authIndexCounter atomic.Uint64
-
-func nextAuthIndex() uint64 {
-	return authIndexCounter.Add(1) - 1
-}
-
 // Clone shallow copies the Auth structure, duplicating maps to avoid accidental mutation.
 func (a *Auth) Clone() *Auth {
 	if a == nil {
@@ -128,15 +123,41 @@ func (a *Auth) Clone() *Auth {
 	return &copyAuth
 }

-// EnsureIndex returns the global index, assigning one if it was not set yet.
-func (a *Auth) EnsureIndex() uint64 {
-	if a == nil {
-		return 0
+func stableAuthIndex(seed string) string {
+	seed = strings.TrimSpace(seed)
+	if seed == "" {
+		return ""
 	}
-	if a.indexAssigned {
+	sum := sha256.Sum256([]byte(seed))
+	return hex.EncodeToString(sum[:8])
+}
+
+// EnsureIndex returns a stable index derived from the auth file name or API key.
+func (a *Auth) EnsureIndex() string {
+	if a == nil {
+		return ""
+	}
+	if a.indexAssigned && a.Index != "" {
 		return a.Index
 	}
-	idx := nextAuthIndex()
+
+	seed := strings.TrimSpace(a.FileName)
+	if seed != "" {
+		seed = "file:" + seed
+	} else if a.Attributes != nil {
+		if apiKey := strings.TrimSpace(a.Attributes["api_key"]); apiKey != "" {
+			seed = "api_key:" + apiKey
+		}
+	}
+	if seed == "" {
+		if id := strings.TrimSpace(a.ID); id != "" {
+			seed = "id:" + id
+		} else {
+			return ""
+		}
+	}
+
+	idx := stableAuthIndex(seed)
 	a.Index = idx
 	a.indexAssigned = true
 	return idx
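
The index derivation in the hunk above can be tried outside the repository. This is a minimal standalone sketch, not the project's code: `stableIndex` reproduces the hashing step shown in the diff (trim, SHA-256, hex of the first 8 bytes), while `pickSeed` is a hypothetical helper that flattens the file name, API key, then ID precedence into plain string arguments instead of reading an `Auth` struct. The sample values are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// stableIndex trims the seed, hashes it with SHA-256, and hex-encodes the
// first 8 bytes of the digest, giving a 16-character identifier.
// An empty or whitespace-only seed yields an empty index.
func stableIndex(seed string) string {
	seed = strings.TrimSpace(seed)
	if seed == "" {
		return ""
	}
	sum := sha256.Sum256([]byte(seed))
	return hex.EncodeToString(sum[:8])
}

// pickSeed is a hypothetical helper (not in the diff) that applies the same
// precedence as EnsureIndex: file name first, then API key, then ID.
// Each source gets its own prefix so equal strings from different sources
// do not collide.
func pickSeed(fileName, apiKey, id string) string {
	if f := strings.TrimSpace(fileName); f != "" {
		return "file:" + f
	}
	if k := strings.TrimSpace(apiKey); k != "" {
		return "api_key:" + k
	}
	if i := strings.TrimSpace(id); i != "" {
		return "id:" + i
	}
	return ""
}

func main() {
	// Same file name, different IDs: the file name wins, so the index matches.
	a := stableIndex(pickSeed("account.json", "", "auth-1"))
	b := stableIndex(pickSeed("account.json", "", "auth-2"))
	fmt.Println(len(a), a == b)
}
```

Because the index depends only on the seed, it survives restarts, unlike the removed atomic counter, whose values depended on load order.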
--- a/sdk/cliproxy/service.go
+++ b/sdk/cliproxy/service.go
@@ -741,6 +741,9 @@ func (s *Service) registerModelsForAuth(a *coreauth.Auth) {
 	case "codex":
 		models = registry.GetOpenAIModels()
 		if entry := s.resolveConfigCodexKey(a); entry != nil {
+			if len(entry.Models) > 0 {
+				models = buildCodexConfigModels(entry)
+			}
 			if authKind == "apikey" {
 				excluded = entry.ExcludedModels
 			}
@@ -1179,3 +1182,41 @@ func buildClaudeConfigModels(entry *config.ClaudeKey) []*ModelInfo {
 	}
 	return out
 }
+
+func buildCodexConfigModels(entry *config.CodexKey) []*ModelInfo {
+	if entry == nil || len(entry.Models) == 0 {
+		return nil
+	}
+	now := time.Now().Unix()
+	out := make([]*ModelInfo, 0, len(entry.Models))
+	seen := make(map[string]struct{}, len(entry.Models))
+	for i := range entry.Models {
+		model := entry.Models[i]
+		name := strings.TrimSpace(model.Name)
+		alias := strings.TrimSpace(model.Alias)
+		if alias == "" {
+			alias = name
+		}
+		if alias == "" {
+			continue
+		}
+		key := strings.ToLower(alias)
+		if _, exists := seen[key]; exists {
+			continue
+		}
+		seen[key] = struct{}{}
+		display := name
+		if display == "" {
+			display = alias
+		}
+		out = append(out, &ModelInfo{
+			ID:          alias,
+			Object:      "model",
+			Created:     now,
+			OwnedBy:     "openai",
+			Type:        "openai",
+			DisplayName: display,
+		})
+	}
+	return out
+}
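
The alias handling in `buildCodexConfigModels` above reduces to a small, testable rule: fall back to the name when the alias is blank, skip entries with neither, and keep only the first entry per case-insensitive alias. The sketch below isolates that rule. `modelEntry` and `dedupeAliases` are hypothetical names introduced for illustration, and the model names are made up; the real function builds `ModelInfo` values rather than plain strings.

```go
package main

import (
	"fmt"
	"strings"
)

// modelEntry is a hypothetical stand-in for one configured model:
// an upstream name plus an optional alias.
type modelEntry struct {
	Name  string
	Alias string
}

// dedupeAliases returns the IDs that would be exposed, in input order.
// The alias falls back to the name, blank entries are skipped, and
// duplicates are detected case-insensitively (first occurrence wins).
func dedupeAliases(entries []modelEntry) []string {
	out := make([]string, 0, len(entries))
	seen := make(map[string]struct{}, len(entries))
	for _, e := range entries {
		alias := strings.TrimSpace(e.Alias)
		if alias == "" {
			alias = strings.TrimSpace(e.Name)
		}
		if alias == "" {
			continue
		}
		key := strings.ToLower(alias)
		if _, exists := seen[key]; exists {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, alias)
	}
	return out
}

func main() {
	ids := dedupeAliases([]modelEntry{
		{Name: "model-a", Alias: "fast"},
		{Name: "model-b", Alias: "FAST"}, // duplicate of "fast", dropped
		{Name: "model-b"},                // no alias, name is used
		{Name: "  "},                     // blank, skipped
	})
	fmt.Println(ids) // prints [fast model-b]
}
```

The first entry keeps its original casing in the output; only the lookup key is lowercased.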
--- a/sdk/cliproxy/usage/manager.go
+++ b/sdk/cliproxy/usage/manager.go
@@ -14,7 +14,7 @@ type Record struct {
 	Model       string
 	APIKey      string
 	AuthID      string
-	AuthIndex   uint64
+	AuthIndex   string
 	Source      string
 	RequestedAt time.Time
 	Failed      bool
Author	SHA1	Message	Date
Luis Pater	ee171bc563	feat(api): add ManagementTokenRequester interface for management token request endpoints	2025-12-29 02:42:29 +08:00
Luis Pater	457924828a	Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict Fix: disable thinking when tool_choice forces tool use	2025-12-28 14:04:30 +08:00
Ben Vargas	aca2ef6359	Fix: disable thinking when tool_choice forces tool use Anthropic API does not allow extended thinking when tool_choice is set to "any" or a specific tool. This was causing 400 errors when using features like Amp's /handoff command which forces tool_choice. Added disableThinkingIfToolChoiceForced() that removes thinking config when incompatible tool_choice is detected, applied to both streaming and non-streaming paths. Fixes router-for-me/CLIProxyAPI#630	2025-12-27 16:31:37 -07:00
Luis Pater	ade7194792	feat(management): add generic API call handler to management endpoints	2025-12-28 04:40:32 +08:00
Luis Pater	3a436e116a	feat(cliproxy): implement model aliasing and hashing for Codex configurations, enhance request routing logic, and normalize Codex model entries	2025-12-28 03:06:51 +08:00
Luis Pater	336867853b	Merge pull request #756 from leaph/check-ai-thinking-settings feat(iflow): add model-specific thinking configs for GLM-4.7 and Mini…	2025-12-28 02:08:27 +08:00
leaph	6403ff4ec4	feat(iflow): add model-specific thinking configs for GLM-4.7 and MiniMax-M2.1 - GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false} - MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation - Added preserveReasoningContentInMessages() to support re-injection of reasoning content in assistant message history for multi-turn conversations - Added ThinkingSupport to MiniMax-M2.1 model definition	2025-12-27 18:39:15 +01:00
Luis Pater	d222469b44	Update issue templates	2025-12-28 01:22:42 +08:00
Luis Pater	7646a2b877	Fixed: #749 fix(translators): ensure `gjson.String` content is non-empty before setting `parts` in OpenAI request logic	2025-12-28 00:54:26 +08:00
Luis Pater	62090f2568	Merge pull request #750 from router-for-me/config fix(config): preserve original config structure and avoid default value pollution	2025-12-27 22:10:01 +08:00
Luis Pater	c281f4cbaf	Fixed: #747 fix(translators): rename and integrate `usageMetadata` as `cpaUsageMetadata` in Claude processing logic	2025-12-27 22:02:11 +08:00
hkfires	09455f9e85	fix(config): make streaming keepalive and retries ints	2025-12-27 20:56:47 +08:00
hkfires	c8e72ba0dc	fix(config): smart merge writes non-default new keys only	2025-12-27 20:28:54 +08:00
hkfires	375ef252ab	docs(config): clarify merge mapping behavior	2025-12-27 19:30:21 +08:00
hkfires	ee552f8720	chore(config): update ignore patterns	2025-12-27 19:13:14 +08:00
hkfires	2e88c4858e	fix(config): avoid adding new keys when merging	2025-12-27 19:00:47 +08:00
Luis Pater	3f50da85c1	Merge pull request #745 from router-for-me/auth fix(auth): make provider rotation atomic	2025-12-27 13:01:22 +08:00
hkfires	8be06255f7	fix(auth): make provider rotation atomic	2025-12-27 12:56:48 +08:00
Luis Pater	72274099aa	Fixed: #738 fix(translators): refine prompt token calculation by incorporating cached tokens in Claude response handling	2025-12-27 03:56:11 +08:00
Luis Pater	dcae098e23	Fixed: #736 fix(translators): handle gjson string types in Claude request processing to ensure consistent content parsing	2025-12-27 01:25:47 +08:00
Luis Pater	2eb05ec640	Merge pull request #727 from nguyenphutrong/main docs: add Quotio to community projects	2025-12-26 11:53:09 +08:00
Luis Pater	3ce0d76aa4	feat(usage): add import/export functionality for usage statistics and enhance deduplication logic	2025-12-26 11:49:51 +08:00
Trong Nguyen	a00b79d9be	docs(readme): add Quotio to community projects section	2025-12-26 10:46:05 +07:00
Luis Pater	33e53a2a56	fix(translators): ensure correct handling and output of multimodal assistant content across request handlers	2025-12-26 05:08:04 +08:00
Luis Pater	cd5b80785f	Merge pull request #722 from hungthai1401/bugfix/remove-extra-args Fixed incorrect function signature call to `NewBaseAPIHandlers`	2025-12-26 02:56:56 +08:00
Thai Nguyen Hung	54f71aa273	fix(test): remove extra argument from ExecuteStreamWithAuthManager call	2025-12-25 21:55:35 +07:00
Luis Pater	3f949b7f84	Merge pull request #704 from tinyc0der/add-index fix(openai): add index field to image response for LiteLLM compatibility	2025-12-25 21:35:12 +08:00
Luis Pater	443c4538bb	feat(config): add `commercial-mode` to optimize HTTP middleware for lower memory usage	2025-12-25 21:05:01 +08:00
TinyCoder	a7fc2ee4cf	refactor(image): avoid using json.Marshal	2025-12-25 14:21:01 +07:00
Luis Pater	8e749ac22d	docs(readme): update GLM model version from 4.6 to 4.7 in README and README_CN	2025-12-24 23:59:48 +08:00
Luis Pater	69e09d9bc7	docs(readme): update GLM model version from 4.6 to 4.7 in README and README_CN	2025-12-24 23:46:27 +08:00
Luis Pater	06ad527e8c	Fixed: #696 fix(translators): adjust prompt token calculation by subtracting cached tokens across Gemini, OpenAI, and Claude handlers	2025-12-24 23:29:18 +08:00
Luis Pater	b7409dd2de	Merge pull request #706 from router-for-me/log Log	2025-12-24 22:24:39 +08:00
hkfires	5ba325a8fc	refactor(logging): standardize request id formatting and layout	2025-12-24 22:03:07 +08:00
Luis Pater	d502840f91	Merge pull request #695 from NguyenSiTrung/main feat: add cached token parsing for Gemini , Antigravity API responses	2025-12-24 21:58:55 +08:00
hkfires	99238a4b59	fix(logging): normalize warning level to warn	2025-12-24 21:11:37 +08:00
hkfires	6d43a2ff9a	refactor(logging): inline request id in log output	2025-12-24 21:07:18 +08:00
Luis Pater	3faa1ca9af	Merge pull request #700 from router-for-me/log refactor(sdk/auth): rename manager.go to conductor.go	2025-12-24 19:36:24 +08:00
Luis Pater	9d975e0375	feat(models): add support for GLM-4.7 and MiniMax-M2.1	2025-12-24 19:30:57 +08:00
hkfires	2a6d8b78d4	feat(api): add endpoint to retrieve request logs by ID	2025-12-24 19:24:51 +08:00
TinyCoder	671558a822	fix(openai): add index field to image response for LiteLLM compatibility LiteLLM's Pydantic model requires an index field in each image object. Without it, responses fail validation with "images.0.index Field required".	2025-12-24 17:43:31 +07:00
hkfires	26fbb77901	refactor(sdk/auth): rename manager.go to conductor.go	2025-12-24 15:21:03 +08:00
NguyenSiTrung	a277302262	Merge remote-tracking branch 'upstream/main'	2025-12-24 10:54:09 +07:00
NguyenSiTrung	969c1a5b72	refactor: extract parseGeminiFamilyUsageDetail helper to reduce duplication	2025-12-24 10:22:31 +07:00
NguyenSiTrung	872339bceb	feat: add cached token parsing for Gemini API responses	2025-12-24 10:20:11 +07:00
Luis Pater	5dc0dbc7aa	Merge pull request #697 from Cubence-com/main docs(readme): add Cubence sponsor and fix PackyCode link	2025-12-24 11:19:32 +08:00
Luis Pater	2b7ba54a2f	Merge pull request #688 from router-for-me/feature/request-id-tracking feat(logging): implement request ID tracking and propagation	2025-12-24 10:54:13 +08:00
hkfires	007c3304f2	feat(logging): scope request ID tracking to AI API endpoints	2025-12-24 09:17:09 +08:00
hkfires	e76ba0ede9	feat(logging): implement request ID tracking and propagation	2025-12-24 08:32:17 +08:00
Luis Pater	c06ac07e23	Merge pull request #686 from ajkdrag/main feat: regex support for model-mappings	2025-12-24 04:37:44 +08:00
Fetters	8d25cf0d75	fix(readme): update PackyCode sponsorship link and remove redundant tbody	2025-12-23 23:44:40 +08:00
Fetters	64e85e7019	docs(readme): add Cubence sponsor	2025-12-23 23:30:57 +08:00
altamash	0c0aae1eac	Robust change detection: replaced string concat with struct-based compare in hasModelMappingsChanged; removed boolTo01. • Performance: pre-allocate map and regex slice capacities in UpdateMappings. • Verified with amp module tests (all passing)	2025-12-23 18:52:28 +05:30
altamash	5dcf7cb846	feat: regex support for model-mappings	2025-12-23 18:41:58 +05:30