From 39c294b72d79c5fb8ee57b4bc08b1caec698df4e Mon Sep 17 00:00:00 2001
From: haoshan98 <haoshanw@gmail.com>
Date: Fri, 19 Jun 2026 10:03:38 +0000
Subject: [PATCH 1/3] Codex integration design

Signed-off-by: haoshan98 <haoshanw@gmail.com>
---
 docs/design/core-public-api.md | 249 ++++++++++++++++++++++++++++++++-
 1 file changed, 247 insertions(+), 2 deletions(-)

diff --git a/docs/design/core-public-api.md b/docs/design/core-public-api.md
index fb92954..873fc45 100644
--- a/docs/design/core-public-api.md
+++ b/docs/design/core-public-api.md
@@ -1,7 +1,8 @@
 # Design: `agentic-core` Public API
 
 > Status: Active — implementation in progress
-> References: [ADR-03](../adr/ADR-03_gateway_integration.md), [Issue #42](https://github.com/vllm-project/agentic-api/issues/42), [Praxis #354](https://github.com/praxis-proxy/praxis/issues/354)
+> References: [ADR-03](../adr/ADR-03_gateway_integration.md), [Issue #42](https://github.com/vllm-project/agentic-api/issues/42),
+> [Issue #54](https://github.com/vllm-project/agentic-api/issues/54), [Praxis #354](https://github.com/praxis-proxy/praxis/issues/354)
 > Owner: @ashwing (tool dispatch, loop control, streaming tee) + @maralbahari (base loop, store integration)
 
 ---
@@ -33,6 +34,7 @@ The base loop handles text messages. This design extends it with:
 3. **Streaming tee** — forward SSE to client in real-time while accumulating for tool detection
 4. **Extended SSE events** — function_call, reasoning, file_search, web_search event types
 5. **Tool executor traits** — MCP, web_search, vector_store as pluggable implementations
+6. **Codex CLI compatibility** — recognize Codex client-side tool types and route them without server execution
 
 ---
 
@@ -123,7 +125,7 @@ pub async fn execute_loop(
 1. Rehydrate (delegates to PR #46's `rehydrate_conversation`)
 2. Call inference (delegates to PR #46's `call_inference` — returns stream lazily)
 3. Accumulate response (via `ResponseAccumulator::from_stream`)
-4. Check output for `OutputItem::FunctionCall` → `dispatch_tools` → loop or done
+4. Check output for `OutputItem::FunctionCall` → `dispatch_tools` → loop, client action, or done
 5. Persist final response (delegates to PR #46's `persist_response` with explicit handlers)
 
 **Phase 2 is non-streaming only.** The tool loop inspects the full accumulated response before deciding. Streaming + tool dispatch (forwarding events to client while detecting tool calls) requires Phase 3's tee pattern.
@@ -252,6 +254,249 @@ How the complete pipeline maps to @leseb's proposed filter chain:
 
 ---
 
+## Codex Integration
+
+Allow `agentic-api` to serve as the upstream layer for Codex CLI in the coming PR by @haoshan98, related to [Issue #54](https://github.com/vllm-project/agentic-api/issues/54).
+`agentic-api` should accept Codex CLI traffic, route inference to vLLM-supported models, preserve
+`previous_response_id` and conversation persistence, and pass client-owned tool calls back to Codex for local
+execution.
+
+The immediate compatibility gap is request parsing and pass-through behavior. `agentic-api` already supports
+`type: "function"`, but it does not yet recognize the Responses API tool shapes Codex uses for local/client tools:
+`namespace`, `tool_search`, and `custom`. Today those request tools can fail before they reach upstream inference.
+This section scopes the Codex integration to accepting those tool types losslessly and returning client-owned tool
+calls to Codex CLI for local execution.
+
+Server-hosted tool types such as `file_search`, `web_search_preview`, and `code_interpreter` remain future
+server-side work. They should not be conflated with this Codex compatibility pass.
+
+### Codex Tool Type Taxonomy
+
+Codex CLI sends tool declarations that are executed locally by the CLI. For this phase, `agentic-core` only needs
+to recognize and preserve these shapes, normalize them for vLLM when necessary, and avoid treating them as
+gateway-executed tools.
+
+| Tool type | Executor | Core behavior |
+|-----------|----------|---------------|
+| `function` | Codex CLI by default | Already supported on the wire. Codex requests should return calls to Codex unless configuration marks the tool as gateway-owned. |
+| `namespace` | Codex CLI | Accept the model-facing container shape and preserve child function metadata. Calls still arrive as `function_call` with an optional namespace. |
+| `tool_search` | Codex CLI | Accept deferred-discovery shape, preserve `execution`, and return calls/output handling to Codex when `execution = "client"`. |
+| `custom` | Codex CLI | Accept free-form/grammar tool shape and preserve `format` metadata for Codex. |
+
+The key requirement is compatibility, not server execution. These tools should pass through `agentic-api`
+without request validation failures, and any client-owned model-emitted calls should be surfaced to Codex CLI
+rather than executed inside the gateway.
+
+The key distinction is execution owner, not just wire type. `function` is a shared wire type: Codex can own local
+functions, while future gateway integrations may also expose server-executed functions. The request normalizer must
+classify every model-visible tool before inference and carry that registry through response handling.
+
+### Public Type Additions
+
+The current `ResponsesTool = FunctionTool` alias is too narrow for Codex. Replace it with a tagged tool enum that
+preserves unknown shapes while giving Codex-used variants first-class names.
+
+Do not implement this as `#[serde(tag = "type")]` wrapping the existing `FunctionTool` struct directly, because
+`FunctionTool` already stores the wire `type` field. Either use variant-specific payload structs that omit the
+already-consumed tag, as sketched below, or write manual deserialization that preserves the raw object.
+
+```rust
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(untagged)]
+pub enum ResponsesTool {
+    Known(KnownResponsesTool),
+    Unknown(Value),
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(tag = "type")]
+pub enum KnownResponsesTool {
+    #[serde(rename = "function")]
+    Function(ResponsesFunctionTool),
+    #[serde(rename = "namespace")]
+    Namespace(CodexNamespaceTool),
+    #[serde(rename = "tool_search")]
+    ToolSearch(CodexToolSearchTool),
+    #[serde(rename = "custom")]
+    Custom(CodexCustomTool),
+}
+
+pub struct ResponsesFunctionTool {
+    pub name: String,
+    pub description: Option<String>,
+    pub parameters: Option<Value>,
+    pub strict: Option<bool>,
+    #[serde(default)]
+    pub defer_loading: bool,
+}
+
+pub struct CodexNamespaceTool {
+    pub name: String,
+    pub description: Option<String>,
+    #[serde(default)]
+    pub tools: Vec<CodexNamespaceMember>,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(tag = "type")]
+pub enum CodexNamespaceMember {
+    #[serde(rename = "function")]
+    Function(ResponsesFunctionTool),
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum ToolSearchExecution {
+    Server,
+    Client,
+}
+
+pub struct CodexToolSearchTool {
+    pub description: Option<String>,
+    pub execution: Option<ToolSearchExecution>,
+    pub parameters: Option<Value>,
+}
+
+pub struct CodexCustomTool {
+    pub name: String,
+    pub description: Option<String>,
+    pub format: Option<Value>,
+    #[serde(default)]
+    pub defer_loading: bool,
+}
+```
+
+The storage and upstream request paths should preserve raw unknown tools as `Value`. Unknown tool types must not
+be executed by default.
+
+### Codex Call Shapes
+
+Codex's local router treats namespace as part of a function tool name, not as a separate payload type:
+
+| Response item | Local Codex payload | Gateway behavior |
+|---------------|---------------------|------------------|
+| `function_call` | `ToolPayload::Function { arguments }` | Preserve optional `namespace` and return the call to Codex when the tool is client-owned. |
+| `tool_search_call` with `execution = "client"` | `ToolPayload::ToolSearch` | Return to Codex for local deferred discovery. |
+| Hosted `tool_search_call` | Provider-owned | Do not execute locally; provider/upstream owns it. |
+| `custom_tool_call` | `ToolPayload::Custom { input }` | Preserve free-form `input`, not JSON-schema function arguments. |
+
+`custom_tool_call` is for free-form/custom Responses tools, including grammar-based patch or code tools. It should
+not be normalized as JSON-schema function arguments unless the adapter can reconstruct the original custom call
+exactly.
+
+### Normalization And Registry
+
+Codex compatibility needs two related operations:
+
+1. Build an upstream-safe tool list for the selected inference backend.
+2. Keep a lossless registry that maps model-emitted calls back to the original client-visible tool declaration.
+
+If vLLM only accepts flat function declarations, `tool_search` and `custom` become request normalization concerns.
+`namespace` is mostly model-facing spec organization: `ToolSpec::Namespace` wraps function tools, and the model
+still emits `function_call` with an optional `namespace`. Any backend-specific flattening is an adapter detail,
+not the public semantics of the tool.
+
+```rust
+pub struct NormalizedTools {
+    pub upstream_tools: Vec<Value>,
+    pub registry: ToolRegistry,
+}
+
+pub struct ToolName {
+    pub namespace: Option<String>,
+    pub name: String,
+}
+
+pub enum ToolExecutionOwner {
+    Client,
+    Gateway,
+}
+
+pub struct ToolRegistryEntry {
+    pub owner: ToolExecutionOwner,
+    pub original_type: String,
+    pub original_name: ToolName,
+    pub model_name: ToolName,
+    pub original_tool: Value,
+}
+```
+
+For a namespace tool:
+
+```json
+{
+  "type": "namespace",
+  "name": "mcp__github",
+  "tools": [
+    { "name": "create_issue", "description": "Create issue", "parameters": {} }
+  ]
+}
+```
+
+the normalizer records a registry entry keyed by the split `ToolName`:
+
+```text
+ToolName { namespace: Some("mcp__github"), name: "create_issue" }
+  -> owner = Client, original_type = "namespace"
+```
+
+Because the registry keys by split `ToolName`, two tools named `run` in different namespaces can coexist. When the
+upstream response includes a `namespace` field on a `function_call`, preserve it. When an upstream backend only
+returns an encoded flat name, recover the namespace and child tool name from the registry before returning the
+response to Codex. This likely requires extending `FunctionToolCall` with:
+
+```rust
+pub namespace: Option<String>,
+```
+
+`tool_search` may need to be adapted for backends that only understand functions, but the registry must preserve
+`execution` and map client-executed `tool_search_call` / `tool_search_output` items back to the
+Responses-compatible shape. Hosted/non-client tool search remains provider-owned and should not be handled by the
+local Codex route. `custom` carries free-form `input` instead of JSON arguments, so the registry must retain
+`format` and enough raw metadata to reconstruct the Codex-visible call.
+
+### Pass-Through Behavior
+
+Routing rules:
+
+1. Client-owned calls (`function`, namespaced `function`, client-executed `tool_search`, and `custom`) are returned
+   to Codex without gateway execution.
+2. Gateway-owned calls execute inside `agentic-api` only when explicitly supported by a registered executor.
+3. `namespace`, `tool_search`, and `custom` request declarations must not fail deserialization or validation.
+4. The registry preserves the original request tool shape so the returned call can be interpreted by Codex CLI.
+5. Unknown tool types are not executed by default. Preserve them when possible and reject them only when the
+   upstream cannot receive a safe normalized declaration.
+
+For Codex-owned calls, the gateway should not synthesize tool outputs. It persists the assistant call item, returns
+the call to Codex, and expects Codex to continue the conversation with the corresponding tool output item after
+local execution.
+
+The loop needs an explicit client-action decision so this path is not confused with either `Done` or
+`Continue`:
+
+```rust
+pub enum LoopDecision {
+    Continue(Vec<InputItem>),
+    RequiresClientAction(Vec<OutputItem>),
+    Done,
+    Incomplete(String),
+}
+```
+
+### Auto-Approval Model Alias
+
+[Issue #54](https://github.com/vllm-project/agentic-api/issues/54) also notes Codex's auto-approval request path. MVP support should add a simple model alias map in
+configuration:
+
+```toml
+[model_aliases]
+codex-auto-review = "real-upstream-model"
+```
+
+`ExecutionContext` resolves aliases before `call_inference()`. A Codex-specific `/v1/models` response with model metadata can come later; the alias map is sufficient to unblock CLI compatibility without expanding the public API.
+
+---
+
 ## Open Questions
 
 1. **`execute_loop` vs refactoring `execute`:** Should the loop wrapper be a new function or replace PR #46's `execute()`? Pending maralbahari's response on PR #46 review.

From 6d7ce49244a364777fce675d9161ea026c1ded3c Mon Sep 17 00:00:00 2001
From: haoshan98 <haoshanw@gmail.com>
Date: Thu, 25 Jun 2026 11:01:03 +0000
Subject: [PATCH 2/3] Revert "Codex integration design"

This reverts commit e1852fc0ffdf389942ff0e89f6b19018dd7bbf53.

Signed-off-by: haoshan98 <haoshanw@gmail.com>
---
 docs/design/core-public-api.md | 249 +--------------------------------
 1 file changed, 2 insertions(+), 247 deletions(-)

diff --git a/docs/design/core-public-api.md b/docs/design/core-public-api.md
index 873fc45..fb92954 100644
--- a/docs/design/core-public-api.md
+++ b/docs/design/core-public-api.md
@@ -1,8 +1,7 @@
 # Design: `agentic-core` Public API
 
 > Status: Active — implementation in progress
-> References: [ADR-03](../adr/ADR-03_gateway_integration.md), [Issue #42](https://github.com/vllm-project/agentic-api/issues/42),
-> [Issue #54](https://github.com/vllm-project/agentic-api/issues/54), [Praxis #354](https://github.com/praxis-proxy/praxis/issues/354)
+> References: [ADR-03](../adr/ADR-03_gateway_integration.md), [Issue #42](https://github.com/vllm-project/agentic-api/issues/42), [Praxis #354](https://github.com/praxis-proxy/praxis/issues/354)
 > Owner: @ashwing (tool dispatch, loop control, streaming tee) + @maralbahari (base loop, store integration)
 
 ---
@@ -34,7 +33,6 @@ The base loop handles text messages. This design extends it with:
 3. **Streaming tee** — forward SSE to client in real-time while accumulating for tool detection
 4. **Extended SSE events** — function_call, reasoning, file_search, web_search event types
 5. **Tool executor traits** — MCP, web_search, vector_store as pluggable implementations
-6. **Codex CLI compatibility** — recognize Codex client-side tool types and route them without server execution
 
 ---
 
@@ -125,7 +123,7 @@ pub async fn execute_loop(
 1. Rehydrate (delegates to PR #46's `rehydrate_conversation`)
 2. Call inference (delegates to PR #46's `call_inference` — returns stream lazily)
 3. Accumulate response (via `ResponseAccumulator::from_stream`)
-4. Check output for `OutputItem::FunctionCall` → `dispatch_tools` → loop, client action, or done
+4. Check output for `OutputItem::FunctionCall` → `dispatch_tools` → loop or done
 5. Persist final response (delegates to PR #46's `persist_response` with explicit handlers)
 
 **Phase 2 is non-streaming only.** The tool loop inspects the full accumulated response before deciding. Streaming + tool dispatch (forwarding events to client while detecting tool calls) requires Phase 3's tee pattern.
@@ -254,249 +252,6 @@ How the complete pipeline maps to @leseb's proposed filter chain:
 
 ---
 
-## Codex Integration
-
-Allow `agentic-api` to serve as the upstream layer for Codex CLI in the coming PR by @haoshan98, related to [Issue #54](https://github.com/vllm-project/agentic-api/issues/54).
-`agentic-api` should accept Codex CLI traffic, route inference to vLLM-supported models, preserve
-`previous_response_id` and conversation persistence, and pass client-owned tool calls back to Codex for local
-execution.
-
-The immediate compatibility gap is request parsing and pass-through behavior. `agentic-api` already supports
-`type: "function"`, but it does not yet recognize the Responses API tool shapes Codex uses for local/client tools:
-`namespace`, `tool_search`, and `custom`. Today those request tools can fail before they reach upstream inference.
-This section scopes the Codex integration to accepting those tool types losslessly and returning client-owned tool
-calls to Codex CLI for local execution.
-
-Server-hosted tool types such as `file_search`, `web_search_preview`, and `code_interpreter` remain future
-server-side work. They should not be conflated with this Codex compatibility pass.
-
-### Codex Tool Type Taxonomy
-
-Codex CLI sends tool declarations that are executed locally by the CLI. For this phase, `agentic-core` only needs
-to recognize and preserve these shapes, normalize them for vLLM when necessary, and avoid treating them as
-gateway-executed tools.
-
-| Tool type | Executor | Core behavior |
-|-----------|----------|---------------|
-| `function` | Codex CLI by default | Already supported on the wire. Codex requests should return calls to Codex unless configuration marks the tool as gateway-owned. |
-| `namespace` | Codex CLI | Accept the model-facing container shape and preserve child function metadata. Calls still arrive as `function_call` with an optional namespace. |
-| `tool_search` | Codex CLI | Accept deferred-discovery shape, preserve `execution`, and return calls/output handling to Codex when `execution = "client"`. |
-| `custom` | Codex CLI | Accept free-form/grammar tool shape and preserve `format` metadata for Codex. |
-
-The key requirement is compatibility, not server execution. These tools should pass through `agentic-api`
-without request validation failures, and any client-owned model-emitted calls should be surfaced to Codex CLI
-rather than executed inside the gateway.
-
-The key distinction is execution owner, not just wire type. `function` is a shared wire type: Codex can own local
-functions, while future gateway integrations may also expose server-executed functions. The request normalizer must
-classify every model-visible tool before inference and carry that registry through response handling.
-
-### Public Type Additions
-
-The current `ResponsesTool = FunctionTool` alias is too narrow for Codex. Replace it with a tagged tool enum that
-preserves unknown shapes while giving Codex-used variants first-class names.
-
-Do not implement this as `#[serde(tag = "type")]` wrapping the existing `FunctionTool` struct directly, because
-`FunctionTool` already stores the wire `type` field. Either use variant-specific payload structs that omit the
-already-consumed tag, as sketched below, or write manual deserialization that preserves the raw object.
-
-```rust
-#[derive(Debug, Clone, Serialize, Deserialize)]
-#[serde(untagged)]
-pub enum ResponsesTool {
-    Known(KnownResponsesTool),
-    Unknown(Value),
-}
-
-#[derive(Debug, Clone, Serialize, Deserialize)]
-#[serde(tag = "type")]
-pub enum KnownResponsesTool {
-    #[serde(rename = "function")]
-    Function(ResponsesFunctionTool),
-    #[serde(rename = "namespace")]
-    Namespace(CodexNamespaceTool),
-    #[serde(rename = "tool_search")]
-    ToolSearch(CodexToolSearchTool),
-    #[serde(rename = "custom")]
-    Custom(CodexCustomTool),
-}
-
-pub struct ResponsesFunctionTool {
-    pub name: String,
-    pub description: Option<String>,
-    pub parameters: Option<Value>,
-    pub strict: Option<bool>,
-    #[serde(default)]
-    pub defer_loading: bool,
-}
-
-pub struct CodexNamespaceTool {
-    pub name: String,
-    pub description: Option<String>,
-    #[serde(default)]
-    pub tools: Vec<CodexNamespaceMember>,
-}
-
-#[derive(Debug, Clone, Serialize, Deserialize)]
-#[serde(tag = "type")]
-pub enum CodexNamespaceMember {
-    #[serde(rename = "function")]
-    Function(ResponsesFunctionTool),
-}
-
-#[derive(Debug, Clone, Serialize, Deserialize)]
-#[serde(rename_all = "snake_case")]
-pub enum ToolSearchExecution {
-    Server,
-    Client,
-}
-
-pub struct CodexToolSearchTool {
-    pub description: Option<String>,
-    pub execution: Option<ToolSearchExecution>,
-    pub parameters: Option<Value>,
-}
-
-pub struct CodexCustomTool {
-    pub name: String,
-    pub description: Option<String>,
-    pub format: Option<Value>,
-    #[serde(default)]
-    pub defer_loading: bool,
-}
-```
-
-The storage and upstream request paths should preserve raw unknown tools as `Value`. Unknown tool types must not
-be executed by default.
-
-### Codex Call Shapes
-
-Codex's local router treats namespace as part of a function tool name, not as a separate payload type:
-
-| Response item | Local Codex payload | Gateway behavior |
-|---------------|---------------------|------------------|
-| `function_call` | `ToolPayload::Function { arguments }` | Preserve optional `namespace` and return the call to Codex when the tool is client-owned. |
-| `tool_search_call` with `execution = "client"` | `ToolPayload::ToolSearch` | Return to Codex for local deferred discovery. |
-| Hosted `tool_search_call` | Provider-owned | Do not execute locally; provider/upstream owns it. |
-| `custom_tool_call` | `ToolPayload::Custom { input }` | Preserve free-form `input`, not JSON-schema function arguments. |
-
-`custom_tool_call` is for free-form/custom Responses tools, including grammar-based patch or code tools. It should
-not be normalized as JSON-schema function arguments unless the adapter can reconstruct the original custom call
-exactly.
-
-### Normalization And Registry
-
-Codex compatibility needs two related operations:
-
-1. Build an upstream-safe tool list for the selected inference backend.
-2. Keep a lossless registry that maps model-emitted calls back to the original client-visible tool declaration.
-
-If vLLM only accepts flat function declarations, `tool_search` and `custom` become request normalization concerns.
-`namespace` is mostly model-facing spec organization: `ToolSpec::Namespace` wraps function tools, and the model
-still emits `function_call` with an optional `namespace`. Any backend-specific flattening is an adapter detail,
-not the public semantics of the tool.
-
-```rust
-pub struct NormalizedTools {
-    pub upstream_tools: Vec<Value>,
-    pub registry: ToolRegistry,
-}
-
-pub struct ToolName {
-    pub namespace: Option<String>,
-    pub name: String,
-}
-
-pub enum ToolExecutionOwner {
-    Client,
-    Gateway,
-}
-
-pub struct ToolRegistryEntry {
-    pub owner: ToolExecutionOwner,
-    pub original_type: String,
-    pub original_name: ToolName,
-    pub model_name: ToolName,
-    pub original_tool: Value,
-}
-```
-
-For a namespace tool:
-
-```json
-{
-  "type": "namespace",
-  "name": "mcp__github",
-  "tools": [
-    { "name": "create_issue", "description": "Create issue", "parameters": {} }
-  ]
-}
-```
-
-the normalizer records a registry entry keyed by the split `ToolName`:
-
-```text
-ToolName { namespace: Some("mcp__github"), name: "create_issue" }
-  -> owner = Client, original_type = "namespace"
-```
-
-Because the registry keys by split `ToolName`, two tools named `run` in different namespaces can coexist. When the
-upstream response includes a `namespace` field on a `function_call`, preserve it. When an upstream backend only
-returns an encoded flat name, recover the namespace and child tool name from the registry before returning the
-response to Codex. This likely requires extending `FunctionToolCall` with:
-
-```rust
-pub namespace: Option<String>,
-```
-
-`tool_search` may need to be adapted for backends that only understand functions, but the registry must preserve
-`execution` and map client-executed `tool_search_call` / `tool_search_output` items back to the
-Responses-compatible shape. Hosted/non-client tool search remains provider-owned and should not be handled by the
-local Codex route. `custom` carries free-form `input` instead of JSON arguments, so the registry must retain
-`format` and enough raw metadata to reconstruct the Codex-visible call.
-
-### Pass-Through Behavior
-
-Routing rules:
-
-1. Client-owned calls (`function`, namespaced `function`, client-executed `tool_search`, and `custom`) are returned
-   to Codex without gateway execution.
-2. Gateway-owned calls execute inside `agentic-api` only when explicitly supported by a registered executor.
-3. `namespace`, `tool_search`, and `custom` request declarations must not fail deserialization or validation.
-4. The registry preserves the original request tool shape so the returned call can be interpreted by Codex CLI.
-5. Unknown tool types are not executed by default. Preserve them when possible and reject them only when the
-   upstream cannot receive a safe normalized declaration.
-
-For Codex-owned calls, the gateway should not synthesize tool outputs. It persists the assistant call item, returns
-the call to Codex, and expects Codex to continue the conversation with the corresponding tool output item after
-local execution.
-
-The loop needs an explicit client-action decision so this path is not confused with either `Done` or
-`Continue`:
-
-```rust
-pub enum LoopDecision {
-    Continue(Vec<InputItem>),
-    RequiresClientAction(Vec<OutputItem>),
-    Done,
-    Incomplete(String),
-}
-```
-
-### Auto-Approval Model Alias
-
-[Issue #54](https://github.com/vllm-project/agentic-api/issues/54) also notes Codex's auto-approval request path. MVP support should add a simple model alias map in
-configuration:
-
-```toml
-[model_aliases]
-codex-auto-review = "real-upstream-model"
-```
-
-`ExecutionContext` resolves aliases before `call_inference()`. A Codex-specific `/v1/models` response with model metadata can come later; the alias map is sufficient to unblock CLI compatibility without expanding the public API.
-
----
-
 ## Open Questions
 
 1. **`execute_loop` vs refactoring `execute`:** Should the loop wrapper be a new function or replace PR #46's `execute()`? Pending maralbahari's response on PR #46 review.

From 80fe88222ff992ec1f3d485336eddb1746a5de5b Mon Sep 17 00:00:00 2001
From: haoshan98 <haoshanw@gmail.com>
Date: Thu, 25 Jun 2026 11:03:40 +0000
Subject: [PATCH 3/3] Codex compatibility layer

Signed-off-by: haoshan98 <haoshanw@gmail.com>
---
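Note for reviewers, placed below the three-dash line so it stays out of the commit message. The design doc in this patch proposes a split `ToolName { namespace, name }` and an execution owner per tool. Below is a minimal, std-only sketch of how a request-scoped registry keyed on that split name might behave. `Owner`, `classify`, `ToolRegistry`, and the `gateway_owned` flag are hypothetical names used only for illustration; the canonical types are expected to come from PR #67.

```rust
use std::collections::HashMap;

/// Split identity from the design doc: optional namespace plus tool name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ToolName {
    pub namespace: Option<String>,
    pub name: String,
}

/// Hypothetical owner enum; the doc lists Client, Gateway, or provider-owned.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Owner {
    Client,
    Gateway,
    Provider,
}

/// Classify a callable tool shape following the compatibility table.
/// `namespace` is only a grouping, so its members are classified as "function".
/// Unknown tool types return `None`: preserve them, never execute them.
pub fn classify(tool_type: &str, execution: Option<&str>, gateway_owned: bool) -> Option<Owner> {
    match tool_type {
        "function" if gateway_owned => Some(Owner::Gateway),
        "function" | "custom" => Some(Owner::Client),
        "tool_search" if execution == Some("client") => Some(Owner::Client),
        "tool_search" => Some(Owner::Provider),
        _ => None,
    }
}

/// Hypothetical request-scoped registry keyed by the split name.
#[derive(Debug, Default)]
pub struct ToolRegistry {
    entries: HashMap<ToolName, Owner>,
}

impl ToolRegistry {
    fn key(namespace: Option<&str>, name: &str) -> ToolName {
        ToolName {
            namespace: namespace.map(str::to_owned),
            name: name.to_owned(),
        }
    }

    /// Record the owner for a tool; the namespace is part of the key.
    pub fn register(&mut self, namespace: Option<&str>, name: &str, owner: Owner) {
        self.entries.insert(Self::key(namespace, name), owner);
    }

    /// Look up the owner for a model-emitted call.
    pub fn owner_of(&self, namespace: Option<&str>, name: &str) -> Option<Owner> {
        self.entries.get(&Self::key(namespace, name)).copied()
    }
}
```

With this keying, a tool named `run` under `mcp__github` and another `run` under a second, hypothetical namespace resolve to separate entries. An unknown tool type yields `None`, so the caller can keep it as raw JSON without executing it.
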
 docs/design/codex-integration.md | 153 +++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)
 create mode 100644 docs/design/codex-integration.md

diff --git a/docs/design/codex-integration.md b/docs/design/codex-integration.md
new file mode 100644
index 0000000..f73b6d9
--- /dev/null
+++ b/docs/design/codex-integration.md
@@ -0,0 +1,153 @@
+# Design: Codex CLI Integration
+
+> **References:** [Issue #54](https://github.com/vllm-project/agentic-api/issues/54),
+> [PR #67](https://github.com/vllm-project/agentic-api/pull/67)
+> **Owner:** @haoshan98 for Codex compatibility; @ashwing owns the generic tool framework in PR #67.
+
+---
+
+## Summary
+
+`agentic-api` should work as an upstream layer for Codex CLI while routing inference to vLLM-supported models.
+
+This PR is an MVP compatibility slice. It lets `agentic-api` accept and preserve the Responses API traffic that
+Codex CLI sends now, without waiting for the full generic tool framework from PR #67.
+
+The important split:
+
+- **This PR:** preserve Codex request/response shapes and continuation state.
+- **PR #67:** formalize generic tool normalization, execution, registry, ownership, and loop decisions.
+
+---
+
+## Current PR Scope
+
+This PR should do only the minimum needed for Codex compatibility:
+
+- Add this standalone design doc.
+- Accept Codex-used tool declarations without rejecting requests.
+- Preserve unknown tool declarations and unknown input/output items as raw JSON.
+- Preserve optional `namespace` on `function_call`.
+- Preserve `tool_search_call` and `custom_tool_call` shapes.
+- Preserve assistant tool-call items through `previous_response_id` rehydration.
+- Add model alias routing for Codex-facing model names to local vLLM models.
+- Add lightweight helper types/tests that document what #67 should formalize later.
+
+This PR should **not** build a second generic tool framework.
+
+---
+
+## Deferred To PR #67
+
+PR #67 should own the formal shared tool system:
+
+- `ToolHandler` / `Tool` trait shape.
+- Generic tool normalization before `call_inference()`.
+- Request-scoped tool registry.
+- Client-owned vs gateway-owned dispatch.
+- Requires-action / client-action loop decision.
+- Live `execute_loop` orchestration and streaming tool events.
+
+The helper types in this PR are temporary. They express Codex requirements, but the canonical versions should come
+from #67. After #67 lands, this slice should plug into or be refactored onto those abstractions.
+
+---
+
+## Compatibility Rules
+
+The gateway should not identify Codex requests by user agent, route, or other "is this Codex?" heuristics.
+Compatibility is driven by Responses tool shapes and execution semantics, so it can be always on.
+
+| Shape | Behavior |
+|-------|----------|
+| `function` | Client-owned by default. Preserve declaration and return matching calls to the client unless configured as gateway-owned. |
+| `namespace` | Model-facing grouping for function tools. Do not treat namespace as a separate executable call type. |
+| `tool_search` | Client-owned only when `execution == "client"`. Hosted/non-client search is provider-owned. |
+| `custom` | Client-owned by default. Preserve free-form / grammar metadata. |
+| Unknown tool | Preserve as raw JSON. Never execute by default. |
+
+For response items:
+
+| Response item | Behavior |
+|---------------|----------|
+| `function_call` | Preserve optional `namespace`. |
+| `tool_search_call` with `execution == "client"` | Return to the client for local deferred discovery. |
+| Hosted / non-client `tool_search_call` | Do not execute locally. Leave to provider-specific handling. |
+| `custom_tool_call` | Preserve free-form `input`; do not coerce into JSON function arguments. |
+| Unknown output item | Preserve as raw JSON. Never execute by default. |
+
+---
+
+## Requirements For #67
+
+The generic framework should preserve enough metadata for Codex-compatible behavior:
+
+- raw original tool JSON
+- model-visible tool name
+- original client-visible identity
+- optional namespace or an equivalent unambiguous key
+- execution owner: `Client`, `Gateway`, or provider-owned
+- raw hints such as `execution`, `format`, and `defer_loading`
+
+If namespaced tools need disambiguation, a split identity is useful:
+
+```rust
+pub struct ToolName {
+    pub namespace: Option<String>,
+    pub name: String,
+}
+```
+
+This avoids collisions such as two different namespaces both defining a tool named `run`.
+
+---
+
+## Continuation
+
+Codex-owned tool calls must survive response-store continuation.
+
+Expected rehydration shape:
+
+```text
+prior context + assistant tool call + Codex tool output + new input
+```
+
+On a turn that returns client-owned tool calls, storage should keep the assistant call item. On the next turn, Codex
+submits the matching tool output item, and `previous_response_id` should rebuild the full sequence.
+
+---
+
+## Model Aliases
+
+Model aliases route Codex-facing model names to local vLLM models:
+
+```toml
+[model_aliases]
+codex-compatible = "qwen3-coder"
+```
+
+Alias resolution is only model routing. It must not imply approval, auto-review, or human-confirmation behavior.
+
+---
+
+## Test Plan
+
+Current PR tests should cover:
+
+- `function`, `namespace`, `tool_search`, `custom`, and unknown tools round-trip.
+- Extra fields remain preserved.
+- `function_call.namespace` round-trips.
+- `tool_search_call` and `custom_tool_call` remain raw-compatible.
+- Unknown input/output items remain raw JSON.
+- `previous_response_id` rehydrates assistant tool calls before tool outputs.
+- Model aliases resolve on both the executor and proxy paths.
+
+Post-#67 tests should prove the same behavior through the formal tool framework.
+
+---
+
+## Open Questions
+
+1. What exact requires-action payload type should #67 expose?
+2. Should #67 use split `ToolName { namespace, name }` or a different unambiguous registry key?
+3. Which Codex-used fields should become typed framework fields, and which should remain raw metadata?