Summary
`openshell inference set` works for older OpenAI chat-completions models like gpt-4.1, but verification fails for GPT-5-family models like gpt-5.4 because the validation request appears to send `max_tokens`, which OpenAI rejects for that model family.
Environment
- OpenShell: 0.0.12
- Host: macOS (Apple Silicon workstation)
- Provider type: openai
- Base URL: https://api.openai.com/v1
Reproduction
Provider setup succeeds:
```shell
export OPENAI_API_KEY=...
openshell provider create --name openai-api --type openai --credential OPENAI_API_KEY --config OPENAI_BASE_URL=https://api.openai.com/v1
```

This succeeds:

```shell
openshell inference set --provider openai-api --model gpt-4.1
```

Output:
```
Gateway inference configured:
  Route: inference.local
  Provider: openai-api
  Model: gpt-4.1
  Version: 1
  Validated Endpoints:
    - https://api.openai.com/v1/chat/completions (openai_chat_completions)
```
This fails:

```shell
openshell inference set --provider openai-api --model gpt-5.4
```

Output:
```
Error: × failed to verify inference endpoint for provider 'openai-api' and model 'gpt-5.4' at 'https://api.openai.com/v1': upstream rejected the validation request with HTTP 400 Bad Request. Response body: {
  "error": {
    "message": "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.",
    "type": "invalid_request_error",
    "param": "max_tokens"
  }
}
```
Expected behavior
`openshell inference set --provider openai-api --model gpt-5.4` should validate successfully for supported OpenAI GPT-5-family models, or OpenShell should adapt the verification request to the model family's requirements.
Actual behavior
Verification appears to send a request shaped for older chat-completions semantics (`max_tokens`), which GPT-5-family models reject.
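One way a gateway could handle this without maintaining a model list is to retry on this specific 400. A rough sketch, not OpenShell's actual code (the function, payload shape, and error shape here are assumptions based on the response body above):

```python
import json

def validate_with_fallback(send, model: str) -> dict:
    """Try a legacy-shaped validation request; on the specific
    'max_tokens is not supported' rejection, retry once with
    max_completion_tokens.

    `send` posts a payload and returns (status_code, body_dict);
    it stands in for the real HTTP call to /chat/completions.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,  # older chat-completions shape, rejected by GPT-5 family
    }
    status, body = send(payload)
    if status == 400 and body.get("error", {}).get("param") == "max_tokens":
        # Upstream told us which parameter it wants instead; reshape and retry.
        payload.pop("max_tokens")
        payload["max_completion_tokens"] = 1
        status, body = send(payload)
    if status != 200:
        raise RuntimeError(f"validation failed: {status} {json.dumps(body)}")
    return body
```

The error-driven retry avoids hard-coding model families, at the cost of one extra request during onboarding for newer models.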
Notes
- This does not appear to be a provider creation problem.
- The same provider and base URL work for `gpt-4.1`.
- As a workaround, `--no-verify` can be used, but that is not ideal for normal onboarding flows.
Likely fix area
OpenShell's inference endpoint verification for OpenAI provider/model combinations likely needs model-aware request shaping for GPT-5-family models, specifically choosing `max_completion_tokens` over `max_tokens`.
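If verification instead shapes the request up front, the core of the fix is a small model-to-parameter mapping. A minimal sketch; the prefix list is illustrative, not exhaustive, and the function names are hypothetical:

```python
def token_limit_param(model: str) -> str:
    """Pick the token-limit parameter name for a chat-completions
    validation request. Newer OpenAI families reject max_tokens and
    require max_completion_tokens; older chat models accept max_tokens.
    The prefixes below are assumptions, not a maintained list.
    """
    newer_families = ("gpt-5", "o1", "o3", "o4")
    for prefix in newer_families:
        if model == prefix or model.startswith(prefix + "-") or model.startswith(prefix + "."):
            return "max_completion_tokens"
    return "max_tokens"

def validation_payload(model: str) -> dict:
    """Build a minimal validation request using the right parameter."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        token_limit_param(model): 1,
    }
```

With this shaping, `validation_payload("gpt-5.4")` carries `max_completion_tokens` while `validation_payload("gpt-4.1")` keeps `max_tokens`, matching the behavior observed above.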