From b70ca398cb3714bc29a48221c7e4cf0f14086828 Mon Sep 17 00:00:00 2001 From: villyes Date: Thu, 21 May 2026 18:17:55 +0200 Subject: [PATCH 1/6] fix(genapis): add missing supported models MTA-7156 --- .../reference-content/supported-models.mdx | 241 +++++++++++++++++- 1 file changed, 240 insertions(+), 1 deletion(-) diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx index be2d3ab1b8..3c557ca586 100644 --- a/pages/generative-apis/reference-content/supported-models.mdx +++ b/pages/generative-apis/reference-content/supported-models.mdx @@ -51,7 +51,17 @@ This page provides a quick overview of available models in Scaleway's catalog an | [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Yes | 128k | 32k | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 32k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | Yes | 8k | N/A | Embeddings | [Gemma](https://ai.google.dev/gemma/terms) | -| [`sentence-t5-xxl`](#sentence-t5-xxl) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 512 | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Embeddings | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |qwen3-235b-a22b-instruct-2507 +| [`sentence-t5-xxl`](#sentence-t5-xxl) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 512 | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Embeddings | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3.6-35b-a3b`](#qwen36-35b-a3b)| Yes | 262k | 32k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gemma-4-26b-a4b-it`](#gemma-4-26b-a4b-it) | Yes | 262k 
| 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`mistral-medium-3.5-128b`](#mistral-medium-35-128b) | Yes | 256k | | Text, Vision | [Modified MIT License](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/blob/main/LICENSE) | +| [`qwen3.5-35b-a3b`](#qwen35-35b-a3b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3.5-122b-a10b`](#qwen35-122b-a10b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gemma-4-31b-it`](#gemma-4-31b-it) | No | 256k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`minimax-m2.5`](#minimax-m25) | No | 197k | N/A | Code | [MIT](https://choosealicense.com/licenses/mit/) | +| [`qwen3-235b-a22b-thinking-2507`](#qwen3-235b-a22b-thinking-2507) | No | 262k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gpt-oss-20b`](#gpt-oss-20b) | No | 131k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`llama-3-8b-instruct`](#llama-3-8b-instruct) | No | 8k | N/A | Text | [llama3](https://www.llama.com/llama3/license/) | \*Licences which are not open-weight and may restrict commercial usage (such as `CC-BY-NC-4.0`), do not apply to usage through Scaleway Products due to existing partnerships between Scaleway and the corresponding providers. Original licences are provided for transparency only. @@ -258,6 +268,158 @@ Vision-language models like Molmo can analyze an image and offer insights from v allenai/molmo-72b-0924:fp8 ``` +### Qwen3.6-35b-a3b + +Released in April 2026, Qwen3.6-35b-a3b is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Maximum videos per request | 1 | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2 | +| Hugging Face model card | [qwen3.6-35b-a3b](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported image formats | | +| Supported video formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | +| Supported languages | | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +qwen/qwen3.6-35b-a3b:bf16 +qwen/qwen3.6-35b-a3b:fp8 +``` + +### Gemma-4-26b-a4b-it + +Released in April 2026, Gemma-4-26b-a4b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Google | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Supported languages | Out-of-the-box support for 35+ languages, pre-trained on 140+ languages | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (262k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 | +| Hugging Face model card | [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported image formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. 
+ +#### Model name +``` +google/gemma-4-26b-a4b-it:bf16 +``` + +### Mistral Medium 3.5-128b + +Mistral Medium 3.5 is a unified model from Mistral with strong performance for instruct, reasoning, and coding tasks. + +| Attribute | Value | +|-----------|-------| +| Provider | Mistral | +| Supports function calling | Yes | +| Supports parallel tool-calling | Yes | +| Supported languages | English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and other languages | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4 (256k), H100-SXM-8 (256k) | +| Hugging Face model card | [mistralai/Mistral-Medium-3.5-128B](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B) | +{/* | Supports structured output | | +| Supported image formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +mistral/mistral-medium-3.5-128b:fp8 +``` + +### Qwen3.5-35b-a3b + +Released in March 2026, Qwen3.5-35b-a3b is a state-of-the-art small-sized model optimized for agentic and coding tasks, as well as logical reasoning. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Supported languages | 201 languages and dialects | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [Qwen3.5-35B-A3B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-GPTQ-Int4) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported image formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +qwen/qwen3.5-35b-a3b:int4 +``` + +### Qwen3.5-122b-a10b + +Released in March 2026, Qwen3.5-122b-a10b is a state-of-the-art medium-sized model optimized for agentic and coding tasks, as well as logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [Qwen/Qwen3.5-122B-A10B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-122B-A10B-GPTQ-Int4) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported image formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | +| Supported languages | | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +qwen/qwen3.5-122b-a10b:int4 +``` +### Gemma-4-31b-it + +Released in April 2026, Gemma-4-31b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Google | +| Supports function calling | Yes | +| Maximum images per request | 12 | +| Supported languages | Out-of-the-box support for 35+ languages, pre-trained on 140+ languages | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100_16 (66k), | +| Hugging Face model card | [google/gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it) | +{/* | Supports structured output | | +| Supports parallel tool-calling | | +| Supported image formats | | +| Maximum image resolution (pixels) | | +| Token dimension (pixels)| | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +google/gemma-4-31b-it:bf16 +``` + + ## Multimodal models (Text and Audio) ### Voxtral-small-24b-2507 @@ -599,6 +761,63 @@ mistral/magistral-small-2506:fp8 mistral/magistral-small-2506:bf16 ``` +### Qwen3-235B-A22B-Thinking-2507 +Qwen3-235B-A22B-Thinking-2507 is a highly capable and versatile language model, optimized for instruction following, logical reasoning through thinking capabilities, and long-context understanding, with enhanced performance across various domains and preferences. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2 (100k), H100-SXM-2 (100k), H100-SXM-4 (262k) | +| Hugging Face model card | [Qwen3-235B-A22B-Thinking-2507-AWQ](https://huggingface.co/QuantTrio/Qwen3-235B-A22B-Thinking-2507-AWQ) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported languages | | */} + + +#### Model name +``` +qwen/qwen3-235b-a22b-thinking-2507:awq +``` + +### Gpt-oss-20b + + +| Attribute | Value | +|-----------|-------| +| Provider | OpenAI | +| Supports parallel tool-calling | Yes | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported languages | | */} + + +#### Model name +``` +openai/gpt-oss-20b:fp4 +``` + +### Llama-3-8b-instruct +Llama-3-8b-instruct is the first generation of 8B-param models by Meta, fine-tuned for instruction and automation. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Meta | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | L40, L40S, H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [Meta-Llama-3-8B-Instruct-FP8](https://huggingface.co/neuralmagic/Meta-Llama-3-8B-Instruct-FP8) | +{/* | Supports structured output | | +| Supports function calling | | +| Supports parallel tool-calling | | +| Supported languages | | */} + + +#### Model name +``` +meta/llama-3-8b-instruct:fp8 +``` + ## Code models ### Devstral-2-123b-instruct-2512 @@ -677,6 +896,26 @@ With Qwen2.5-coder deployed at Scaleway, your company can benefit from code gene qwen/qwen2.5-coder-32b-instruct:int8 ``` +### MiniMax-M2.5 +A state-of-the-art coding and agentic model excelling in tool use, search, and office tasks with exceptional efficiency and cost-effectiveness. + +| Attribute | Value | +|-----------|-------| +| Provider | MiniMaxAI | +| Supports parallel tool-calling | Yes | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [lukealonso/MiniMax-M2.5-NVFP4](https://huggingface.co/lukealonso/MiniMax-M2.5-NVFP4) | +{/* | Supports structured output | | +| Supports function calling | | +| Supported languages | | */} + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. 
+ +#### Model name +``` +minimaxai/minimax-m2.5:nvfp4 +``` + ## Embeddings models ### Qwen3-embedding-8b From 6a2a59d92aca9594cdfebf3c81928aaaf5011667 Mon Sep 17 00:00:00 2001 From: villyes Date: Fri, 22 May 2026 17:36:11 +0200 Subject: [PATCH 2/6] fix(genapis): add missing supported models MTA-7156 --- .../reference-content/supported-models.mdx | 427 ++++++++---------- 1 file changed, 190 insertions(+), 237 deletions(-) diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx index 3c557ca586..b370932e50 100644 --- a/pages/generative-apis/reference-content/supported-models.mdx +++ b/pages/generative-apis/reference-content/supported-models.mdx @@ -3,7 +3,7 @@ title: Generative APIs supported models description: This page lists the open-source large language models supported by Scaleway. tags: dates: - validation: 2026-04-24 + validation: 2026-05-22 posted: 2024-04-18 --- This page provides a quick overview of available models in Scaleway's catalog and their core attributes. Expand any model below to see usage examples and detailed capabilities. @@ -22,19 +22,31 @@ This page provides a quick overview of available models in Scaleway's catalog an | Model name | Available in Serverless? 
| Maximum context window (tokens) | Maximum output (tokens) - Serverless | Modalities | License \* | |------------|--------------|--------------|------------|-----------|-----------| | [`gpt-oss-120b`](#gpt-oss-120b)| Yes | 128k | 32k | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gpt-oss-20b`](#gpt-oss-20b) | No | 131k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`whisper-large-v3`](#whisper-large-v3) | Yes | - | - | Audio transcription | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3.6-35b-a3b`](#qwen36-35b-a3b)| Yes | 262k | 32k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`qwen3.5-397b-a17b`](#qwen35-397b-a17b)| Yes | 250k | 16k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3.5-35b-a3b`](#qwen35-35b-a3b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3.5-122b-a10b`](#qwen35-122b-a10b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Yes | 250k | 16k | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3-235b-a22b-thinking-2507`](#qwen3-235b-a22b-thinking-2507) | No | 262k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3-embedding-8b`](#qwen3-embedding-8b) | Yes | 32k | N/A | Embeddings | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Yes | 128k | 32k | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 32k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Code | [Apache 
2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gemma-4-31b-it`](#gemma-4-31b-it) | No | 262k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`gemma-4-26b-a4b-it`](#gemma-4-26b-a4b-it) | Yes | 262k | 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`gemma-3-27b-it`](#gemma-3-27b-it) | Yes | 40k | 8k | Text, Vision | [Gemma](https://ai.google.dev/gemma/terms) | | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Yes | 100k (Serverless)/ 128k (Dedicated)| 16k | Text | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) | | [`llama-3.1-70b-instruct`](#llama-31-70b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 128k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Text | [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blob/main/LICENSE) | | [`llama-3.1-8b-instruct`](#llama-31-8b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 128k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Text | [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/blob/main/LICENSE) | +| [`llama-3-8b-instruct`](#llama-3-8b-instruct) | No | 8k | N/A | Text | [Meta Llama 3](https://www.llama.com/llama3/license/) | | [`llama-3-70b-instruct`](#llama-3-70b-instruct) | No | 8k | N/A | Text | [Llama 3 Community](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE) | | [`llama-3.1-nemotron-70b-instruct`](#llama-31-nemotron-70b-instruct) | No | 128k | N/A | Text | [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blob/main/LICENSE) | | [`deepseek-r1-distill-llama-70b`](#deepseek-r1-distill-llama-70b) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 16k (Serverless) / 128k (Dedicated) | 4k | Text | [MIT](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/LICENSE) and 
[Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE) | | [`deepseek-r1-distill-llama-8b`](#deepseek-r1-distill-llama-8b) | No | 128k | N/A | Text | [MIT](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/LICENSE) and [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/blob/main/LICENSE) | | [`mistral-7b-instruct-v0.3`](#mistral-7b-instruct-v03) | No | 32k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`mistral-large-3-675b-instruct-2512`](#mistral-large-3-675b-instruct-2512) | No | 250k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | +| [`mistral-medium-3.5-128b`](#mistral-medium-35-128b) | Yes | 256k | 16k | Text, Vision | [Modified MIT License](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/blob/main/LICENSE) | | [`mistral-small-3.2-24b-instruct-2506`](#mistral-small-32-24b-instruct-2506) | Yes | 128k | 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`mistral-small-3.1-24b-instruct-2503`](#mistral-small-31-24b-instruct-2503) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 128k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`mistral-small-24b-instruct-2501`](#mistral-small-24b-instruct-2501) | No | 32k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | @@ -47,21 +59,9 @@ This page provides a quick overview of available models in Scaleway's catalog an | [`pixtral-12b-2409`](#pixtral-12b-2409) | Yes | 128k | 4k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`molmo-72b-0924`](#molmo-72b-0924) | No | 50k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) and [Twonyi Qianwen license](https://huggingface.co/Qwen/Qwen2-72B/blob/main/LICENSE)| | 
[`holo2-30b-a3b`](#holo2-30b-a3b)| Yes | 22k | 32k | Text, Vision | [CC-BY-NC-4.0](https://spdx.org/licenses/CC-BY-NC-4.0)| -| [`qwen3-embedding-8b`](#qwen3-embedding-8b) | Yes | 32k | N/A | Embeddings | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Yes | 128k | 32k | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 32k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | Yes | 8k | N/A | Embeddings | [Gemma](https://ai.google.dev/gemma/terms) | | [`sentence-t5-xxl`](#sentence-t5-xxl) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 512 | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Embeddings | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`qwen3.6-35b-a3b`](#qwen36-35b-a3b)| Yes | 262k | 32k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`gemma-4-26b-a4b-it`](#gemma-4-26b-a4b-it) | Yes | 262k | 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`mistral-medium-3.5-128b`](#mistral-medium-35-128b) | Yes | 256k | | Text, Vision | [Modified MIT License](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/blob/main/LICENSE) | -| [`qwen3.5-35b-a3b`](#qwen35-35b-a3b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`qwen3.5-122b-a10b`](#qwen35-122b-a10b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`gemma-4-31b-it`](#gemma-4-31b-it) | No | 256k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | | [`minimax-m2.5`](#minimax-m25) | No | 197k 
| N/A | Code | [MIT](https://choosealicense.com/licenses/mit/) | -| [`qwen3-235b-a22b-thinking-2507`](#qwen3-235b-a22b-thinking-2507) | No | 262k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`gpt-oss-20b`](#gpt-oss-20b) | No | 131k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | -| [`llama-3-8b-instruct`](#llama-3-8b-instruct) | No | 8k | N/A | Text | [llama3](https://www.llama.com/llama3/license/) | \*Licences which are not open-weight and may restrict commercial usage (such as `CC-BY-NC-4.0`), do not apply to usage through Scaleway Products due to existing partnerships between Scaleway and the corresponding providers. Original licences are provided for transparency only. @@ -76,6 +76,25 @@ This page provides a quick overview of available models in Scaleway's catalog an Vision models can understand and analyze images, not generate them. You will use vision models through the `/v1/chat/completions` endpoint. +### Gemma-4-31b-it + +Released in April 2026, Gemma-4-31b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Google | +| Supports function calling | Yes | +| Maximum images per request | 12 | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100_16 (66k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 | +| Hugging Face model card | [google/gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +google/gemma-4-31b-it:bf16 +``` + ### Gemma-3-27b-it Gemma-3-27b-it is a model developed by Google to perform text processing and image analysis on many languages. The model was not trained specifically to output function / tool call tokens. 
Hence function calling is currently supported, but reliability remains limited. @@ -102,6 +121,26 @@ Pan & Scan is not yet supported for Gemma 3 images. This means that high-resolut google/gemma-3-27b-it:bf16 ``` +### Gemma-4-26b-a4b-it + +Released in April 2026, Gemma-4-26b-a4b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Google | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Supported languages | English, German, French, Chinese, Japanese, Korean | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (262k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 | +| Hugging Face model card | [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +google/gemma-4-26b-a4b-it:bf16 +``` + ### Mistral-large-3-675b-instruct-2512 Mistral-large-3-675b-instruct-2512 is a frontier model, performing among the best open-weight models as of December 2025. It is ideal for agentic workflows and image understanding. @@ -123,6 +162,25 @@ Mistral-large-3-675b-instruct-2512 is a frontier model, performing among the bes ``` mistral/mistral-large-3-675b-instruct-2512:fp4 ``` +### Mistral-medium-3.5-128b + +Mistral-medium-3.5 is a unified model from Mistral with strong performance for instruct, reasoning, and coding tasks. 
+ +| Attribute | Value | +|-----------|-------| +| Provider | Mistral | +| Supports function calling | Yes | +| Supports parallel tool-calling | Yes | +| Supported languages | English, French, German, Spanish, Portuguese, Italian, and 18 additional languages and dialects | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4 (256k), H100-SXM-8 (256k) | +| Hugging Face model card | [mistralai/Mistral-Medium-3.5-128B](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +mistral/mistral-medium-3.5-128b:fp8 +``` ### Mistral-small-3.2-24b-instruct-2506 Mistral-small-3.2-24b-instruct-2506 is an improved version of Mistral-small-3.1, which performs better on tool-calling. @@ -177,6 +235,28 @@ mistral/mistral-small-3.1-24b-instruct-2503:bf16 mistral/mistral-small-3.1-24b-instruct-2503:fp8 ``` +### Qwen3.6-35b-a3b + +Released in April 2026, Qwen3.6-35b-a3b is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Maximum videos per request | 1 | +| Supported languages | English, French, Portuguese, German, Romanian, Swedish, and 70 additional languages and dialects | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2 | +| Hugging Face model card | [qwen3.6-35b-a3b](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. 
+ +#### Model name +``` +qwen/qwen3.6-35b-a3b:bf16 +qwen/qwen3.6-35b-a3b:fp8 +``` + ### Qwen3.5-397b-a17b Qwen3.5-397b-a17b is a model developed by Qwen to perform text processing, agentic coding, image, and video analysis in several languages. This model was released as a frontier reasoning model on 16 February 2026. @@ -202,6 +282,44 @@ This model was released as a frontier reasoning model on 16 February 2026. qwen/qwen3.5-397b-a17b:int4 ``` +### Qwen3.5-35b-a3b + +Released in March 2026, Qwen3.5-35b-a3b is a state-of-the-art small-sized model optimized for agentic and coding tasks, as well as logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [Qwen3.5-35B-A3B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-GPTQ-Int4) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. + +#### Model name +``` +qwen/qwen3.5-35b-a3b:int4 +``` + +### Qwen3.5-122b-a10b + +Released in March 2026, Qwen3.5-122b-a10b is a state-of-the-art medium-sized model optimized for agentic and coding tasks, as well as logical reasoning. + +| Attribute | Value | +|-----------|-------| +| Provider | Qwen | +| Supports parallel tool-calling | Yes | +| Maximum images per request | 12 | +| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 | +| Hugging Face model card | [Qwen/Qwen3.5-122B-A10B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-122B-A10B-GPTQ-Int4) | + +\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model. 
+
+#### Model name
+```
+qwen/qwen3.5-122b-a10b:int4
+```
+
 ### Pixtral-12b-2409
 Pixtral is a vision language model introducing a novel architecture: 12B parameter multimodal decoder plus 400M parameter vision encoder. It can analyze images and offer insights from visual content alongside text.
@@ -268,158 +386,6 @@ Vision-language models like Molmo can analyze an image and offer insights from v
 allenai/molmo-72b-0924:fp8
 ```

-### Qwen3.6-35b-a3b
-
-Released in April 2026, Qwen3.6-35b-a3b is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Qwen |
-| Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
-| Maximum videos per request | 1 |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2 |
-| Hugging Face model card | [qwen3.6-35b-a3b](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported image formats | |
-| Supported video formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| |
-| Supported languages | | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-qwen/qwen3.6-35b-a3b:bf16
-qwen/qwen3.6-35b-a3b:fp8
-```
-
-### Gemma-4-26b-a4b-it
-
-Released in April 2026, Gemma-4-26b-a4b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Google |
-| Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
-| Supported languages | Out-of-the-box support for 35+ languages, pre-trained on 140+ languages |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (262k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
-| Hugging Face model card | [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported image formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-google/gemma-4-26b-a4b-it:bf16
-```
-
-### Mistral Medium 3.5-128b
-
-Mistral Medium 3.5 is a unified model from Mistral with strong performance for instruct, reasoning, and coding tasks.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Mistral |
-| Supports function calling | Yes |
-| Supports parallel tool-calling | Yes |
-| Supported languages | English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and other languages |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4 (256k), H100-SXM-8 (256k) |
-| Hugging Face model card | [mistralai/Mistral-Medium-3.5-128B](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B) |
-{/* | Supports structured output | |
-| Supported image formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-mistral/mistral-medium-3.5-128b:fp8
-```
-
-### Qwen3.5-35b-a3b
-
-Released in March 2026, Qwen3.5-35b-a3b is a state-of-the-art small-sized model optimized for agentic and coding tasks, as well as logical reasoning.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Qwen |
-| Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
-| Supported languages | 201 languages and dialects |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
-| Hugging Face model card | [Qwen3.5-35B-A3B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-GPTQ-Int4) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported image formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-qwen/qwen3.5-35b-a3b:int4
-```
-
-### Qwen3.5-122b-a10b
-
-Released in March 2026, Qwen3.5-122b-a10b is a state-of-the-art medium-sized model optimized for agentic and coding tasks, as well as logical reasoning.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Qwen |
-| Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
-| Hugging Face model card | [Qwen/Qwen3.5-122B-A10B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-122B-A10B-GPTQ-Int4) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported image formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| |
-| Supported languages | | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-qwen/qwen3.5-122b-a10b:int4
-```
-### Gemma-4-31b-it
-
-Released in April 2026, Gemma-4-31b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Google |
-| Supports function calling | Yes |
-| Maximum images per request | 12 |
-| Supported languages | Out-of-the-box support for 35+ languages, pre-trained on 140+ languages |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100_16 (66k), |
-| Hugging Face model card | [google/gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it) |
-{/* | Supports structured output | |
-| Supports parallel tool-calling | |
-| Supported image formats | |
-| Maximum image resolution (pixels) | |
-| Token dimension (pixels)| | */}
-
-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
-
-#### Model name
-```
-google/gemma-4-31b-it:bf16
-```
-
-
 ## Multimodal models (Text and Audio)

 ### Voxtral-small-24b-2507
@@ -487,6 +453,43 @@ openai/whisper-large-v3:bf16
 ```
 ## Text models
+### Gpt-oss-120b
+Released 5 August 2025, GPT OSS 120B is an open-weight model providing significant throughput performance and reasoning capabilities.
+Currently, this model should be used through Responses API, as Chat Completion does not yet support tool-calling for this model.
+
+| Attribute | Value |
+|-----------|-------|
+| Provider | OpenAI |
+| Supports structured output | Yes |
+| Supports function calling | Yes |
+| Supports parallel tool-calling | Yes |
+| Supported languages | English |
+| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 |
+| Hugging Face model card | [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
+
+\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
+
+#### Model name
+```
+openai/gpt-oss-120b:fp4
+```
+
+### Gpt-oss-20b
+GPT OSS 20b is an OpenAI open-weight model designed for powerful reasoning, agentic tasks, and versatile developer use cases.
+
+| Attribute | Value |
+|-----------|-------|
+| Provider | OpenAI |
+| Supports parallel tool-calling | Yes |
+| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
+| Hugging Face model card | [gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) |
+
+
+#### Model name
+```
+openai/gpt-oss-20b:fp4
+```
+
 ### Qwen3-235b-a22b-instruct-2507
 Released 23 July 2025, Qwen 3 235B A22B is an open-weight model, competitive in multiple benchmarks (such as [LM Arena for text use cases](https://lmarena.ai/leaderboard)) compared to Gemini 2.5 Pro and GPT4.5.
@@ -508,25 +511,20 @@ Released 23 July 2025, Qwen 3 235B A22B is an open-weight model, competitive in
 qwen/qwen3-235b-a22b-instruct-2507
 ```

-### Gpt-oss-120b
-Released 5 August 2025, GPT OSS 120B is an open-weight model providing significant throughput performance and reasoning capabilities.
-Currently, this model should be used through Responses API, as Chat Completion does not yet support tool-calling for this model.
+### Qwen3-235b-a22b-thinking-2507
+Qwen3-235b-a22b-thinking-2507 is a highly capable and versatile language model, optimized for instruction following, logical reasoning through thinking capabilities, and long-context understanding, with enhanced performance across various domains and preferences.

 | Attribute | Value |
 |-----------|-------|
-| Provider | OpenAI |
-| Supports structured output | Yes |
-| Supports function calling | Yes |
-| Supports parallel tool-calling | Yes |
-| Supported languages | English |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 |
-| Hugging Face model card | [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
+| Provider | Qwen |
+| Supports parallel tool-calling | Yes |
+| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2 (100k), H100-SXM-2 (100k), H100-SXM-4 (262k) |
+| Hugging Face model card | [Qwen3-235B-A22B-Thinking-2507-AWQ](https://huggingface.co/QuantTrio/Qwen3-235B-A22B-Thinking-2507-AWQ) |

-\*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.

 #### Model name
 ```
-openai/gpt-oss-120b:fp4
+qwen/qwen3-235b-a22b-thinking-2507:awq
 ```

 ### Llama-3.3-70b-instruct
@@ -592,6 +590,21 @@ Llama 3.1 was designed to match the best proprietary models and outperform many
 meta/llama-3.1-8b-instruct:fp8
 meta/llama-3.1-8b-instruct:bf16
 ```
+### Llama-3-8b-instruct
+Llama-3-8b-instruct is the first generation of 8B-param models by Meta, fine-tuned for instruction and automation.
+
+| Attribute | Value |
+|-----------|-------|
+| Provider | Meta |
+| Compatible Instances (max context in tokens\*) - Dedicated Deployment | L4, L40S, H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
+| Hugging Face model card | [Meta-Llama-3-8B-Instruct-FP8](https://huggingface.co/neuralmagic/Meta-Llama-3-8B-Instruct-FP8) |
+
+
+#### Model name
+```
+meta/llama-3-8b-instruct:fp8
+meta/llama-3-8b-instruct:bf16
+```
 ### Llama-3-70b-instruct
 Meta’s Llama 3 is an iteration of the open-access Llama family.
@@ -761,63 +774,6 @@ mistral/magistral-small-2506:fp8
 mistral/magistral-small-2506:bf16
 ```

-### Qwen3-235B-A22B-Thinking-2507
-Qwen3-235B-A22B-Thinking-2507 is a highly capable and versatile language model, optimized for instruction following, logical reasoning through thinking capabilities, and long-context understanding, with enhanced performance across various domains and preferences.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | Qwen |
-| Supports parallel tool-calling | Yes |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2 (100k), H100-SXM-2 (100k), H100-SXM-4 (262k) |
-| Hugging Face model card | [Qwen3-235B-A22B-Thinking-2507-AWQ](https://huggingface.co/QuantTrio/Qwen3-235B-A22B-Thinking-2507-AWQ) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported languages | | */}
-
-
-#### Model name
-```
-qwen/qwen3-235b-a22b-thinking-2507:awq
-```
-
-### Gpt-oss-20b
-
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | OpenAI |
-| Supports parallel tool-calling | Yes |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
-| Hugging Face model card | [gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported languages | | */}
-
-
-#### Model name
-```
-openai/gpt-oss-20b:fp4
-```
-
-### Llama-3-8b-instruct
-Llama-3-8b-instruct is the first generation of 8B-param models by Meta, fine-tuned for instruction and automation.
-
-| Attribute | Value |
-|-----------|-------|
-| Provider | OpenAI |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | L40, L40S, H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
-| Hugging Face model card | [Meta-Llama-3-8B-Instruct-FP8](https://huggingface.co/neuralmagic/Meta-Llama-3-8B-Instruct-FP8) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supports parallel tool-calling | |
-| Supported languages | | */}
-
-
-#### Model name
-```
-meta/llama-3-8b-instruct:fp8
-```
-
 ## Code models

 ### Devstral-2-123b-instruct-2512
@@ -905,9 +861,6 @@ A state-of-the-art coding and agentic model excelling in tool use, search, and o
 | Supports parallel tool-calling | Yes |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4, H100-SXM-8 |
 | Hugging Face model card | [lukealonso/MiniMax-M2.5-NVFP4](https://huggingface.co/lukealonso/MiniMax-M2.5-NVFP4) |
-{/* | Supports structured output | |
-| Supports function calling | |
-| Supported languages | | */}

 \*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
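
Reviewer note: the `#### Model name` blocks touched by this patch appear to follow a `provider/model:quantization` pattern (for example `openai/gpt-oss-120b:fp4`), with the suffix sometimes absent (`qwen/qwen3-235b-a22b-instruct-2507`). The sketch below splits such a string into its parts, assuming that pattern holds for every entry. `ModelName` and `parse_model_name` are hypothetical helper names used for illustration only; they are not part of any Scaleway SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelName:
    """Parts of a 'provider/model[:quantization]' string (assumed format)."""
    provider: str
    model: str
    quantization: Optional[str]  # None when no ':suffix' is present


def parse_model_name(name: str) -> ModelName:
    """Split a model name such as 'openai/gpt-oss-120b:fp4' into its parts."""
    provider, slash, rest = name.partition("/")
    if not slash or not provider or not rest:
        raise ValueError(f"expected 'provider/model[:quantization]', got {name!r}")
    # Everything after the first ':' is treated as the quantization suffix.
    model, _, quantization = rest.partition(":")
    if not model:
        raise ValueError(f"missing model part in {name!r}")
    return ModelName(provider, model, quantization or None)
```

A check like this could be used to spot entries whose provider prefix does not match the `Provider` row of the same section, such as the `OpenAI` provider listed for `meta/llama-3-8b-instruct` in the removed lines above.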
From 7d3f1f19a0229a10511b140f038078da9fdcdc90 Mon Sep 17 00:00:00 2001
From: villyes
Date: Fri, 22 May 2026 17:55:47 +0200
Subject: [PATCH 3/6] fix(genapis): add missing supported models MTA-7156

---
 pages/generative-apis/reference-content/supported-models.mdx | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index b370932e50..10fae15a1a 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -34,7 +34,7 @@ This page provides a quick overview of available models in Scaleway's catalog an
 | [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Yes | 128k | 32k | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 32k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Code | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-4-31b-it`](#gemma-4-31b-it) | No | 262k | N/A | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
-| [`gemma-4-26b-a4b-it`](#gemma-4-26b-a4b-it) | Yes | 262k | 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`gemma-4-26b-a4b-it`](#gemma-4-26b-a4b-it) | Yes | 256k | 32k | Text, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-3-27b-it`](#gemma-3-27b-it) | Yes | 40k | 8k | Text, Vision | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Yes | 100k (Serverless)/ 128k (Dedicated)| 16k | Text | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | [`llama-3.1-70b-instruct`](#llama-31-70b-instruct) | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | 128k | [EOL for Serverless](#end-of-life-eol-models-for-serverless) | Text | [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blob/main/LICENSE) |
@@ -85,6 +85,7 @@ Released in April 2026, Gemma-4-31b-it is a state-of-the-art small-sized model o
 | Provider | Google |
 | Supports function calling | Yes |
 | Maximum images per request | 12 |
+| Supported languages | English, Chinese, Japanese, Korean, and 136 additional languages |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100_16 (66k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
 | Hugging Face model card | [google/gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it) |

@@ -130,7 +131,7 @@ Released in April 2026, Gemma-4-26b-a4b-it is a state-of-the-art small-sized mod
 | Provider | Google |
 | Supports parallel tool-calling | Yes |
 | Maximum images per request | 12 |
-| Supported languages | English, German, French, Chinese, Japanese, Korean |
+| Supported languages | English, Chinese, Japanese, Korean, and 136 additional languages |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (262k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
 | Hugging Face model card | [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it) |

From 086e7d662a3167ae3d73d0f217f95f2f992b7696 Mon Sep 17 00:00:00 2001
From: villyes
Date: Tue, 26 May 2026 17:41:37 +0200
Subject: [PATCH 4/6] fix(genapis): add missing supported models MTA-7156

---
 .../reference-content/supported-models.mdx | 40 +++++++++++--------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index 10fae15a1a..04f1b9d2f2 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -3,7 +3,7 @@ title: Generative APIs supported models
 description: This page lists the open-source large language models supported by Scaleway.
 tags:
 dates:
-  validation: 2026-05-22
+  validation: 2026-05-26
   posted: 2024-04-18
 ---
 This page provides a quick overview of available models in Scaleway's catalog and their core attributes. Expand any model below to see usage examples and detailed capabilities.
@@ -24,7 +24,7 @@ This page provides a quick overview of available models in Scaleway's catalog an
 | [`gpt-oss-120b`](#gpt-oss-120b)| Yes | 128k | 32k | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gpt-oss-20b`](#gpt-oss-20b) | No | 131k | N/A | Text | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`whisper-large-v3`](#whisper-large-v3) | Yes | - | - | Audio transcription | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
-| [`qwen3.6-35b-a3b`](#qwen36-35b-a3b)| Yes | 262k | 32k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`qwen3.6-35b-a3b`](#qwen36-35b-a3b)| Yes | 256k | 32k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3.5-397b-a17b`](#qwen35-397b-a17b)| Yes | 250k | 16k | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3.5-35b-a3b`](#qwen35-35b-a3b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3.5-122b-a10b`](#qwen35-122b-a10b) | No | 262k | N/A | Text, Code, Vision | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
@@ -78,15 +78,20 @@ This page provides a quick overview of available models in Scaleway's catalog an

 ### Gemma-4-31b-it

-Released in April 2026, Gemma-4-31b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning.
+Released in April 2026, Gemma-4-31b-it is a frontier small-sized model to perform agentic and reasoning tasks on many languages.

 | Attribute | Value |
 |-----------|-------|
 | Provider | Google |
-| Supports function calling | Yes |
-| Maximum images per request | 12 |
-| Supported languages | English, Chinese, Japanese, Korean, and 136 additional languages |
-| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100_16 (66k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
+| Supports structured output | Yes |
+| Supports function calling | Yes |
+| Supports parallel tool calling | Yes |
+| Supported reasoning efforts | `none`, `low`, `medium`, `high` |
+| Supported image formats | PNG, JPEG, WEBP, and non-animated GIFs |
+| Maximum image resolution (pixels) | 896x896 |
+| Token dimension (pixels)| 64x64 |
+| Supported languages | English, Chinese, Japanese, Korean, and 136 additional languages |
+| Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (66k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
 | Hugging Face model card | [google/gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it) |

 \*Maximum context length is only mentioned when an instance's VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
@@ -124,13 +129,19 @@ google/gemma-3-27b-it:bf16

 ### Gemma-4-26b-a4b-it

-Released in April 2026, Gemma-4-26b-a4b-it is a state-of-the-art small-sized model optimized for agentic tasks and logical reasoning.
+Released in April 2026, Gemma-4-26b-a4b-it is a frontier small-sized model to perform agentic and reasoning tasks on many languages.
+This model has a Mixture-of-Expert (MoE) architecture, providing significant throughput and fitting on a single H100 GPU while supporting its maximum context size.

 | Attribute | Value |
 |-----------|-------|
 | Provider | Google |
+| Supports structured output | Yes |
+| Supports function calling | Yes |
 | Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
+| Supported reasoning efforts | `none`, `low`, `medium`, `high` |
+| Supported image formats | PNG, JPEG, WEBP, and non-animated GIFs |
+| Maximum image resolution (pixels) | 896x896 |
+| Token dimension (pixels) | 64x64 |
 | Supported languages | English, Chinese, Japanese, Korean, and 136 additional languages |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 (262k), H100-2 (262k), H100-SXM-2 (262k), H100-SXM-4 |
 | Hugging Face model card | [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it) |
@@ -244,8 +255,6 @@ Released in April 2026, Qwen3.6-35b-a3b is a state-of-the-art small-sized model
 |-----------|-------|
 | Provider | Qwen |
 | Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
-| Maximum videos per request | 1 |
 | Supported languages | English, French, Portuguese, German, Romanian, Swedish, and 70 additional languages and dialects |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2 |
 | Hugging Face model card | [qwen3.6-35b-a3b](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) |
@@ -268,6 +277,7 @@ This model was released as a frontier reasoning model on 16 February 2026.
 | Supports structured output | Yes |
 | Supports function calling | Yes |
 | Supports parallel tool-calling | Yes |
+| Supported reasoning efforts | `none`, `low`, `medium`, `high` |
 | Supported image formats | PNG, JPEG, WEBP, and non-animated GIFs |
 | Supported video formats | MP4, MPEG, MOV, OGG and WEBM |
 | Maximum image resolution (pixels) | 4096x4096 |
@@ -291,7 +301,6 @@ Released in March 2026, Qwen3.5-35b-a3b is a state-of-the-art small-sized model
 |-----------|-------|
 | Provider | Qwen |
 | Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
 | Hugging Face model card | [Qwen3.5-35B-A3B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-GPTQ-Int4) |

@@ -310,7 +319,6 @@ Released in March 2026, Qwen3.5-122b-a10b is a state-of-the-art medium-sized mod
 |-----------|-------|
 | Provider | Qwen |
 | Supports parallel tool-calling | Yes |
-| Maximum images per request | 12 |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
 | Hugging Face model card | [Qwen/Qwen3.5-122B-A10B-GPTQ-Int4](https://huggingface.co/Qwen/Qwen3.5-122B-A10B-GPTQ-Int4) |

@@ -464,6 +472,7 @@ Currently, this model should be used through Responses API, as Chat Completion d
 | Supports structured output | Yes |
 | Supports function calling | Yes |
 | Supports parallel tool-calling | Yes |
+| Supported reasoning efforts | `low`, `medium`, `high` |
 | Supported languages | English |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100 |
 | Hugging Face model card | [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
@@ -518,7 +527,7 @@ Qwen3-235b-a22b-thinking-2507 is a highly capable and versatile language model,
 | Attribute | Value |
 |-----------|-------|
 | Provider | Qwen |
-| Supports parallel tool-calling | Yes |
+| Supports parallel tool-calling | Yes |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-2 (100k), H100-SXM-2 (100k), H100-SXM-4 (262k) |
 | Hugging Face model card | [Qwen3-235B-A22B-Thinking-2507-AWQ](https://huggingface.co/QuantTrio/Qwen3-235B-A22B-Thinking-2507-AWQ) |

@@ -598,12 +607,11 @@ Llama-3-8b-instruct is the first generation of 8B-param models by Meta, fine-tun
 |-----------|-------|
 | Provider | Meta |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | L4, L40S, H100, H100-2, H100-SXM-2, H100-SXM-4, H100-SXM-8 |
-| Hugging Face model card | [Meta-Llama-3-8B-Instruct-FP8](https://huggingface.co/neuralmagic/Meta-Llama-3-8B-Instruct-FP8) |
+| Hugging Face model card | [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |


 #### Model name
 ```
-meta/llama-3-8b-instruct:fp8
 meta/llama-3-8b-instruct:bf16
 ```

From a12ff75b4ce32691393aff1bbfc5b4d615023ee7 Mon Sep 17 00:00:00 2001
From: vanda-scw
Date: Fri, 29 May 2026 17:46:53 +0200
Subject: [PATCH 5/6] Fix info re. supported feature

Co-authored-by: fpagny
---
 .../generative-apis/reference-content/supported-models.mdx | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index 04f1b9d2f2..af86e2be0c 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -254,7 +254,13 @@ Released in April 2026, Qwen3.6-35b-a3b is a state-of-the-art small-sized model
 | Attribute | Value |
 |-----------|-------|
 | Provider | Qwen |
+| Supports structured output | Yes |
+| Supports function calling | Yes |
 | Supports parallel tool-calling | Yes |
+| Supported reasoning efforts | `none`, `low`, `medium`, `high` |
+| Supported image formats | PNG, JPEG, GIF |
+| Maximum image resolution (pixels) | 4096x4096 |
+| Token dimension (pixels) | 32x32 |
 | Supported languages | English, French, Portuguese, German, Romanian, Swedish, and 70 additional languages and dialects |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100, H100-2, H100-SXM-2 |
 | Hugging Face model card | [qwen3.6-35b-a3b](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) |

From 5920c25270998095cb5d413530b5b65c9822efcc Mon Sep 17 00:00:00 2001
From: vanda-scw
Date: Fri, 29 May 2026 18:08:09 +0200
Subject: [PATCH 6/6] Fix info related to supported features

Co-authored-by: fpagny
---
 pages/generative-apis/reference-content/supported-models.mdx | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index af86e2be0c..1bb44a8944 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -181,8 +181,13 @@ Mistral-medium-3.5 is a unified model from Mistral with strong performance for i
 | Attribute | Value |
 |-----------|-------|
 | Provider | Mistral |
+| Supports structured output | Yes |
 | Supports function calling | Yes |
 | Supports parallel tool-calling | Yes |
+| Supported reasoning efforts | `none`, `high` |
+| Supported image formats | PNG, JPEG, WEBP, GIF |
+| Maximum image resolution (pixels) | 1540x1540 |
+| Token dimension (pixels) | 28x28 |
 | Supported languages | English, French, German, Spanish, Portuguese, Italian, and 18 additional languages and dialects |
 | Compatible Instances (max context in tokens\*) - Dedicated Deployment | H100-SXM-4 (256k), H100-SXM-8 (256k) |
 | Hugging Face model card | [mistralai/Mistral-Medium-3.5-128B](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B) |
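
Reviewer note: the `Maximum image resolution (pixels)` and `Token dimension (pixels)` rows added in these patches can be combined into a rough upper bound on tokens per image. The sketch below assumes each token covers exactly one patch of the stated dimension and that no separator or special tokens are added; the patches do not state the exact accounting, so treat the result as an estimate. `max_image_tokens` is a hypothetical helper name.

```python
import math
from typing import Tuple


def max_image_tokens(resolution: Tuple[int, int], token_dim: Tuple[int, int]) -> int:
    """Estimate tokens for an image at `resolution`, one token per patch.

    Assumption: partial patches at the edges are rounded up to a full token.
    """
    width, height = resolution
    patch_width, patch_height = token_dim
    columns = math.ceil(width / patch_width)
    rows = math.ceil(height / patch_height)
    return columns * rows


# Values taken from the attribute tables in the patches above.
mistral_medium = max_image_tokens((1540, 1540), (28, 28))  # 55 x 55 patches
gemma_4 = max_image_tokens((896, 896), (64, 64))           # 14 x 14 patches
qwen_3_6 = max_image_tokens((4096, 4096), (32, 32))        # 128 x 128 patches
```

Under these assumptions a full-resolution image costs about 3,025 tokens on Mistral-medium-3.5, 196 on the Gemma 4 models, and 16,384 on Qwen3.6-35b-a3b, which is worth weighing against each model's context window.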