scaleway · vanda-scw · Jun 5, 2026 · Jun 4, 2026 · Jun 4, 2026 · Jun 4, 2026
@@ -6,5 +6,5 @@ category: ai-data
 product: generative-apis
 ---
 
-The [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-beta-create-a-response) is now generally available. The Responses API is recommended for use only with the `gpt-oss-120b` model. For more information, see the [Chat Completions and Responses API comparison](/managed-inference/reference-content/openai-compatibility/). 
+The [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses) is now generally available. The Responses API is recommended for use only with the `gpt-oss-120b` model. For more information, see the [Chat Completions and Responses API comparison](/managed-inference/reference-content/openai-compatibility/). 
 
@@ -8,5 +8,5 @@ product: generative-apis
 
 [Voxtral Small 2507](/generative-apis/reference-content/supported-models/) is now available on Generative APIs.
 
-Voxtral is a frontier chat and audio model that can transcribe or understand audio files using [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion).
+Voxtral is a frontier chat and audio model that can transcribe or understand audio files using [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions).
 
@@ -2,7 +2,7 @@
 macro: chat-comp-vs-responses-api
 ---
 
-Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) and the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-beta-create-a-response) are OpenAI-compatible REST APIs that can be used for generating and manipulating conversations, structured outputs, tool use, and multimodal inputs. The Chat Completions API is focused on generating conversational responses, while the Responses API is a more general REST API providing additional features such as stateful conversations and tool execution.
+Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) and the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses) are OpenAI-compatible REST APIs that can be used for generating and manipulating conversations, structured outputs, tool use, and multimodal inputs. The Chat Completions API is focused on generating conversational responses, while the Responses API is a more general REST API providing additional features such as stateful conversations and tool execution.
 
 The **Chat Completions** API was released in 2023, and is an industry standard for building AI applications, being initially designed for handling multi-turn conversations. It is stateless, but allows users to manage conversation history by appending each new message to the ongoing conversation. Messages in the conversation can include text, images, and audio extracts. The API supports `function` tool-calling, allowing developers to define functions that the model can choose to call. If it does so, it returns the function name and arguments, which the developer's code must execute and feed back into the conversation.
 

@@ -12,7 +12,7 @@ Scaleway's Generative APIs service allows users to interact with powerful audio
 
 There are several ways to interact with audio models:
 - The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/generative-apis/how-to/query-audio-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time.
-- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription)
+- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) or the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription)
 - Via your own [dedicated deployment](/generative-apis/how-to/create-deployment/) of a chosen model
 
 <Requirements />
@@ -47,7 +47,7 @@ In the example that follows, we will use the OpenAI Python client.
 
 ###  Audio Transcriptions API or Chat Completions API?
 
-Both the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription) and the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion)  are OpenAI-compatible REST APIs that accept audio input.
+Both the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription) and the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions)  are OpenAI-compatible REST APIs that accept audio input.
 
 The **Audio Transcriptions API** is designed for pure speech-to-text (audio transcription) tasks, such as transcribing a voice note or meeting recording file. It can be used with compatible audio models, such as `whisper-large-v3`.
 
@@ -213,6 +213,6 @@ You can now generate a text transcription of a given audio file using a suitable
     print(response.choices[0].message.content)
     ```
 
-    Various parameters such as `temperature` and `max_tokens` control the output. See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) for a full list of all available parameters.
+    Various parameters such as `temperature` and `max_tokens` control the output. See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) for a full list of all available parameters.
     </TabsTab>
 </Tabs>
@@ -13,7 +13,7 @@ Scaleway's Generative APIs service allows users to interact with powerful langua
 
 There are several ways to interact with language models:
 - The Scaleway [console](https://console.scaleway.com) provides complete [playground](/generative-apis/how-to/query-language-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time
-- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-create-a-response)
+- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses)
 - Via your own [dedicated deployment](/generative-apis/how-to/create-deployment/) of a chosen model
 
 <Requirements />
@@ -153,7 +153,7 @@ The following parameters will influence the output of the model:
   - **`top_p`**: Recommended for advanced use cases only. You usually only need to use temperature. `top_p` controls the diversity of the output, using nucleus sampling, where the model considers the tokens with top probabilities until the cumulative probability reaches `top_p`.
   - **`stop`**: A string or list of strings where the model will stop generating further tokens. This is useful for controlling the end of the output.
 
-  See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) for a full list of all available parameters.
+  See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) for a full list of all available parameters.
 
   </TabsTab>
 
@@ -165,7 +165,7 @@ are enforced for each model, to avoid edge cases where tokens are generated inde
   - **`temperature`**: Controls the output's randomness. Lower values (e.g., 0.2) make the output more deterministic, while higher values (e.g., 0.8) make it more creative.
   - **`top_p`**: Recommended for advanced use cases only. You usually only need to use temperature. `top_p` controls the diversity of the output, using nucleus sampling, where the model considers the tokens with top probabilities until the cumulative probability reaches `top_p`.
 
-  See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-create-a-response) for a full list of all available parameters.
+  See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/responses) for a full list of all available parameters.
 
   </TabsTab>
 </Tabs>

@@ -17,7 +17,7 @@ Language models supporting the reasoning feature include `gpt-oss-120b`. See [Su
 You can interact with reasoning models in the following ways:
 
 - Use the [playground](/generative-apis/how-to/query-reasoning-models/#accessing-the-playground) in the Scaleway [console](https://console.scaleway.com) to test models, adapt parameters, and observe how your changes affect the output in real-time
-- Use the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-create-a-response)
+- Use the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses)
 - Use your own [dedicated deployment](/generative-apis/how-to/create-deployment/) of a chosen model
 
 <Requirements />
@@ -58,7 +58,7 @@ In the example that follows, we will use the OpenAI Python client.
 
 ### Chat Completions API or Responses API?
 
-Both the [Chat Completions API](/https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) and the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) allow you to access and control reasoning for supported models.
+Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) and the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses) allow you to access and control reasoning for supported models.
 
 For more details on Chat Completions versus Responses API, see the information provided in the [querying language models](/generative-apis/how-to/query-language-models/#chat-completions-api-or-responses-api) documentation.
 
@@ -299,4 +299,3 @@ data: {
 ## Impact on token generation
 
 Reasoning models generate reasoning tokens, which are billable. Generally these are in the model's output as part of the reasoning content. To limit the generation of reasoning tokens, you can adjust settings for the `reasoning_effort` and `max_completion_tokens` / `max_output_tokens` parameters. Alternatively, use a non-reasoning model to avoid the generation of reasoning tokens and subsequent billing.
-
@@ -17,7 +17,7 @@ Scaleway's Generative APIs service allows users to interact with powerful vision
 There are several ways to interact with vision models:
 
 - **Scaleway console playground**: The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/generative-apis/quickstart/#interacting-with-generative-apis-via-the-playground) for Generative APIs. This visual interface allows you to test models, adapt query parameters, and observe how these changes affect the output in real-time.
-- **[Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion)**: Use the chat completions API to query vision models programmatically.
+- **[Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions)**: Use the chat completions API to query vision models programmatically.
 - **Your own [dedicated deployment](/generative-apis/how-to/create-deployment/)**: Deploy a model on your own Instance and interact with the model in an isolated environment
 
 <Requirements />

@@ -18,7 +18,7 @@ There are two main modes for generating JSON: **Object Mode** (schemaless) and *
 
 There are several ways to interact with language models:
 - The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/generative-apis/how-to/query-language-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time.
-- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-create-a-response)
+- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/responses)
 
 <Requirements />
Original file line number	Diff line number	Diff line change
Expand Up		@@ -8,5 +8,5 @@ product: generative-apis

		[Voxtral Small 2507](/generative-apis/reference-content/supported-models/) is now available on Generative APIs.

		Voxtral is a frontier chat and audio model that can transcribe or understand audio files using [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion).
		Voxtral is a frontier chat and audio model that can transcribe or understand audio files using [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/chat-completions).