
Commit bb7fdcb

techpro-aimlapigitbook-bot authored and committed
GITBOOK-489: docs: add 2 nemotrons nano
1 parent a9f5246 commit bb7fdcb

9 files changed: 333 additions & 5 deletions


docs/SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -89,6 +89,8 @@
 * [hermes-4-405b](api-references/text-models-llm/nousresearch/hermes-4-405b.md)
 * [NVIDIA](api-references/text-models-llm/NVIDIA/README.md)
 * [llama-3.1-nemotron-70b](api-references/text-models-llm/NVIDIA/llama-3.1-nemotron-70b.md)
+* [nemotron-nano-9b-v2](api-references/text-models-llm/nvidia/nemotron-nano-9b-v2.md)
+* [nemotron-nano-12b-v2-vl](api-references/text-models-llm/nvidia/llama-3.1-nemotron-70b-1.md)
 * [OpenAI](api-references/text-models-llm/OpenAI/README.md)
 * [gpt-3.5-turbo](api-references/text-models-llm/OpenAI/gpt-3.5-turbo.md)
 * [gpt-4](api-references/text-models-llm/OpenAI/gpt-4.md)

docs/api-references/model-database.md

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

docs/api-references/text-models-llm/README.md

Lines changed: 3 additions & 1 deletion
Large diffs are not rendered by default.

docs/api-references/text-models-llm/meta/llama-4-scout.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 ## Model Overview

-A 17 billion active parameter model with 16 experts, is the best multimodal model in the world in its class and is more powerful than all previous generation Llama models. Additionally, the model offers an industry-leading context window of 10M and delivers better results than [Gemma 3](../google/gemma-3.md), Gemini 2.0 Flash-Lite, and Mistral 3.1 on a wide range of common benchmarks.
+A 17 billion active parameter model with 16 experts, is the best multimodal model in the world in its class and is more powerful than all previous generation Llama models. Additionally, the model offers an industry-leading context window of 1M and delivers better results than [Gemma 3](../google/gemma-3.md), Gemini 2.0 Flash-Lite, and Mistral 3.1 on a wide range of common benchmarks.

 ## How to Make a Call
Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
# nemotron-nano-12b-v2-vl

<table data-header-hidden data-full-width="true"><thead><tr><th width="546.4443969726562" valign="top"></th><th width="202.666748046875" valign="top"></th></tr></thead><tbody><tr><td valign="top"><div data-gb-custom-block data-tag="hint" data-style="info" class="hint hint-info"><p>This documentation is valid for the following list of our models:</p><ul><li><code>nvidia/nemotron-nano-12b-v2-vl</code></li></ul></div></td><td valign="top"><a href="https://aimlapi.com/app/?model=nvidia/nemotron-nano-12b-v2-vl&#x26;mode=chat" class="button primary">Try in Playground</a></td></tr></tbody></table>

## Model Overview

The model offers strong document understanding and summarization capabilities.
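Since this is a vision-language model, a document page can be passed as an image alongside a text instruction using an OpenAI-style multimodal `content` array. The sketch below is an assumption about the accepted message shape (confirm the supported content types against the API schema on this page); the image URL is a placeholder:

```python
# A user message that pairs a text instruction with an image part.
# The "image_url" part follows the OpenAI-compatible convention; verify it
# against the API schema. The URL is a placeholder - substitute your own
# hosted document image.
payload = {
    "model": "nvidia/nemotron-nano-12b-v2-vl",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document page."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/page-1.png"},
                },
            ],
        }
    ],
}
```

Send `payload` with the same `requests.post` call shown in the Code Example section below; only the `messages` shape differs.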
## How to Make a Call

<details>

<summary>Step-by-Step Instructions</summary>

### :digit\_one: Setup You Can’t Skip

:black\_small\_square: [**Create an Account**](https://aimlapi.com/app/sign-up): Visit the AI/ML API website and create an account (if you don’t have one yet).\
:black\_small\_square: [**Generate an API Key**](https://aimlapi.com/app/keys): After logging in, navigate to your account dashboard and generate your API key. Ensure that the key is enabled in the UI.

### :digit\_two: Copy the code example

At the bottom of this page, you'll find [a code example](llama-3.1-nemotron-70b-1.md#code-example) that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

### :digit\_three: Modify the code example

:black\_small\_square: Replace `<YOUR_AIMLAPI_KEY>` with your actual AI/ML API key from your account.\
:black\_small\_square: Insert your question or request into the `content` field—this is what the model will respond to.

### :digit\_four: <sup><sub><mark style="background-color:yellow;">(Optional)</mark></sub></sup> Adjust other optional parameters if needed

Only `model` and `messages` are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding [API schema](llama-3.1-nemotron-70b-1.md#api-schema), which lists all available parameters along with notes on how to use them.
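As an illustration, the request body from the Python code example can be extended with common sampling options. The names below (`temperature`, `max_tokens`, `top_p`) are typical chat-completion parameters, not taken from this page; verify each one against the API schema before relying on it:

```python
# Extra sampling knobs added alongside the required "model" and "messages".
# These parameter names are common chat-completion options; confirm each one
# is listed in the API schema before use.
request_body = {
    "model": "nvidia/nemotron-nano-12b-v2-vl",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.2,   # lower values make output more deterministic
    "max_tokens": 512,    # cap on the number of generated tokens
    "top_p": 0.9,         # nucleus sampling threshold
}
```

Pass `request_body` as the `json=` argument of `requests.post`, exactly as in the code example below.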
### :digit\_five: Run your modified code

Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

{% hint style="success" %}
If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our [Quickstart guide](../../../quickstart/setting-up.md).
{% endhint %}

</details>
## API Schema

{% openapi-operation spec="nemotron-nano-12b-v2-vl" path="/v1/chat/completions" method="post" %}
[OpenAPI nemotron-nano-12b-v2-vl](https://raw.githubusercontent.com/aimlapi/api-docs/refs/heads/main/docs/api-references/text-models-llm/NVIDIA/nemotron-nano-12b-v2-vl.json)
{% endopenapi-operation %}

## Code Example

{% tabs %}
{% tab title="Python" %}
{% code overflow="wrap" %}
```python
import requests
import json  # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "nvidia/nemotron-nano-12b-v2-vl",
        "messages": [
            {
                "role": "user",
                "content": "Hello"  # insert your prompt here, instead of Hello
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
```
{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}
{% code overflow="wrap" %}
```javascript
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'nvidia/nemotron-nano-12b-v2-vl',
      messages: [
        {
          role: 'user',
          content: 'Hello' // insert your prompt here, instead of Hello
        }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
```
{% endcode %}
{% endtab %}
{% endtabs %}
<details>

<summary>Response</summary>

{% code overflow="wrap" %}
```json5
{
  "id": "gen-1762343744-rdCcOL8byCQwRBZ8QCkv",
  "provider": "DeepInfra",
  "model": "nvidia/nemotron-nano-12b-v2-vl",
  "object": "chat.completion",
  "created": 1762343744,
  "choices": [
    {
      "logprobs": null,
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nHello! How can I assist you today?\n",
        "refusal": null,
        "reasoning": "Okay, the user said \"Hello\". Let me start by greeting them back in a friendly and welcoming way. I should keep it simple and approachable, maybe something like \"Hello! How can I assist you today?\" That should work. I want to make sure they feel comfortable and open to asking for help. Let me check if there's anything else I need to add. No, keeping it straightforward is best here. Ready to respond.\n",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "Okay, the user said \"Hello\". Let me start by greeting them back in a friendly and welcoming way. I should keep it simple and approachable, maybe something like \"Hello! How can I assist you today?\" That should work. I want to make sure they feel comfortable and open to asking for help. Let me check if there's anything else I need to add. No, keeping it straightforward is best here. Ready to respond.\n",
            "format": "unknown",
            "index": 0
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 102,
    "total_tokens": 116,
    "prompt_tokens_details": null
  }
}
```
{% endcode %}

</details>
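Given the response shape shown above, the assistant's text can be pulled out of the parsed JSON with a short helper. A minimal sketch, where `data` is the dict returned by `response.json()` in the Python code example:

```python
def get_reply(data):
    """Return the assistant's text from a chat-completion response dict."""
    return data["choices"][0]["message"]["content"].strip()

# A trimmed-down stand-in for the response shown above:
data = {
    "choices": [
        {"message": {"role": "assistant", "content": "\n\nHello! How can I assist you today?\n"}}
    ]
}
print(get_reply(data))  # -> Hello! How can I assist you today?
```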
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# nemotron-nano-9b-v2

<table data-header-hidden data-full-width="true"><thead><tr><th width="546.4443969726562" valign="top"></th><th width="202.666748046875" valign="top"></th></tr></thead><tbody><tr><td valign="top"><div data-gb-custom-block data-tag="hint" data-style="info" class="hint hint-info"><p>This documentation is valid for the following list of our models:</p><ul><li><code>nvidia/nemotron-nano-9b-v2</code></li></ul></div></td><td valign="top"><a href="https://aimlapi.com/app/?model=nvidia/nemotron-nano-9b-v2&#x26;mode=chat" class="button primary">Try in Playground</a></td></tr></tbody></table>

## Model Overview

A unified model designed for both reasoning and non-reasoning tasks. It processes user inputs by first producing a reasoning trace, then delivering a final answer. The reasoning behavior can be adjusted through the system prompt — allowing the model to either show its intermediate reasoning steps or respond directly with the final result.\
The model offers strong document understanding and summarization capabilities.
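One way to use that system-prompt switch is to prepend a `system` message before the user turn. The exact control phrase is model-specific and not documented on this page: the `/think` and `/no_think` strings below are assumptions to be checked against NVIDIA's model card. A hedged sketch:

```python
# Build two request bodies that differ only in the system message.
# "/think" and "/no_think" are assumed toggle phrases - confirm the exact
# wording in NVIDIA's Nemotron Nano v2 model card before using them.
def build_request(user_prompt, show_reasoning):
    system_text = "/think" if show_reasoning else "/no_think"
    return {
        "model": "nvidia/nemotron-nano-9b-v2",
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_prompt},
        ],
    }

request_with_trace = build_request("Summarize this paragraph.", show_reasoning=True)
request_direct = build_request("Summarize this paragraph.", show_reasoning=False)
```

Either dict can be passed as the `json=` argument of the `requests.post` call shown in the code example below.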
## How to Make a Call

<details>

<summary>Step-by-Step Instructions</summary>

### :digit\_one: Setup You Can’t Skip

:black\_small\_square: [**Create an Account**](https://aimlapi.com/app/sign-up): Visit the AI/ML API website and create an account (if you don’t have one yet).\
:black\_small\_square: [**Generate an API Key**](https://aimlapi.com/app/keys): After logging in, navigate to your account dashboard and generate your API key. Ensure that the key is enabled in the UI.

### :digit\_two: Copy the code example

At the bottom of this page, you'll find [a code example](nemotron-nano-9b-v2.md#code-example) that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

### :digit\_three: Modify the code example

:black\_small\_square: Replace `<YOUR_AIMLAPI_KEY>` with your actual AI/ML API key from your account.\
:black\_small\_square: Insert your question or request into the `content` field—this is what the model will respond to.

### :digit\_four: <sup><sub><mark style="background-color:yellow;">(Optional)</mark></sub></sup> Adjust other optional parameters if needed

Only `model` and `messages` are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding [API schema](nemotron-nano-9b-v2.md#api-schema), which lists all available parameters along with notes on how to use them.
### :digit\_five: Run your modified code

Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

{% hint style="success" %}
If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our [Quickstart guide](../../../quickstart/setting-up.md).
{% endhint %}

</details>
## API Schema

{% openapi-operation spec="nemotron-nano-9b-v2" path="/v1/chat/completions" method="post" %}
[OpenAPI nemotron-nano-9b-v2](https://raw.githubusercontent.com/aimlapi/api-docs/refs/heads/main/docs/api-references/text-models-llm/NVIDIA/nemotron-nano-9b-v2.json)
{% endopenapi-operation %}

## Code Example

{% tabs %}
{% tab title="Python" %}
{% code overflow="wrap" %}
```python
import requests
import json  # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "nvidia/nemotron-nano-9b-v2",
        "messages": [
            {
                "role": "user",
                "content": "Hello"  # insert your prompt here, instead of Hello
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
```
{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}
{% code overflow="wrap" %}
```javascript
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'nvidia/nemotron-nano-9b-v2',
      messages: [
        {
          role: 'user',
          content: 'Hello' // insert your prompt here, instead of Hello
        }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
```
{% endcode %}
{% endtab %}
{% endtabs %}
<details>

<summary>Response</summary>

{% code overflow="wrap" %}
```json5
{
  "id": "gen-1762343928-hETm6La6igsboRxBM0fa",
  "provider": "DeepInfra",
  "model": "nvidia/nemotron-nano-9b-v2",
  "object": "chat.completion",
  "created": 1762343928,
  "choices": [
    {
      "logprobs": null,
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nHello! How can I assist you today? 😊\n",
        "refusal": null,
        "reasoning": "Okay, the user just said \"Hello\". That's a greeting. I should respond politely. Let me make sure to acknowledge their greeting and offer help. Maybe say something like \"Hello! How can I assist you today?\" That's friendly and opens the door for them to ask questions. I should keep it simple and welcoming.\n",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "Okay, the user just said \"Hello\". That's a greeting. I should respond politely. Let me make sure to acknowledge their greeting and offer help. Maybe say something like \"Hello! How can I assist you today?\" That's friendly and opens the door for them to ask questions. I should keep it simple and welcoming.\n",
            "format": "unknown",
            "index": 0
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 84,
    "total_tokens": 98,
    "prompt_tokens_details": null
  }
}
```
{% endcode %}

</details>
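Since this model returns both a `reasoning` field and the final `content` (see the response above), the two can be separated after parsing. A minimal sketch over the dict returned by `response.json()`, using `.get()` so it also works when the reasoning trace is absent:

```python
def split_reply(data):
    """Return (reasoning_trace, final_answer) from a chat-completion response dict."""
    message = data["choices"][0]["message"]
    reasoning = (message.get("reasoning") or "").strip()
    answer = (message.get("content") or "").strip()
    return reasoning, answer

# A trimmed-down, illustrative stand-in for the response shown above:
data = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "\n\nHi there!\n",
                "reasoning": "The user greeted me.",
            }
        }
    ]
}
trace, answer = split_reply(data)  # -> ("The user greeted me.", "Hi there!")
```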
