Conversation

**Important: Review skipped.** Review was skipped due to path filters.

⛔ Files ignored due to path filters (1)

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters.

⚙️ Run configuration

**Configuration used**: Path: .coderabbit.yaml
**Review profile**: CHILL
**Plan**: Pro
📝 Walkthrough

**Walkthrough**

Adds a new standalone Nemotron-powered fraud-detection voice agent example with a custom `NemotronLLM` subclass.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant User as User/Voice
    participant STT as Deepgram STT
    participant Agent as Agent Edge
    participant LLM as NemotronLLM
    participant Tools as Tool Functions
    participant TTS as Deepgram TTS
    User->>STT: voice input
    STT->>Agent: transcript
    Agent->>LLM: messages + tool definitions
    LLM->>LLM: strip <think>, parse <tool_call> blocks
    alt tool_call found
        LLM->>Tools: execute tool_call (repeatable, ≤5 rounds)
        Tools-->>LLM: tool result (JSON)
        LLM->>LLM: append <tool_response>, re-invoke for final reply
    end
    LLM->>Agent: stream events (chunks, complete, timings)
    Agent->>TTS: response text
    TTS->>User: synthesized speech
```
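The `<think>`-stripping and `<tool_call>` loop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the example's actual implementation; the tag names and the five-round cap come from the diagram, while the function names and JSON shapes are assumptions:

```python
import json
import re

THINK_END_TAG = "</think>"
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
MAX_TOOL_ROUNDS = 5  # matches the "≤5 rounds" cap in the diagram


def strip_thinking(text: str) -> str:
    """Drop everything up to and including the closing think tag."""
    end = text.find(THINK_END_TAG)
    if end == -1:
        return text
    return text[end + len(THINK_END_TAG):].lstrip()


def run_tool_loop(generate, execute_tool):
    """Call generate(messages) repeatedly until no <tool_call> block remains."""
    messages = []
    reply = strip_thinking(generate(messages))
    for _ in range(MAX_TOOL_ROUNDS):
        match = TOOL_CALL_RE.search(reply)
        if match is None:
            return reply  # final user-facing text
        call = json.loads(match.group(1))
        result = execute_tool(call["name"], call.get("arguments", {}))
        # Feed the result back wrapped in <tool_response>, then re-invoke.
        messages.append({
            "role": "tool",
            "content": f"<tool_response>{json.dumps(result)}</tool_response>",
        })
        reply = strip_thinking(generate(messages))
    return reply
```

A `generate` callable standing in for the Nemotron completion call can be swapped in for testing; the cap prevents unbounded tool recursion.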
**Estimated code review effort**: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 **Pre-merge checks** | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
🧹 Nitpick comments (2)
plugins/openai/examples/nemotron_example/nemotron_example.py (2)
**76-96: Remove embedded dead code from the prompt string.**

The `SUMMARIZE_PROMPT` string contains an entire duplicate of `INSTRUCTIONS` embedded in triple quotes, serving no purpose but confusion. This vestigial fragment should be removed.

🧹 Proposed fix

```diff
 SUMMARIZE_PROMPT = (
     "Rewrite the following as something a real person would say on the phone — casual, warm, natural. "
     "Two to three short sentences, around 25-35 words. No jargon, no IDs, no markdown, no em-dashes, no bullet points.\n\n"
     "Keep all key info about actions taken (frozen card, cancelled charge, replacement card, refund timeline). "
     "Example tone: 'Okay so I've frozen your card and cancelled that charge. The money should be back in a couple of days. Want me to send you a new card?'\n\n"
     "Do not use any special characters that only make sense in writing like brackets.\n"
-
-    "here's the prompt i gave to the llm that generates the text you review:\n\n"
-
-""" INSTRUCTIONS = (
-    "detailed thinking off\n\n"
-    "You are a bank fraud phone agent on a live phone call. Your output goes directly to TTS.\n\n"
-    "When you spot a suspicious transaction, explain WHY — e.g. 'There's a large charge in Miami but you live in London.' "
-    "Suspicious means: different city from the customer's home, or unusually large compared to their other transactions.\n"
-    "If you flag a transaction, the card should be frozen immediately to prevent further fraud, but only after you have explicitly confirmed with the customer that they did not make the transaction.\n\n"
-    "If you freeze a card due to a transaction, you should also offer to issue a replacement card. If the customer confirms, issue a virtual card immediately and a physical card that arrives in 3-5 business days.\n\n"
-    "If a transaction is fraudulent, you should also cancel the charge to return the money to the customer's account. This typically takes 24-48 hours.\n\n"
-    "The user cannot see what you can see. They are on the phone."
-    "RULES:\n"
-    "- MAXIMUM 15 words per response. One short sentence. Responses over 15 words get cut off.\n"
-    "- NEVER take action (flag, freeze, cancel) without the customer explicitly confirming.\n"
-    "- After taking any action, always tell the customer what you did.\n"
-    "- Work one step at a time: look up info, briefly tell the customer what you found, ask what to do.\n"
-    "- Do NOT list data, read IDs, dates, or dollar amounts aloud.\n"
-    "- Speak casually like a real person. No markdown, no bullet points.\n"
-)
-"""
 )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 76-96: the prompt string contains an extraneous duplicated block wrapped in triple quotes (a dead-code fragment) that should be removed; locate the duplicated `INSTRUCTIONS` text in nemotron_example.py and delete the entire triple-quoted duplicate so only the intended string literal remains, ensuring quotes and parentheses around the prompt remain balanced and no other prompt text is altered.
**207-211: Replace `print()` statements with logger calls.**

These debug artifacts should not persist. Use `logger.debug()` for development tracing, consistent with the module's logging pattern.

♻️ Proposed fix

```diff
-            print(f">>> THINK TAG FOUND at pos {think_end} / {len(full_text)} chars")
+            logger.debug("THINK TAG FOUND at pos %d / %d chars", think_end, len(full_text))
             full_text = full_text[think_end + len(THINK_END_TAG) :].lstrip()
-            print(f">>> AFTER STRIP: {full_text[:200]}")
+            logger.debug("AFTER STRIP: %s", full_text[:200])
         else:
-            print(f">>> NO THINK TAG. Raw: {full_text[:300]}")
+            logger.debug("NO THINK TAG. Raw: %s", full_text[:300])
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 207-211: replace the debug print statements that expose raw text with logger.debug calls; change the three print(...) occurrences around THINK_END_TAG handling (the messages showing the think_end position, the AFTER STRIP preview, and the NO THINK TAG raw preview) to logger.debug(...) using the same formatted strings; ensure the module-level logger (e.g., logger = logging.getLogger(__name__)) is used or imported so the logging calls follow the module's logging pattern and do not leak raw debug output.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/openai/examples/nemotron_example/nemotron_example.py`:
- Around line 428-436: The call to agent.create_call in join_call has its
arguments reversed; Edge.create_call expects the call_id as the first positional
argument and call_type as a keyword. Fix join_call by calling agent.create_call
with call_id first and pass call_type as a keyword (e.g.,
agent.create_call(call_id, call_type=call_type, **kwargs)) so the values map
correctly to Edge.create_call's signature; update the join_call function
accordingly.
- Line 322: The line TOOL_CALLS_LOG.write_text("") in create_agent() truncates
the persistent tool-call log on every agent creation; change this to either
append new entries (open and append or use write_text(existing + new) behavior)
or remove the truncation entirely and instead create a timestamped/log-rotated
filename when initializing TOOL_CALLS_LOG so previous records are preserved;
update the code paths that write to TOOL_CALLS_LOG to use append semantics or
the new timestamped file creation logic and ensure create_agent() no longer
clears the file.
- Around line 173-182: The try/except in the Nemotron call is too broad and
skips emitting observability events; replace the bare except with specific
exceptions raised by self._client.chat.completions.create (e.g.,
HTTPError/Timeout/Error types from the Nemotron client or underlying http
library) and handle them explicitly, emit the LLMRequestStartedEvent before the
call and emit an LLMErrorEvent with the exception details on failure, log the
error with logger.exception including the exception, and ensure you still return
LLMResponseEvent(original=None, text="") after emitting the error; keep the
successful path calling _process_streaming_response.
---
Nitpick comments:
In `@plugins/openai/examples/nemotron_example/nemotron_example.py`:
- Around line 76-96: The INSTRUCTIONS prompt string contains an extraneous
duplicated block wrapped in triple quotes (a dead-code fragment) that should be
removed; locate the INSTRUCTIONS variable in nemotron_example.py and delete the
entire triple-quoted duplicate so only the intended INSTRUCTIONS string literal
remains, ensuring quotes and parentheses around INSTRUCTIONS remain balanced and
no other prompt text is altered.
- Around line 207-211: Replace the debug print statements that expose raw text
with logger.debug calls: change the three print(...) occurrences around
THINK_END_TAG handling (the messages showing think_end position, AFTER STRIP
preview, and NO THINK TAG raw preview) to logger.debug(...) using the same
formatted strings; ensure the module-level logger (e.g., logger =
logging.getLogger(__name__)) is used or imported so the logging calls follow the
module's logging pattern and do not leak raw debug output.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d0f569cb-351d-4639-b03f-61534ff5bc87
📒 Files selected for processing (3)
* plugins/openai/examples/nemotron_example/__init__.py
* plugins/openai/examples/nemotron_example/nemotron_example.py
* plugins/openai/examples/nemotron_example/pyproject.toml
Force-pushed from `7eb2f02` to `bff56ee` (Compare).
♻️ Duplicate comments (2)
plugins/openai/examples/nemotron_example/nemotron_example.py (2)
**464-472: ⚠️ Potential issue | 🔴 Critical: Arguments reversed in `create_call()`.**

Per the `Edge.create_call()` signature, it expects `(call_id: str, **kwargs)` with `call_type` as a keyword argument. The current invocation passes them positionally in reverse order.

🐛 Proposed fix

```diff
 async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
     """Join the call and start the agent."""
-    call = await agent.create_call(call_type, call_id)
+    call = await agent.create_call(call_id, call_type=call_type)
     logger.info("Starting Fraud Detection Agent...")
```

```bash
#!/bin/bash
# Verify the Edge.create_call signature to confirm argument order
ast-grep --pattern $'async def create_call(self, call_id: str, $$$)'
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 464 - 472, The call to agent.create_call in join_call is passing arguments in the wrong order; Edge.create_call expects call_id first and call_type as a keyword. Update the join_call implementation to call agent.create_call(call_id, call_type=call_type) (or pass call_type in kwargs) so the positional call_id and keyword call_type match the Edge.create_call signature; modify the invocation inside the async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) to use agent.create_call(call_id, call_type=call_type).
**150-180: ⚠️ Potential issue | 🟠 Major: Missing observability events.**

The base class emits `LLMRequestStartedEvent` before the API call and `LLMErrorEvent` on failure. This override emits neither, breaking observability for downstream consumers tracking request lifecycle and error rates.

🔧 Proposed fix

```diff
+from vision_agents.core.llm import events
+from vision_agents.core.llm.events import LLMRequestStartedEvent
+
 ...
+        # Emit request started event
+        self.events.send(
+            LLMRequestStartedEvent(
+                plugin_name=PLUGIN_NAME,
+                model=request_kwargs["model"],
+                streaming=stream,
+            )
+        )
+        request_start_time = time.perf_counter()
         try:
             response = await self._client.chat.completions.create(**request_kwargs)
         except (APIConnectionError, APIStatusError) as e:
             logger.exception("Failed to get a response from Nemotron")
+            self.events.send(
+                events.LLMErrorEvent(
+                    plugin_name=PLUGIN_NAME,
+                    error_message=str(e),
+                    event_data=e,
+                )
+            )
             return LLMResponseEvent(original=None, text="")
```

See `chat_completions_llm.py:162-189` for the expected event emission pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 150 - 180, This override of _create_response_internal suppresses the base-class observability events; before calling self._client.chat.completions.create(...) emit an LLMRequestStartedEvent (including model, messages, tools, and max_tokens) as the base does, and in the except block catch (APIConnectionError, APIStatusError) create and emit an LLMErrorEvent containing the exception and request metadata (then log the exception and return LLMResponseEvent(original=None, text="") as currently done); ensure emitted events use the same fields/shape as the base implementation and that _process_streaming_response is still called on success.
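The started/error event pattern described above looks roughly like this in isolation. The event classes, the `events` sink, and the exception type here are stand-ins for the real `vision_agents` and OpenAI-client symbols, not their actual definitions:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LLMRequestStartedEvent:  # stand-in for the real event class
    plugin_name: str
    model: str


@dataclass
class LLMErrorEvent:  # stand-in for the real event class
    plugin_name: str
    error_message: str


class APIConnectionError(Exception):  # stand-in for the client's exception type
    pass


async def create_response(events, client_call, model="nemotron"):
    # Emit the "started" event before the API call, as the base class does.
    events.append(LLMRequestStartedEvent(plugin_name="nemotron", model=model))
    try:
        return await client_call()
    except APIConnectionError as e:  # narrow, not a bare except
        events.append(LLMErrorEvent(plugin_name="nemotron", error_message=str(e)))
        return ""  # degrade gracefully, mirroring LLMResponseEvent(text="")
```

The point is the shape: request-started before the call, a specific exception class in the except clause, and an error event plus a graceful return on failure.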
🧹 Nitpick comments (3)
plugins/openai/examples/nemotron_example/README.md (1)
**19-25: Add a language specifier to the fenced code block.**

The `.env` example block lacks a language identifier. Use `dotenv` or `shell` for syntax highlighting consistency.

✏️ Suggested fix

````diff
-```
+```dotenv
 BASETEN_API_KEY=...
 STREAM_API_KEY=...
 STREAM_API_SECRET=...
 DEEPGRAM_API_KEY=...
 OPENAI_API_KEY=...
 ```
````

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/README.md` around lines 19-25: update the fenced code block that lists BASETEN_API_KEY, STREAM_API_KEY, STREAM_API_SECRET, DEEPGRAM_API_KEY, and OPENAI_API_KEY to include a language specifier (e.g., `dotenv` or `shell`) after the opening triple backticks so the .env example gets proper syntax highlighting.

plugins/openai/examples/nemotron_example/nemotron_example.py (2)

**77-89: Module-level client instantiation may fail at import time.**

The `_summarizer` client is created unconditionally when the module loads. If `OPENAI_API_KEY` is absent, this will raise before the script reaches `main`. For an example script this is tolerable, but consider lazy initialization if this pattern spreads.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 77-89: the module-level AsyncOpenAI instantiation stored in `_summarizer` can raise at import time if OPENAI_API_KEY is missing; change to lazy initialization by moving creation into a helper (e.g., `_get_summarizer`) or inside `_summarize_for_speech` so AsyncOpenAI is instantiated on first use and cached (use a module-level None check for `_summarizer`, then instantiate), and update `_summarize_for_speech` to call `_get_summarizer()` (or create if None) before using `chat.completions.create`; reference symbols: `_summarizer`, `_summarize_for_speech`, `AsyncOpenAI`.

**182-261: Streaming semantics altered: chunks are buffered before release.**

The base class emits `LLMResponseChunkEvent` incrementally as tokens arrive; this override collects everything, strips thinking, then emits one large chunk. This is intentional for Nemotron's XML parsing, but downstream consumers expecting true streaming for low-latency TTS may experience increased time-to-first-audio. Consider documenting this behavior in the docstring.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/nemotron_example.py` around lines 182-261: the `_process_streaming_response` method deviates from the base class streaming semantics by buffering all incoming chunks, stripping THINK/tool XML and markdown, then emitting a single aggregated `LLMResponseChunkEvent` and completion event (affecting low-latency TTS consumers); update the method docstring to clearly state this behavior (that it accumulates chunks, strips THINK tags and tool XML, enforces a max tool recursion via `_tool_depth`, and emits only a final chunk), note the impact on time-to-first-audio/time-to-first-token for downstream TTS consumers, and optionally mention how to change behavior (e.g., make streaming vs buffered behavior configurable) so maintainers and integrators can find and adjust it.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In@plugins/openai/examples/nemotron_example/nemotron_example.py:
- Around line 464-472: The call to agent.create_call in join_call is passing
arguments in the wrong order; Edge.create_call expects call_id first and
call_type as a keyword. Update the join_call implementation to call
agent.create_call(call_id, call_type=call_type) (or pass call_type in kwargs) so
the positional call_id and keyword call_type match the Edge.create_call
signature; modify the invocation inside the async def join_call(agent: Agent,
call_type: str, call_id: str, **kwargs) to use agent.create_call(call_id,
call_type=call_type).
- Around line 150-180: This override of _create_response_internal suppresses the
base-class observability events; before calling
self._client.chat.completions.create(...) emit an LLMRequestStartedEvent
(including model, messages, tools, and max_tokens) as the base does, and in the
except block catch (APIConnectionError, APIStatusError) create and emit an
LLMErrorEvent containing the exception and request metadata (then log the
exception and return LLMResponseEvent(original=None, text="") as currently
done); ensure emitted events use the same fields/shape as the base
implementation and that _process_streaming_response is still called on success.
Nitpick comments:
In@plugins/openai/examples/nemotron_example/nemotron_example.py:
- Around line 77-89: The module-level AsyncOpenAI instantiation stored in
_summarizer can raise at import time if OPENAI_API_KEY is missing; change to
lazy initialization by moving creation into a helper (e.g., _get_summarizer) or
inside _summarize_for_speech so AsyncOpenAI is instantiated on first use and
cached (use a module-level None check for _summarizer then instantiate), and
update _summarize_for_speech to call _get_summarizer() (or create if None)
before using chat.completions.create; reference symbols: _summarizer,
_summarize_for_speech, AsyncOpenAI.
- Around line 182-261: The _process_streaming_response method deviates from the
base class streaming semantics by buffering all incoming chunks, stripping
THINK/tool XML and markdown, then emitting a single aggregated
LLMResponseChunkEvent and completion event (affecting low-latency TTS
consumers); update the method docstring for _process_streaming_response to
clearly state this behavior (that it accumulates chunks, strips THINK tags and
tool XML, enforces a max tool recursion via _tool_depth, and emits only a final
chunk), note the impact on time-to-first-audio/time-to-first-token for
downstream TTS consumers, and optionally mention how to change behavior (e.g.,
make streaming vs buffered behavior configurable) so maintainers and integrators
can find and adjust it.

In
@plugins/openai/examples/nemotron_example/README.md:
- Around line 19-25: Update the fenced code block that lists BASETEN_API_KEY,
STREAM_API_KEY, STREAM_API_SECRET, DEEPGRAM_API_KEY, and OPENAI_API_KEY to
include a language specifier (e.g., `dotenv` or `shell`) after the opening
triple backticks so the .env example gets proper syntax highlighting; edit the
README's fenced block containing those environment variables and add the
language identifier (for example, change the opening fence to include `dotenv`).

---

ℹ️ Review info

⚙️ Run configuration

**Configuration used**: Path: .coderabbit.yaml
**Review profile**: CHILL
**Plan**: Pro
**Run ID**: `d39908d5-2392-4bed-8b57-648e0eff8401`

📥 Commits

Reviewing files that changed from the base of the PR and between 7eb2f029cb6a0811113ebed90674d7f017816c38 and bff56ee39154d72dd64fc1cd82325b5d6c523d12.

📒 Files selected for processing (5)

* README.md
* plugins/openai/examples/nemotron_example/README.md
* plugins/openai/examples/nemotron_example/__init__.py
* plugins/openai/examples/nemotron_example/nemotron_example.py
* plugins/openai/examples/nemotron_example/pyproject.toml

✅ Files skipped from review due to trivial changes (2)

* README.md
* plugins/openai/examples/nemotron_example/pyproject.toml
Added a link to demo and clarified function calls in README.
Actionable comments posted: 1
🧹 Nitpick comments (1)
plugins/openai/examples/nemotron_example/README.md (1)
**9-9: Clarify tool-call behavior to avoid contradictory expectations.**

Line 9 currently reads as both "event-only" and "agent executes actions," which is confusing for readers trying to understand side effects.

Suggested wording

```diff
-It uses function calls that send events instead of actually blocking a card etc. The agent is empowered to do this itself.
+It uses function calls that emit events (for demo purposes) rather than directly executing irreversible account actions like card blocking.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/openai/examples/nemotron_example/README.md` at line 9, Update the README wording to remove the contradictory statement that tools both "send events" and "execute actions"; explicitly state that the example's function calls are event-only (they emit events rather than performing side effects) and that the agent itself is responsible for carrying out any blocking UI updates or card actions. Mention the terms "function calls", "event-only", and "agent executes actions" so readers understand that tool-call invocations generate events to be handled by the agent/runtime and do not directly mutate UI state.
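The "event-only" tool behavior this comment asks the README to spell out can be illustrated like this. The class and event names are invented for the sketch; the real example's tools differ:

```python
from dataclasses import dataclass, field


@dataclass
class DemoBank:
    """Tools emit events describing intent; nothing irreversible happens."""

    events: list = field(default_factory=list)

    def freeze_card(self, card_id: str) -> dict:
        # Event-only: record what *would* happen, don't call a bank API.
        event = {"type": "card.freeze_requested", "card_id": card_id}
        self.events.append(event)
        return {"status": "ok", "event": event["type"]}
```

A runtime handling the emitted events decides whether and how to act on them, which is the separation the suggested wording describes.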
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/openai/examples/nemotron_example/README.md`:
- Around line 23-29: The fenced code block containing environment variables is
missing a language tag (markdownlint MD040); update the triple-backtick fence
that surrounds the
BASETEN_API_KEY/STREAM_API_KEY/DEEPGRAM_API_KEY/OPENAI_API_KEY block to include
a language identifier such as `dotenv` (i.e., add `dotenv` after the opening backticks) so the
block is properly annotated in README.md.
---
Nitpick comments:
In `@plugins/openai/examples/nemotron_example/README.md`:
- Line 9: Update the README wording to remove the contradictory statement that
tools both "send events" and "execute actions"; explicitly state that the
example's function calls are event-only (they emit events rather than performing
side effects) and that the agent itself is responsible for carrying out any
blocking UI updates or card actions. Mention the terms "function calls",
"event-only", and "agent executes actions" so readers understand that tool-call
invocations generate events to be handled by the agent/runtime and do not
directly mutate UI state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1a529ea9-8ba7-48ee-834a-62c0d350e03e
📒 Files selected for processing (1)
plugins/openai/examples/nemotron_example/README.md
This pull request adds a new `pyproject.toml` configuration file for the `nemotron-example` project. The file sets up the basic project metadata and specifies dependencies required for running a fraud detection demo using NVIDIA Nemotron with Vision Agents.

Project setup and dependency management:

* Added a `pyproject.toml` file for the `nemotron-example` project, including project metadata, Python version requirement, and dependencies such as `vision-agents` and related plugins.

**Summary by CodeRabbit**
New Features
Documentation
Chores