Inference gateway integration #22
Merged
21 commits
77f9492 project (bcherry)
0a9bd0b Working gateway (bcherry)
d2d4990 cartesia (bcherry)
f9ad627 tool (bcherry)
c8b80c6 readme (bcherry)
ec74fc9 rm (bcherry)
7e1a5c3 readme (bcherry)
0034509 test (bcherry)
9ded420 updates (bcherry)
a1d6d73 comment (bcherry)
6f66944 working (bcherry)
4c7a7ee Merge remote-tracking branch 'origin/main' into bcherry/gateway (bcherry)
580ee83 readme (bcherry)
8614ea8 secrets (bcherry)
e5593c2 models (bcherry)
9ac71f0 fmt (bcherry)
17b01c1 1.2 (bcherry)
3689083 url (bcherry)
a45def5 K (bcherry)
bd9fd59 Fixes (bcherry)
69505ad update (bcherry)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,7 +1,3 @@ |
| 1 | 1 | LIVEKIT_URL= |
| 2 | 2 | LIVEKIT_API_KEY= |
| 3 | 3 | LIVEKIT_API_SECRET= |
| 4 | | - |
| 5 | | - OPENAI_API_KEY= |
| 6 | | - DEEPGRAM_API_KEY= |
| 7 | | - CARTESIA_API_KEY= |
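
After this hunk, the environment file carries only the three LiveKit settings. The provider keys are presumably no longer needed because the models are referenced through the inference gateway named in the PR title, though the diff itself does not say so. A minimal sketch of a startup check for the remaining variables follows; `missing_livekit_env` is a hypothetical helper for illustration and is not part of the PR.

```python
import os
from collections.abc import Mapping

# The three variables that remain in the environment file after this change.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank.

    Hypothetical helper, not part of the PR or of any LiveKit package.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_livekit_env()` with no argument inspects the process environment; passing a plain dict makes the check easy to test.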
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -9,4 +9,4 @@ KMS |
| 9 | 9 | .vscode |
| 10 | 10 | *.egg-info |
| 11 | 11 | .pytest_cache |
| 12 | | - .ruff_cache |
| | 12 | + .ruff_cache |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -2,21 +2,17 @@ |
| 2 | 2 | |
| 3 | 3 | from dotenv import load_dotenv |
| 4 | 4 | from livekit.agents import ( |
| 5 | | - NOT_GIVEN, |
| 6 | 5 | Agent, |
| 7 | | - AgentFalseInterruptionEvent, |
| 8 | 6 | AgentSession, |
| 9 | 7 | JobContext, |
| 10 | 8 | JobProcess, |
| 11 | 9 | MetricsCollectedEvent, |
| 12 | 10 | RoomInputOptions, |
| 13 | | - RunContext, |
| 14 | 11 | WorkerOptions, |
| 15 | 12 | cli, |
| 16 | 13 | metrics, |
| 17 | 14 | ) |
| 18 | | - from livekit.agents.llm import function_tool |
| 19 | | - from livekit.plugins import cartesia, deepgram, noise_cancellation, openai, silero |
| | 15 | + from livekit.plugins import noise_cancellation, silero |
| 20 | 16 | from livekit.plugins.turn_detector.multilingual import MultilingualModel |
| 21 | 17 | |
| 22 | 18 | logger = logging.getLogger("agent") |
| | | @@ -27,27 +23,28 @@ |
| 27 | 23 | class Assistant(Agent): |
| 28 | 24 | def __init__(self) -> None: |
| 29 | 25 | super().__init__( |
| 30 | | - instructions="""You are a helpful voice AI assistant. |
| | 26 | + instructions="""You are a helpful voice AI assistant. The user is interacting with you via voice, even if you perceive the conversation as text. |
| 31 | 27 | You eagerly assist users with their questions by providing information from your extensive knowledge. |
| 32 | 28 | Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols. |
| 33 | 29 | You are curious, friendly, and have a sense of humor.""", |
| 34 | 30 | ) |
| 35 | 31 | |
| 36 | | - # all functions annotated with @function_tool will be passed to the LLM when this |
| 37 | | - # agent is active |
| 38 | | - @function_tool |
| 39 | | - async def lookup_weather(self, context: RunContext, location: str): |
| 40 | | - """Use this tool to look up current weather information in the given location. |
| 41 | | - |
| 42 | | - If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable. |
| 43 | | - |
| 44 | | - Args: |
| 45 | | - location: The location to look up weather information for (e.g. city name) |
| 46 | | - """ |
| 47 | | - |
| 48 | | - logger.info(f"Looking up weather for {location}") |
| 49 | | - |
| 50 | | - return "sunny with a temperature of 70 degrees." |
| | 32 | + # To add tools, use the @function_tool decorator. |
| | 33 | + # Here's an example that adds a simple weather tool. |
| | 34 | + # You also have to add `from livekit.agents.llm import function_tool, RunContext` to the top of this file |
| | 35 | + # @function_tool |
| | 36 | + # async def lookup_weather(self, context: RunContext, location: str): |
| | 37 | + # """Use this tool to look up current weather information in the given location. |
| | 38 | + # |
| | 39 | + # If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable. |
| | 40 | + # |
| | 41 | + # Args: |
| | 42 | + # location: The location to look up weather information for (e.g. city name) |
| | 43 | + # """ |
| | 44 | + # |
| | 45 | + # logger.info(f"Looking up weather for {location}") |
| | 46 | + # |
| | 47 | + # return "sunny with a temperature of 70 degrees." |

Comment on lines +32 to +47

Contributor: Might be extra clear here to link to the tool call docs page and to state that the return value is passed to the LLM node?

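
To illustrate the reviewer's point about the return value, here is a minimal sketch of the weather tool's body written as a plain coroutine. The `@function_tool` decorator and the `RunContext` parameter are deliberately left out so the snippet runs without LiveKit installed. That the returned string is handed back to the LLM is taken from the review comment above and is not verified here.

```python
import asyncio
import logging

logger = logging.getLogger("agent")


# In the agent this would be a method decorated with @function_tool and would
# also take a RunContext argument; both are omitted so the sketch is standalone.
async def lookup_weather(location: str) -> str:
    """Look up current weather information for the given location."""
    logger.info("Looking up weather for %s", location)
    # Hard-coded result, as in the PR's example. Per the review comment, this
    # return value is what would be passed back to the LLM.
    return "sunny with a temperature of 70 degrees."


result = asyncio.run(lookup_weather("Paris"))
print(result)
```

Because the tool returns a plain string, an "unavailable" case can be reported the same way, which fits the docstring's instruction that the model must relay it to the user.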
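
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 51 | 48 | |
| 52 | 49 | |
| 53 | 50 | def prewarm(proc: JobProcess): |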
| | | @@ -61,17 +58,17 @@ async def entrypoint(ctx: JobContext): |
| 61 | 58 | "room": ctx.room.name, |
| 62 | 59 | } |
| 63 | 60 | |
| 64 | | - # Set up a voice AI pipeline using OpenAI, Cartesia, Deepgram, and the LiveKit turn detector |
| | 61 | + # Set up a voice AI pipeline using OpenAI, Cartesia, AssemblyAI, and the LiveKit turn detector |
| 65 | 62 | session = AgentSession( |
| 66 | | - # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response |
| 67 | | - # See all providers at https://docs.livekit.io/agents/integrations/llm/ |
| 68 | | - llm=openai.LLM(model="gpt-4o-mini"), |
| 69 | 63 | # Speech-to-text (STT) is your agent's ears, turning the user's speech into text that the LLM can understand |
| 70 | | - # See all providers at https://docs.livekit.io/agents/integrations/stt/ |
| 71 | | - stt=deepgram.STT(model="nova-3", language="multi"), |
| | 64 | + # See all available models at https://docs.livekit.io/agents/models/stt/ |
| | 65 | + stt="assemblyai/universal-streaming:en", |
| | 66 | + # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response |
| | 67 | + # See all available models at https://docs.livekit.io/agents/models/llm/ |
| | 68 | + llm="openai/gpt-4.1-mini", |
| 72 | 69 | # Text-to-speech (TTS) is your agent's voice, turning the LLM's text into speech that the user can hear |
| 73 | | - # See all providers at https://docs.livekit.io/agents/integrations/tts/ |
| 74 | | - tts=cartesia.TTS(voice="6f84f4b8-58a2-430c-8c79-688dad597532"), |
| | 70 | + # See all available models as well as voice selections at https://docs.livekit.io/agents/models/tts/ |
| | 71 | + tts="cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc", |
| 75 | 72 | # VAD and turn detection are used to determine when the user is speaking and when the agent should respond |
| 76 | 73 | # See more at https://docs.livekit.io/agents/build/turns |
| 77 | 74 | turn_detection=MultilingualModel(), |
| | | @@ -81,19 +78,16 @@ async def entrypoint(ctx: JobContext): |
| 81 | 78 | preemptive_generation=True, |
| 82 | 79 | ) |
| 83 | 80 | |
| 84 | | - # To use a realtime model instead of a voice pipeline, use the following session setup instead: |
| | 81 | + # To use a realtime model instead of a voice pipeline, use the following session setup instead. |
| | 82 | + # (Note: This is for the OpenAI Realtime API. For other providers, see https://docs.livekit.io/agents/models/realtime/)) |
| | 83 | + # 1. Install livekit-agents[openai] |
| | 84 | + # 2. Set OPENAI_API_KEY in .env.local |
| | 85 | + # 3. Add `from livekit.plugins import openai` to the top of this file |
| | 86 | + # 4. Use the following session setup instead of the version above |
| 85 | 87 | # session = AgentSession( |
| 86 | | - # # See all providers at https://docs.livekit.io/agents/integrations/realtime/ |
| 87 | 88 | # llm=openai.realtime.RealtimeModel(voice="marin") |
| 88 | 89 | # ) |
| 89 | 90 | |
| 90 | | - # sometimes background noise could interrupt the agent session, these are considered false positive interruptions |
| 91 | | - # when it's detected, you may resume the agent's speech |
| 92 | | - @session.on("agent_false_interruption") |
| 93 | | - def _on_agent_false_interruption(ev: AgentFalseInterruptionEvent): |
| 94 | | - logger.info("false positive interruption, resuming") |
| 95 | | - session.generate_reply(instructions=ev.extra_instructions or NOT_GIVEN) |
| 96 | | - |
| 97 | 91 | # Metrics collection, to measure pipeline performance |
| 98 | 92 | # For more information, see https://docs.livekit.io/agents/build/metrics/ |
| 99 | 93 | usage_collector = metrics.UsageCollector() |
| | | @@ -110,9 +104,9 @@ async def log_usage(): |
| 110 | 104 | ctx.add_shutdown_callback(log_usage) |
| 111 | 105 | |
| 112 | 106 | # # Add a virtual avatar to the session, if desired |
| 113 | | - # # For other providers, see https://docs.livekit.io/agents/integrations/avatar/ |
| | 107 | + # # For other providers, see https://docs.livekit.io/agents/models/avatar/ |
| 114 | 108 | # avatar = hedra.AvatarSession( |
| 115 | | - # avatar_id="...", # See https://docs.livekit.io/agents/integrations/avatar/hedra |
| | 109 | + # avatar_id="...", # See https://docs.livekit.io/agents/models/avatar/plugins/hedra |
| 116 | 110 | # ) |
| 117 | 111 | # # Start the avatar and wait for it to join |
| 118 | 112 | # await avatar.start(session, room=ctx.room) |
| | | @@ -122,9 +116,7 @@ async def log_usage(): |
| 122 | 116 | agent=Assistant(), |
| 123 | 117 | room=ctx.room, |
| 124 | 118 | room_input_options=RoomInputOptions( |
| 125 | | - # LiveKit Cloud enhanced noise cancellation |
| 126 | | - # - If self-hosting, omit this parameter |
| 127 | | - # - For telephony applications, use `BVCTelephony` for best results |
| | 119 | + # For telephony applications, use `BVCTelephony` for best results |
| 128 | 120 | noise_cancellation=noise_cancellation.BVC(), |
| 129 | 121 | ), |
| 130 | 122 | ) |
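
The session setup in this diff names its models with plain strings instead of plugin objects. The grammar of those strings is not stated in the PR. The sketch below assumes the shape `provider/model[:variant]`, inferred only from the three examples in the diff, where the variant looks like a language code for STT and a voice ID for TTS. `ModelRef` and `parse_model_ref` are hypothetical names for illustration, not LiveKit APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRef:
    """Parts of a model string such as 'assemblyai/universal-streaming:en'."""

    provider: str
    model: str
    variant: Optional[str] = None  # assumed: language code or voice ID


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'provider/model[:variant]' into its parts (assumed format)."""
    provider, slash, rest = ref.partition("/")
    model, _, variant = rest.partition(":")
    if not slash or not provider or not model:
        raise ValueError(f"expected 'provider/model[:variant]', got {ref!r}")
    # An absent or empty variant is normalized to None.
    return ModelRef(provider, model, variant or None)


for ref in (
    "assemblyai/universal-streaming:en",
    "openai/gpt-4.1-mini",
    "cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
):
    print(parse_model_ref(ref))
```

Splitting on the first `/` and then the first `:` keeps dots and hyphens inside the model name intact, which matters for names like `gpt-4.1-mini`.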
Just my preference, but I like to add the long form so it's more obvious what each does: