1 change: 1 addition & 0 deletions README.md
@@ -23,6 +23,7 @@ The following examples include a template configuration or manifest file for each
| [Cerebrium](/cerebrium) | `cerebrium.toml` example for [Cerebrium](https://cerebrium.ai) |
| [Fly.io](/fly.io) | `fly.toml` example for [Fly.io](https://fly.io) |
| [Kubernetes](/kubernetes) | Example manifest file for any Kubernetes environment |
| [Modal](/modal) | Example based on the `agent-starter-python` project, ready to deploy on [Modal](https://modal.com) (no `Dockerfile` or config file necessary) |
| [Render](/render.com) | `render.yaml` example for [Render](https://render.com) |

## Missing a provider?
104 changes: 104 additions & 0 deletions modal/README.md
@@ -0,0 +1,104 @@
# Modal LiveKit Agents Deployment Example

This directory contains a [LiveKit](https://livekit.com) voice AI agent deployed on [Modal](https://www.modal.com?utm_source=partner&utm_medium=github&utm_campaign=livekit), a serverless platform for running Python applications. The agent is based on [LiveKit's `agent-starter-python` project](https://github.com/livekit-examples/agent-starter-python).

## Getting Started

Before deploying, ensure you have:

- **Modal Account**: Sign up at [modal.com](https://www.modal.com?utm_source=partner&utm_medium=github&utm_campaign=livekit) and get $30/month of free compute.
- **LiveKit Account**: Set up a [LiveKit](https://livekit.com) account.
- **API Keys** for the following providers:
  - [OpenAI](https://openai.com)
  - [Cartesia](https://cartesia.com)
  - [Deepgram](https://deepgram.com)

### Install Dependencies

The project uses `uv` for dependency management. That said, the only local dependency you need is `modal`. To set up the environment, run:

```bash
uv sync
```

### Authenticate with Modal

```bash
modal setup
```

### Set Up Secrets on Modal

**Using the Modal dashboard:**

Navigate to the Secrets section in the Modal dashboard and add the following secrets:

- `LIVEKIT_URL` - Your LiveKit WebRTC server URL
- `LIVEKIT_API_KEY` - API key for authenticating LiveKit requests
- `LIVEKIT_API_SECRET` - API secret for LiveKit authentication
- `OPENAI_API_KEY` - API key for OpenAI's GPT-based processing
- `CARTESIA_API_KEY` - API key for Cartesia's TTS services
- `DEEPGRAM_API_KEY` - API key for Deepgram's STT services

You can find your LiveKit URL and API keys under **Settings** > **Project** and **Settings** > **Keys** in the LiveKit dashboard.

![Modal Secrets](https://modal-cdn.com/cdnbot/modal-livekit-secretsndip6awa_78ed94b0.webp)

**Using the Modal CLI:**

```bash
modal secret create livekit-voice-agent \
  LIVEKIT_URL=your_livekit_url \
  LIVEKIT_API_KEY=your_api_key \
  LIVEKIT_API_SECRET=your_api_secret \
  OPENAI_API_KEY=your_openai_key \
  DEEPGRAM_API_KEY=your_deepgram_key \
  CARTESIA_API_KEY=your_cartesia_key
```

Once added, you can reference these secrets in your Modal functions.
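For example, a Modal function can attach the secret by name, which makes each key available as an environment variable inside the container. A minimal sketch (the app and function below are illustrative, not part of this project):

```python
import os

import modal

app = modal.App("livekit-voice-agent-example")

# Attaching the secret injects each of its keys as an environment
# variable in the container that runs this function.
@app.function(secrets=[modal.Secret.from_name("livekit-voice-agent")])
def check_secrets() -> None:
    # The LiveKit and provider SDKs read these variables automatically;
    # printing one here just confirms the secret is wired up.
    print("LiveKit URL:", os.environ["LIVEKIT_URL"])
```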

### Configure LiveKit Webhooks

In your LiveKit project dashboard, create a new webhook that points to the URL generated when you deploy your Modal app. The URL is printed to stdout during deployment and is also available in your Modal dashboard. It will look something like the URL in the screenshot below:

![settings webhooks](https://modal-cdn.com/cdnbot/livekit-webhooksiceyins6_203427cc.webp)
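Under the hood, the deployed app exposes a web endpoint that LiveKit calls with these events. A minimal sketch of such an endpoint (the names below are illustrative; see `src/server.py` for the actual implementation):

```python
import modal

app = modal.App("livekit-webhook-example")

# Web endpoints need FastAPI available in the container image.
image = modal.Image.debian_slim().pip_install("fastapi[standard]")

@app.function(image=image, secrets=[modal.Secret.from_name("livekit-voice-agent")])
@modal.fastapi_endpoint(method="POST")
async def webhook(event: dict) -> dict:
    # LiveKit POSTs JSON events such as {"event": "room_started", ...}.
    # A production handler should verify the Authorization header
    # before acting on the event.
    if event.get("event") == "room_started":
        room_name = event["room"]["name"]
        ...  # spawn an agent worker for room_name
    return {"status": "ok"}
```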

## Deployment

Run the following command to deploy your Modal app:
```bash
modal deploy -m src.server
```
You can interact with your agent using the hosted [LiveKit Agents Playground](https://docs.livekit.io/agents/start/playground/). When you connect to a room, the `room_started` webhook event will dispatch your agent to that room.
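If you would rather connect from your own client than the playground, you can mint a room join token with the `livekit-api` package. A sketch, assuming `LIVEKIT_API_KEY` and `LIVEKIT_API_SECRET` are set in your environment:

```python
from livekit import api

# AccessToken reads LIVEKIT_API_KEY and LIVEKIT_API_SECRET from the
# environment when constructed without arguments.
token = (
    api.AccessToken()
    .with_identity("test-user")
    .with_grants(api.VideoGrants(room_join=True, room="test-room"))
    .to_jwt()
)
print(token)  # use this JWT to join "test-room" from any LiveKit client
```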

## Developing

During development, it can be helpful to launch the application with
```bash
modal serve -m src.server
```
which reloads the app whenever you change the source code.

## Testing

### Test the Agent

Use the following command to launch your app remotely and execute the tests using `pytest`:
```bash
modal run -m src.server
```
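`modal run` invokes the app's local entrypoint. The general pattern looks something like the sketch below (illustrative only; see `src/server.py` for the project's actual entrypoint):

```python
import subprocess

import modal

app = modal.App("livekit-test-runner-example")

@app.function()
def run_tests() -> int:
    # Run the test suite inside the remote container.
    return subprocess.run(["pytest", "tests"], cwd="/root").returncode

@app.local_entrypoint()
def main() -> None:
    # `modal run -m src.server` executes this locally, which in turn
    # runs pytest remotely on Modal.
    raise SystemExit(run_tests.remote())
```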

### Test the Webhook Endpoint

Test the webhook endpoint with a sample LiveKit event from the command line:

```bash
curl -X POST {MODAL_AGENT_WEB_ENDPOINT_URL} \
-H "Authorization: Bearer your_livekit_token" \
-H "Content-Type: application/json" \
-d '{"event": "room_started", "room": {"name": "test-room"}}'
```

Alternatively, you can trigger webhook events from the LiveKit webhooks settings page (the same place where you created the webhook).
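Note that genuine LiveKit webhooks carry a signed JWT in the `Authorization` header. If you want to verify it server-side, the `livekit-api` package provides a receiver for that; a sketch, assuming your LiveKit credentials are in the environment:

```python
import os

from livekit.api import TokenVerifier, WebhookReceiver

receiver = WebhookReceiver(
    TokenVerifier(
        api_key=os.environ["LIVEKIT_API_KEY"],
        api_secret=os.environ["LIVEKIT_API_SECRET"],
    )
)

def handle_webhook(body: str, auth_header: str):
    # receive() validates the JWT signature and the body hash,
    # raising if either check fails.
    event = receiver.receive(body, auth_header)
    print(event.event, event.room.name)
```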

42 changes: 42 additions & 0 deletions modal/pyproject.toml
@@ -0,0 +1,42 @@
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "agent-starter-python"
version = "1.0.0"
description = "Simple voice AI assistant built with LiveKit Agents for Python"
requires-python = ">=3.9"

dependencies = [
    "modal",
]

[dependency-groups]
dev = [
    "pytest",
    "pytest-asyncio",
    "ruff",
]

[tool.setuptools.packages.find]
where = ["src"]

[tool.setuptools.package-dir]
"" = "src"

[tool.pytest.ini_options]
asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "function"

[tool.ruff]
line-length = 88
target-version = "py39"

[tool.ruff.lint]
select = ["E", "F", "W", "I", "N", "B", "A", "C4", "UP", "SIM", "RUF"]
ignore = ["E501"] # Line too long (handled by formatter)

[tool.ruff.format]
quote-style = "double"
indent-style = "space"
1 change: 1 addition & 0 deletions modal/src/__init__.py
@@ -0,0 +1 @@
# This file makes the src directory a Python package
138 changes: 138 additions & 0 deletions modal/src/agent.py
@@ -0,0 +1,138 @@
import logging

from fastapi import FastAPI, Request, Response
from livekit import api
from livekit.agents import (
    NOT_GIVEN,
    Agent,
    AgentFalseInterruptionEvent,
    AgentSession,
    JobContext,
    JobProcess,
    MetricsCollectedEvent,
    RoomInputOptions,
    RunContext,
    WorkerOptions,
    cli,
    metrics,
)
from livekit.agents.llm import function_tool
from livekit.plugins import cartesia, deepgram, noise_cancellation, openai, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("agent")


def download_files():
    # Pre-download model weights (e.g. VAD and the turn detector) into the
    # image so cold starts don't pay the download cost at runtime.
    import subprocess

    subprocess.run(
        ["uv", "run", "src/agent.py", "download-files"], cwd="/root", check=True
    )


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant.
            You eagerly assist users with their questions by providing information from your extensive knowledge.
            Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
            You are curious, friendly, and have a sense of humor.""",
        )

    # all functions annotated with @function_tool will be passed to the LLM when this
    # agent is active
    @function_tool
    async def lookup_weather(self, context: RunContext, location: str):
        """Use this tool to look up current weather information in the given location.

        If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable.

        Args:
            location: The location to look up weather information for (e.g. city name)
        """

        logger.info(f"Looking up weather for {location}")

        return "sunny with a temperature of 70 degrees."


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    # Logging setup
    # Add any other context you want in all log entries here
    ctx.log_context_fields = {
        "room": ctx.room.name,
    }

    # Set up a voice AI pipeline using OpenAI, Cartesia, Deepgram, and the LiveKit turn detector
    session = AgentSession(
        # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response
        # See all providers at https://docs.livekit.io/agents/integrations/llm/
        llm=openai.LLM(model="gpt-4o-mini"),
        # Speech-to-text (STT) is your agent's ears, turning the user's speech into text that the LLM can understand
        # See all providers at https://docs.livekit.io/agents/integrations/stt/
        stt=deepgram.STT(model="nova-3", language="multi"),
        # Text-to-speech (TTS) is your agent's voice, turning the LLM's text into speech that the user can hear
        # See all providers at https://docs.livekit.io/agents/integrations/tts/
        tts=cartesia.TTS(voice="6f84f4b8-58a2-430c-8c79-688dad597532"),
        # VAD and turn detection are used to determine when the user is speaking and when the agent should respond
        # See more at https://docs.livekit.io/agents/build/turns
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
        # allow the LLM to generate a response while waiting for the end of turn
        # See more at https://docs.livekit.io/agents/build/audio/#preemptive-generation
        preemptive_generation=True,
    )

    # To use a realtime model instead of a voice pipeline, use the following session setup instead:
    # session = AgentSession(
    #     # See all providers at https://docs.livekit.io/agents/integrations/realtime/
    #     llm=openai.realtime.RealtimeModel()
    # )

    # sometimes background noise could interrupt the agent session, these are considered false positive interruptions
    # when it's detected, you may resume the agent's speech
    @session.on("agent_false_interruption")
    def _on_agent_false_interruption(ev: AgentFalseInterruptionEvent):
        logger.info("false positive interruption, resuming")
        session.generate_reply(instructions=ev.extra_instructions or NOT_GIVEN)

    # Metrics collection, to measure pipeline performance
    # For more information, see https://docs.livekit.io/agents/build/metrics/
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    # # Add a virtual avatar to the session, if desired
    # # For other providers, see https://docs.livekit.io/agents/integrations/avatar/
    # avatar = hedra.AvatarSession(
    #     avatar_id="...",  # See https://docs.livekit.io/agents/integrations/avatar/hedra
    # )
    # # Start the avatar and wait for it to join
    # await avatar.start(session, room=ctx.room)

    # Start the session, which initializes the voice pipeline and warms up the models
    await session.start(
        agent=Assistant(),
        room=ctx.room,
        room_input_options=RoomInputOptions(
            # LiveKit Cloud enhanced noise cancellation
            # - If self-hosting, omit this parameter
            # - For telephony applications, use `BVCTelephony` for best results
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    # Join the room and connect to the user
    await ctx.connect()


if __name__ == "__main__":
    # Running this module directly starts a LiveKit worker and enables the
    # `download-files` subcommand used by download_files() above.
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))