
Reachy Mini Voice Assistant

Give your Reachy Mini a voice — and a body. Fully offline on NVIDIA Jetson, zero cloud, zero cost per query.

Built on top of amarrmb/jetson-assistant (the modular voice + vision brain) and the upstream Pollen reachy_mini SDK. This repo is the Reachy integration: tool plugin + configs + launchers + character profiles. About 400 lines of Python on top of those two.

What you get:

  • 9 motion tools (look, dance, express, nod, set_antennas, look_at_point, reachy_power, reachy_see, reachy_status) driving Pollen's curated emotion + dance libraries
  • 5 character profiles out of the box: curious_reachy, desk_philosopher, eager_explorer, deadpan_reachy, language_learner
  • Two deployment topologies (embodied or standalone — see below)
  • Cleanup trap that puts motors in a safe state on Ctrl-C (a real failure mode without it)
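The cleanup trap can be sketched in a few lines (a minimal illustration only; `set_motors_safe` and the dict-based `robot` are hypothetical stand-ins for the real SDK calls):

```python
import signal
import sys

def set_motors_safe(robot):
    # Stand-in for the real shutdown path: the plugin relaxes torques
    # and parks the head so the robot isn't left straining mid-pose.
    robot["motors_safe"] = True

def install_cleanup(robot):
    def _handler(signum, frame):
        set_motors_safe(robot)
        sys.exit(0)
    # Run the safe-state routine when Ctrl-C (SIGINT) arrives.
    signal.signal(signal.SIGINT, _handler)
    return _handler
```

Without a handler like this, killing the process mid-motion leaves the motors holding whatever pose they were in.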

Architecture (two topologies, same brain)

Embodied (recommended for real Reachy Mini Wireless):

Reachy mic ─┐                                              ┌─→ Reachy speaker
            │  WS/HTTP                              WS/HTTP│
Reachy cam ─┼─────► Jetson Thor: Nemotron STT ─→ vLLM ─→ Kokoro TTS
            │       (jetson-assistant serve, port 8080)   │
Reachy Pi5 ─┴── tool calls ─→ Reachy SDK (localhost) ─→ motors
   (orchestrator + lifecycle + picamera2 + sounddevice)

Standalone (any robot via WebSocket — sim or remote daemon):

Jetson mic → Nemotron STT → Qwen2.5-VL-7B (vLLM) → Kokoro TTS → Jetson speaker
                                    ↓ tool calls
                              bot/reachy_tools.py
                                    ↓ WebSocket (reachy-mini SDK 1.7.0)
                            Reachy Mini daemon (sim or real)

Try It

Embodied — brain on Thor, I/O on Reachy (A-OnPi)

This is the most natural deployment for a real Reachy Mini Wireless: the robot's onboard Pi5 captures audio + video, talks to Thor over the LAN for inference, and plays the response back through Reachy's own speaker. Thor stays mic/camera/speaker-free — pure compute.

On Jetson Thor (model server):

sudo sysctl -w vm.drop_caches=3
docker compose -f docker-compose.thor-serve.yml up -d   # ~12GB image already cached;
                                                         # vLLM/Nemotron/Kokoro preload ~5min
docker compose -f docker-compose.thor-serve.yml logs -f assistant-server

On the Reachy Mini Pi5 (orchestrator):

# One-time setup (rsync this repo + jetson-assistant to /home/pollen/dn/,
# then create the venv — see scripts/run-on-pi.sh for details).
sudo apt install python3-picamera2
python3 -m venv --system-site-packages /home/pollen/dn/orch-venv
/home/pollen/dn/orch-venv/bin/pip install httpx sounddevice openai opencv-python-headless \
    "reachy-mini>=1.7.0" pydantic fastapi typer pyyaml
/home/pollen/dn/orch-venv/bin/pip install --no-deps -e /home/pollen/dn/jetson-assistant
/home/pollen/dn/orch-venv/bin/pip install --no-deps -e /home/pollen/dn/reachy-mini

# Each session:
JA_SERVER_HOST=<thor-ip> /home/pollen/dn/reachy-mini/scripts/run-on-pi.sh

The launcher POSTs to /api/media/release on the Reachy daemon (so the assistant can grab the camera + mic), runs the assistant pointed at Thor, and POSTs to /api/media/acquire on shutdown.
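The release/acquire handshake amounts to the following (a sketch; the HTTP `post` callable is injected so the flow is visible, and the daemon URL is illustrative):

```python
from contextlib import contextmanager

@contextmanager
def reachy_media(post, daemon="http://localhost:8000"):
    # Ask the daemon to let go of the camera + mic before the
    # assistant starts...
    post(daemon + "/api/media/release")
    try:
        yield
    finally:
        # ...and hand them back on shutdown, even on Ctrl-C.
        post(daemon + "/api/media/acquire")
```

The `finally` clause is what guarantees the daemon gets its devices back no matter how the assistant exits.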

Now talk to Reachy directly — its onboard mic listens, its onboard speaker replies, its head camera answers "what do you see?". Thor is just a model server on the network.

Standalone — Jetson does everything (legacy / sim)

Use this when you want the brain and I/O on the same Jetson and Reachy is just a remote motion endpoint. Works with the MuJoCo simulator too.

Step 1 — start the robot daemon:

Option A — Simulation (no robot needed):

# On your laptop
pip install reachy-mini[mujoco]
reachy-mini-daemon --sim

Option B — Real Reachy Mini: Power it on — the daemon auto-starts on the robot.

Step 2 — start the voice AI on Jetson:

Set REACHY_HOST to wherever the daemon is running.

sudo sysctl -w vm.drop_caches=3

curl -fLO https://raw.githubusercontent.com/amarrmb/reachy-mini/main/docker-compose.yml
REACHY_HOST=<daemon-ip> docker compose up -d    # pulls ~12GB, vLLM loads model (~5 min)
docker compose logs -f reachy                    # watch it come up

Note: vLLM may restart once on first boot due to CUDA graph memory allocation. This is normal — restart: unless-stopped handles it automatically. Wait ~5 minutes.

Audio setup: Set ALSA_CARD to your audio device name (find it with aplay -l):

ALSA_CARD=USB REACHY_HOST=<daemon-ip> docker compose up -d

Plug in a mic and speaker on the Jetson, and start talking:

"Look to the left"                → head movement (left, right, up, down, center)
"Show me you're happy!"           → emotional expression (happy, sad, surprised, angry, ...)
"Do a dance"                      → choreographed dance sequence
"What do you see?"                → VLM describes what Reachy's camera sees
"Nod yes" / "Shake your head"     → agree/disagree gesture
"Go to sleep" / "Wake up"         → power on/off
"Set your antennas up"            → direct antenna control (-90 to 90 degrees)
"Look at the object on the table" → precise 3D gaze control
"Are you connected?"              → robot connection status
"Set a timer for 30 seconds"      → spoken countdown alert
"What time is it?"                → built-in clock

All jetson-assistant tools work too (web search, memory, language switching, multi-camera).

To stop: docker compose down

Make It Yours

Options

Configured via YAML files in configs/:

Setting          Default          Description
tts_backend      kokoro           TTS engine
stt_backend      nemotron         STT engine
llm_backend      vllm             LLM backend
external_tools   [reachy_tools]   Tool plugin modules
camera_backend   v4l2             v4l2 (USB / cv2.VideoCapture) or picamera2 (Pi CSI cameras — Reachy head cam)
use_server       false            When true, route STT/TTS/LLM to a remote jetson-assistant serve (used by the embodied A-OnPi deployment)
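Put together, a profile config might look like this (the filename is hypothetical; field names and defaults come from the table above):

```yaml
# configs/my_profile.yaml (illustrative example)
tts_backend: kokoro
stt_backend: nemotron
llm_backend: vllm
camera_backend: v4l2      # or picamera2 for the Reachy head cam
use_server: false         # true for the embodied A-OnPi deployment
external_tools:
  - reachy_tools
```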

Environment variables:

Variable             Description
ALSA_CARD            Audio device name from aplay -l (e.g. USB, Jabra) — Jetson-side audio
REACHY_HOST          Remote Reachy daemon hostname/IP (empty = localhost; used by the orchestrator on the Pi5)
JA_SERVER_HOST       Remote jetson-assistant serve host for the Pi5 launcher (default 10.0.0.2)
JA_SERVER_PORT       Remote jetson-assistant serve port (default 8080)
BOOTH_MODE=1         Proactive booth greetings
REACHY_BROADCAST=1   UDP broadcast for dual-robot mode

Build Locally

# Clone both repos (reachy-mini depends on jetson-assistant)
git clone https://github.com/amarrmb/jetson-assistant.git
git clone https://github.com/amarrmb/reachy-mini.git

# Install jetson-assistant first
cd jetson-assistant && pip install -e ".[kokoro,nemotron,assistant,vision]"

# Install reachy-mini
cd ../reachy-mini && pip install -e .

# Start vLLM + daemon, then run
docker compose up -d vllm
reachy-mini-daemon --sim &
./scripts/run.sh

Build Docker

The pre-built image (ghcr.io/amarrmb/reachy-mini:thor) works out of the box. If you want to modify the code and rebuild:

# On Jetson Thor (aarch64 only)

# Option A: Use the pre-built base image from ghcr.io
docker build -t reachy-mini:thor .

# Option B: Rebuild the base image too (if you modified jetson-assistant)
cd ../jetson-assistant
gh release download v0.1.0 -p 'flash_attn-*.whl' -D wheels/
docker build -t jetson-assistant:thor .
cd ../reachy-mini
docker build --build-arg BASE_IMAGE=jetson-assistant:thor -t reachy-mini:thor .

reachy-mini layers on top of ghcr.io/amarrmb/jetson-assistant:thor. The flash-attn wheel needed for building that base image is available from the jetson-assistant v0.1.0 release. To adapt for other hardware, adapt the base image first — see jetson-assistant build docs.

Extend It

Add a tool in one function:

# my_tool.py
from typing import Annotated

def register_tools(registry, context=None):
    @registry.register("Description shown to the LLM")
    def my_action(param: Annotated[str, "What this param does"]) -> str:
        return "Done"

Add to your config YAML:

external_tools:
  - reachy_tools
  - my_tool

See examples/weather_tool.py for a complete example.
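For intuition, the registry side of that contract can be approximated in a few lines (a hypothetical sketch; the real registry lives in jetson-assistant and also records parameter schemas for the LLM):

```python
import importlib

class ToolRegistry:
    """Minimal stand-in: maps tool names to (description, function)."""
    def __init__(self):
        self.tools = {}

    def register(self, description):
        def decorator(fn):
            self.tools[fn.__name__] = (description, fn)
            return fn
        return decorator

def load_external_tools(registry, module_names):
    # Each module listed under external_tools must expose register_tools().
    for name in module_names:
        importlib.import_module(name).register_tools(registry)
```

The decorator pattern is why a single function plus `register_tools` is all a plugin needs.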

PersonaPlex Mode (Full-Duplex Conversation)

Run PersonaPlex (Moshi 7B speech-to-speech) with audio-reactive Reachy Mini animations. No tool calling — the robot breathes, tilts attentively while listening, and sways its head in sync with PersonaPlex's speech.

Browser ──WebSocket──► PersonaPlex (Jetson GPU)
                           │ on_audio_frame
                           ▼
                       MotionManager (50Hz)
                           │ WebSocket (reachy-mini SDK)
                           ▼
                       Reachy Mini (sim or real)

Quick Start

Terminal 1 — Robot daemon (laptop for sim, or real robot):

pip install "reachy-mini[mujoco]"
reachy-mini-daemon --sim --scene minimal

Terminal 2 — PersonaPlex + Reachy bridge (Jetson Thor):

python scripts/personaplex_reachy.py \
    --personaplex-dir ~/personaplex-oss \
    --port 8998 --fp8 \
    --reachy-host <laptop-ip>

Open the PersonaPlex Web UI at https://<jetson-ip>:8998 and start talking. The robot reacts in real time:

  • Breathing — subtle idle animation (always on)
  • Listening pose — attentive head tilt when you speak
  • Audio-reactive sway — head and antenna movement synced to PersonaPlex's speech
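The audio-reactive sway boils down to mapping frame energy to a smoothed head angle (an illustrative sketch, not the real MotionManager; the gain and limits are made-up values):

```python
import math

def frame_rms(samples):
    # Energy of one audio frame (samples in [-1, 1]).
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sway_angle(rms, prev=0.0, gain=40.0, max_deg=15.0, smooth=0.8):
    # Clamp the target angle, then exponentially smooth so the
    # 50 Hz update loop doesn't jitter the head.
    target = min(max_deg, gain * rms)
    return smooth * prev + (1.0 - smooth) * target
```

Feeding each `on_audio_frame` callback through a mapping like this is what makes the head track the rhythm of the speech rather than twitch on every sample.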

Custom Backend (MuJoCo, Mock, etc.)

The bridge module is importable for custom integrations:

from personaplex_bridge import setup_motion_manager, create_audio_bridge

# Any object with goto_target(head, antennas, duration) works
mm = setup_motion_manager(lambda: your_mujoco_reachy)
callback = create_audio_bridge(mm)

# Pass to PersonaPlex ServerState
state = ServerState(..., on_audio_frame=callback)

Troubleshooting

Issue                         Fix
Can't connect to Reachy       Ensure reachy-mini-daemon --sim is running and REACHY_HOST points to it
vLLM not responding           Takes ~5 min to load. Check: curl http://localhost:8001/v1/models
vLLM restarts on first boot   Normal — CUDA graph allocation may OOM once. It auto-recovers.
No audio output               Verify /dev/snd is accessible: docker exec reachy-assistant python -m sounddevice
Camera in use                 Another process holds it — pkill -f jetson-assistant on the host
Container keeps restarting    Check logs: docker compose logs reachy

License

Apache 2.0 — See LICENSE

Acknowledgments

About

Voice + vision AI brain for Reachy Mini robot. Sub-second on NVIDIA Jetson Thor.
