Matcha-AI-DTU is a comprehensive AI-powered sports video analysis platform. It processes sports footage (specifically soccer/football), tracks the ball and players, automatically detects goals and events using SoccerNet + custom vision models, generates insightful commentary using Google Gemini LLMs, synthesizes high-quality audio voiceovers using a 3-tier neural TTS pipeline (Kokoro-82M → edge-tts → silence), and builds a complete analytics suite including player heatmaps, ball speed estimation, and team colour detection.
This is a monorepo built with Next.js, NestJS, and FastAPI, orchestrated with Docker and Turborepo.
The system is broken down into the following apps, services, and shared packages:
```mermaid
graph TD
    subgraph "Apps"
        Web["apps/web (Next.js)"]
        Mobile["apps/mobile (Expo)"]
    end
    subgraph "Core & Tooling"
        Shared["packages/shared (ApiClient/Utils)"]
        Database["packages/database (Prisma Client)"]
        Env["packages/env (Strict Validation)"]
        Contracts["packages/contracts (Zod Schemas)"]
        UI["packages/ui (Standardized Components)"]
        Theme["packages/theme (Design Tokens)"]
    end
    subgraph "Services"
        Orchestrator["services/orchestrator (NestJS Hub)"]
        Inference["services/inference (FastAPI Engine)"]
    end
    subgraph "Data Layer"
        PG["PostgreSQL (Storage)"]
        RDS["Redis (Cache/PubSub)"]
    end
    Web -.-> UI
    Web -.-> Theme
    Web -.-> Shared
    Mobile -.-> Shared
    Orchestrator ==> Database
    Orchestrator ==> Contracts
    Orchestrator ==> Env
    Web ==> Orchestrator
    Mobile ==> Orchestrator
    Orchestrator ==> Inference
    Database --- PG
    Orchestrator --- RDS
```
- Frontend Web (apps/web): A modern Next.js 15 interface. Consumes `@matcha/ui` for standardized components.
- Frontend Mobile (apps/mobile): An Expo (React Native) app sharing business logic via `@matcha/shared`.
- Orchestrator (services/orchestrator): NestJS backend for API orchestration and database management.
- Shared Tooling (packages/*):
  - `@matcha/ui`: React component system (VideoPlayer, MatchReportPDF, ScoreBadge).
  - `@matcha/theme`: Global design system (Tailwind config, brand colors, fonts).
  - `@matcha/shared`: Universal API client, WebSocket registries, and shared logic utilities.
  - `@matcha/database`: Shared Prisma schema and generated client.
  - `@matcha/contracts`: Centralized Zod validation schemas for all API payloads.
  - `@matcha/env`: Strict, boot-time environment variable validation via T3-Env.
- Inference Engine (services/inference): Python FastAPI AI pipeline (YOLO, SoccerNet, Gemini).
- Automated Video Analysis: Upload raw sports footage and let the system automatically analyze the content through a 5-phase AI pipeline.
- Advanced Goal Detection: A custom `GoalDetectionEngine` built with Kalman filtering, homography projection, and a finite state machine that auto-calibrates to the goal line, tracks the ball across frames, and confirms goals with high precision.
- Action Event Recognition: A SoccerNet-trained model detects 11 event types: GOAL, SAVE, TACKLE, FOUL, CORNER, YELLOW_CARD, RED_CARD, PENALTY, OFFSIDE, CELEBRATION, HIGHLIGHT. It falls back to motion-peak detection when SoccerNet returns too few results.
- AI Commentary Generation: Context-aware, natural-sounding commentary (40–60 words per event) generated by Gemini 2.0 Flash, describing the on-field action with late-game intensity boosts.
- 3-Tier Neural TTS Pipeline:
  1. Kokoro-82M (`hexgrad/Kokoro-82M`) – #1 ranked in TTS-Spaces-Arena, British female sports voice
  2. Microsoft edge-tts (`en-GB-RyanNeural`) – British male broadcaster, always available, no key needed
  3. FFmpeg silent audio – absolute fallback
- Real-Time Progress Tracking: WebSocket integration provides users with live updates across the 5-phase analysis pipeline, including per-event streaming as events are detected.
- Player Heatmap: OpenCV-rendered top-down pitch diagram showing where each team concentrated their play throughout the entire match, with Gaussian-blurred density overlays coloured by auto-detected jersey colour.
- Ball Speed Estimation: YOLO ball tracking data is used to estimate the peak ball speed across the match (95th percentile of pixel delta → km/h). Displayed in the Analytics tab.
- Team Colour Detection: NumPy K-Means clustering on jersey crops automatically identifies which colour belongs to which team. Displayed as hex swatches.
- Highlight Reel Generation: Top-N non-overlapping clips are cut, overlaid with scrolling text, mixed with TTS commentary + crowd ambience + background music, and concatenated into a single MP4 via FFmpeg.
- Context Score Intelligence: Every event is scored 0–10 via a weighted formula balancing event type importance, motion intensity, temporal position, and detection confidence. Late goals are doubly weighted.
- Shared Architecture: A dedicated `@matcha/shared` package centralizes TypeScript interfaces, API clients, WebSocket message registries, and formatters for universal type safety across web, mobile, and orchestrator.
- Secure Authentication: Built-in JWT-based authentication via NestJS AuthModule and bcrypt, linking analyzed matches to distinct user accounts.
- Universal Design & PDF Generation: Reusable Tailwind-driven React components (`ScoreBadge`, `CopyButton`) with a global design theme, plus an integrated `@react-pdf/renderer` module that instantly generates rich, visual Match Reports as downloadable PDFs.
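The ball-speed estimate above can be sketched in a few lines. The 95th-percentile step comes from the feature list, but the calibration constants (`fps`, `metres_per_pixel`) and the function name are illustrative assumptions, not the pipeline's actual code:

```python
import numpy as np

def peak_ball_speed_kmh(centers, fps=30.0, metres_per_pixel=0.05):
    """Estimate peak ball speed from per-frame (x, y) pixel centres.

    `fps` and `metres_per_pixel` are assumed calibration constants; the
    real pipeline's pixel-to-metres conversion is not documented here.
    """
    pts = np.asarray(centers, dtype=float)
    if len(pts) < 2:
        return 0.0
    # Per-frame displacement in pixels; take a robust "peak" via the
    # 95th percentile instead of the raw maximum.
    deltas = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    p95 = np.percentile(deltas, 95)
    return p95 * metres_per_pixel * fps * 3.6  # m/s -> km/h
```

Using the 95th percentile rather than the maximum keeps the estimate robust to single-frame tracking glitches.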
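Team colour detection as described (NumPy K-Means over jersey crops) can be approximated with a minimal Lloyd's-algorithm sketch; the sampling strategy, `k`, initialization, and iteration count here are assumptions, not the real implementation:

```python
import numpy as np

def dominant_team_colours(pixels, k=2, iters=20, seed=0):
    """Cluster RGB pixel samples into k dominant colours (hex strings).

    `pixels` is an (N, 3) array of RGB values sampled from jersey crops.
    """
    X = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centres from k random samples (illustrative choice).
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each pixel to its nearest centre, then recompute centres.
        dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    # Emit each centre as a hex swatch, as shown in the Analytics tab.
    return ["#%02x%02x%02x" % tuple(int(np.clip(v, 0, 255)) for v in c)
            for c in centres]
```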
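The context score combines the four factors named in the feature list. Since this README does not publish the actual coefficients, the weights below are illustrative placeholders that only preserve the stated behaviour (a 0–10 range with late goals doubly weighted):

```python
# Illustrative importance weights; the real coefficients are not
# published in this README.
EVENT_IMPORTANCE = {
    "GOAL": 1.0, "PENALTY": 0.9, "RED_CARD": 0.85, "SAVE": 0.7,
    "CELEBRATION": 0.6, "YELLOW_CARD": 0.5, "HIGHLIGHT": 0.5,
    "TACKLE": 0.45, "FOUL": 0.4, "CORNER": 0.35, "OFFSIDE": 0.3,
}

def context_score(event_type: str, motion: float, position: float,
                  confidence: float) -> float:
    """Score an event 0-10 from type importance, motion intensity,
    temporal position (fraction of match elapsed, 0-1) and confidence."""
    base = (0.5 * EVENT_IMPORTANCE.get(event_type, 0.3)
            + 0.2 * motion
            + 0.1 * position
            + 0.2 * confidence)            # weighted blend in [0, 1]
    if event_type == "GOAL" and position >= 0.8:
        base *= 2.0                        # "late goals are doubly weighted"
    return round(min(base, 1.0) * 10, 1)   # clamp, then scale to 0-10
```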
For a full step-by-step beginner walkthrough, see SETUP.md.
| Tool | Minimum Version | Install / Notes |
|---|---|---|
| Node.js | 18+ | https://nodejs.org (choose LTS) |
| Python | 3.9+ | https://www.python.org/downloads/ |
| Docker Desktop | Latest | https://www.docker.com/products/docker-desktop/ |
| FFmpeg | Latest | https://ffmpeg.org/download.html |
| NVIDIA GPU + CUDA | 12.4 (optional) | Boosts YOLO inference speed |
⚠️ Python Note: The local inference engine has been tested and confirmed working on Python 3.9 (the macOS system default). Python 3.10+ also works. Do not use 3.7 or 3.8.
```bash
git clone https://github.com/YOUR-ORG/Matcha-AI-DTU.git
cd Matcha-AI-DTU

# Create the uploads directory (git-ignored, must be created manually)
mkdir uploads
```

Make sure Docker Desktop is open and running, then:

```bash
docker-compose up -d --build
```

This starts:
- PostgreSQL on port `5433` (not 5432 – intentional, to avoid conflicts with a local Postgres)
- Redis on port `6380`
Verify containers are up:

```bash
docker ps
# Both matcha_postgres and matcha_redis should show "Up"
```

Next, install dependencies and build the workspace:

```bash
# Install all monorepo packages
npm install

# Build all shared TypeScript packages (@matcha/env, @matcha/ui, @matcha/shared, etc.)
npx turbo run build
```

Then apply the database schema:
```bash
cd services/orchestrator
npx prisma migrate deploy   # use 'migrate dev' on a fresh local DB
cd ../..
```

Create `services/orchestrator/.env`:
```bash
PORT=4000
CORS_ORIGIN=http://localhost:3000
DATABASE_URL="postgresql://matcha_user:matcha_password@localhost:5433/matcha_db?schema=public"
INFERENCE_URL=http://localhost:8000
GEMINI_API_KEY=your_google_ai_studio_key   # https://aistudio.google.com/app/apikey
HF_TOKEN=hf_your_token_here                # optional, from huggingface.co/settings/tokens
```

`apps/web/.env.local`:
```bash
NEXT_PUBLIC_API_URL=http://localhost:4000/api/v1
```

`services/inference/.env`:
```bash
ORCHESTRATOR_URL=http://localhost:4000/api/v1
GEMINI_API_KEY=your_google_ai_studio_key
HF_TOKEN=hf_your_token_here
```

```bash
cd services/inference
```

macOS / Linux:

```bash
python3 -m venv venv
source venv/bin/activate
```

Windows (PowerShell):
```powershell
py -3.11 -m venv venv
.\venv\Scripts\activate
```

Then install dependencies:
```bash
pip install --upgrade pip

# CPU only (works on all machines):
pip install -r requirements.txt

# OR – with CUDA 12.4 GPU acceleration (NVIDIA only):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
```

Go back to the project root:

```bash
cd ../..
```

One command starts everything – Frontend, Orchestrator, AND the Python Inference Engine:
```bash
# From project root
npx turbo run dev
```

This works because `services/inference/package.json` now has a `dev` script that calls `./venv/bin/python -m uvicorn ...` directly – no shell activation needed. Turbo runs all three services concurrently and streams their logs together.
First run only: YOLO will automatically download model weights (~22 MB for `yolov8s-pose.pt` and ~6 MB for `yolov8n.pt`). This takes ~30 seconds with a good internet connection and is cached for all future runs.
Windows users: The `dev` script uses `./venv/bin/python`, which is the macOS/Linux path. On Windows, run the inference engine manually:

```powershell
cd services/inference
.\venv\Scripts\activate
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Or run `npm run dev:win` from inside `services/inference`.
If you prefer separate terminals:
| Terminal | Command | Port |
|---|---|---|
| Frontend | `cd apps/web && npm run dev` | 3000 |
| Orchestrator | `cd services/orchestrator && npm run dev` | 4000 |
| Inference (macOS/Linux) | `cd services/inference && ./venv/bin/python -m uvicorn main:app --port 8000 --reload` | 8000 |
| Inference (Windows) | `cd services/inference && .\venv\Scripts\python.exe -m uvicorn main:app --port 8000` | 8000 |
```bash
# Check Orchestrator is healthy
curl http://localhost:4000/api/v1/health
# Expected: {"status":"ok","service":"orchestrator",...}

# Check Inference is alive
curl http://localhost:8000/
```

Then open http://localhost:3000 (or 3001 if 3000 was busy – check terminal output).
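If you prefer scripting the check, here is a small Python sketch that validates the orchestrator health payload shown above. The helper name is our own invention, and the live call is left as a comment so the snippet also runs offline:

```python
import json

def orchestrator_healthy(body: bytes) -> bool:
    """Return True when the payload looks like {"status": "ok", ...}."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"

# Live usage (requires the stack from the previous steps to be running):
#   from urllib.request import urlopen
#   with urlopen("http://localhost:4000/api/v1/health", timeout=5) as r:
#       print("orchestrator OK" if orchestrator_healthy(r.read()) else "unhealthy")
```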
```
Matcha-AI-DTU/
├── apps/
│   ├── web/                 # Next.js 15 Web App
│   └── mobile/              # Expo (React Native) Mobile App
├── packages/
│   └── shared/              # @matcha/shared – Types, Client, Utils
├── services/
│   ├── orchestrator/        # NestJS API (Port 4000)
│   │   └── prisma/          # Database schema & migrations
│   └── inference/           # Python FastAPI AI Pipeline (Port 8000)
│       ├── app/core/        # Pipeline modules (analysis, heatmap, goal detection)
│       └── AI_PIPELINE.md   # Detailed 5-phase pipeline documentation
├── uploads/                 # Global video/audio/heatmap assets (shared)
├── docs/
│   ├── ARCHITECTURE.md      # Full system architecture
│   ├── API_REFERENCE.md     # HTTP & WebSocket API contracts
│   └── CONTRIBUTING.md      # Contribution guidelines & dev setup
└── docker-compose.yml       # PostgreSQL & Redis infrastructure
```
| Document | Description |
|---|---|
| docs/ARCHITECTURE.md | Full system architecture – data flow, DB schema, analytics pipeline, TTS tiers, frontend tree, env var reference |
| docs/API_REFERENCE.md | Complete HTTP + WebSocket API contracts with request/response bodies, event type table, and score formula |
| docs/CONTRIBUTING.md | Contributing guidelines, local dev setup, adding new features, mobile responsiveness guide |
| services/inference/AI_PIPELINE.md | Deep-dive into all 5 pipeline phases: YOLO, SoccerNet, Gemini, TTS, and analytics |
- Ports already in use: Use `netstat -ano | Select-String ":3000"` (PowerShell) to find and kill the conflicting process.
- Postgres/Redis connection failures: Ensure the Docker daemon is running and the containers haven't exited (`docker ps -a`).
- Python import errors (TTS/Torch): Double-check that your `venv` is activated, then run `pip install -r requirements.txt` again.
- Analysis stuck at 0%: Ensure `ORCHESTRATOR_URL` is correct in the inference `.env`; the callback route may be unreachable. Check the Python terminal for error logs.
- Kokoro TTS not working: Set `HF_TOKEN` in `services/inference/.env`. Without it, anonymous Hugging Face rate limits may block Kokoro – the system will automatically fall back to `edge-tts`.
- Heatmap not showing in Analytics tab: The heatmap is only generated if YOLO detected players during the analysis. Re-analyze the match and check the inference logs for `Heatmap saved` or `Heatmap generation failed:`.
- `prisma generate` fails with EPERM: The orchestrator is running and has the Prisma DLL locked. Stop the service, run `npx prisma generate`, then restart.
Built for Matcha-AI-DTU