OpenCanvas

OpenCanvas is a FastAPI-based backend and demo client for interactive 3D graph visualization, AI-assisted graph generation, Mermaid import, and gesture-driven camera control via MediaPipe.

The repository combines four layers in one project:

  • a backend API for graph state, validation, and generation
  • WebSocket channels for realtime graph and control events
  • a browser demo renderer based on three.js and 3d-force-graph
  • a MediaPipe gesture sender that turns hand movement into normalized control data

What The Project Does

The service lets you:

  • generate a graph from a natural-language prompt through an LLM provider
  • build a graph from a Mermaid subset without calling the LLM
  • import and export graph snapshots as JSON
  • stream normalized control messages over WebSocket
  • drive a 3D graph camera with hand gestures
  • separate low-latency control traffic from graph and status traffic

This makes the project useful both as a standalone demo and as a backend for external renderers such as browser clients, desktop tools, game engines, or TouchDesigner.

Main Capabilities

  • FastAPI HTTP API for graph lifecycle and service introspection
  • realtime WebSocket transport with a unified socket and dedicated sockets
  • versioned JSON message contract in schemas/messages.schema.json
  • LLM generation with support for POLZA_* or OPENAI_* environment variables
  • deterministic fallback and mock mode for local development and tests
  • Mermaid-to-graph conversion for fast manual prototyping
  • control stream smoothing, dead-zone filtering, and rate limiting
  • MediaPipe-based gesture capture with support for the current Tasks API flow
  • integration tests for HTTP, WebSocket, import/apply, and generation flows

Repository Structure

app/
  main.py                     FastAPI app, endpoints, websocket hubs, runtime limits
  llm_service.py              LLM integration and graph generation logic
  mediapipe_gesture_sender.py MediaPipe hand tracking -> control payloads
  control_processing.py       Control stream filtering and throttling
  mermaid_parser.py           Mermaid subset parsing
  models.py                   Pydantic models and schema version
static/
  index.html                  Browser demo UI and 3D renderer
docs/
  API_REFERENCE.md
  MEDIAPIPE_SETUP.md
  CLIENT_ADAPTERS.md
  TOUCHDESIGNER_INTEGRATION.md
schemas/
  messages.schema.json        Versioned websocket message contract
scripts/
  run_server.ps1              Helper for choosing a free port and starting uvicorn
  download_hand_landmarker_model.py
tests/
  integration/                End-to-end API workflow tests
  test_mediapipe_gesture_sender.py
models/
  hand_landmarker.task        MediaPipe task model storage location

Architecture Overview

1. Backend service

The backend stores the current graph in memory, validates payloads with Pydantic, exposes system metadata, and broadcasts graph/status/error events to connected clients.

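Key files: app/main.py (endpoints, WebSocket hubs, runtime limits) and app/models.py (Pydantic models and schema version).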

2. Graph generation layer

Graphs can arrive from three sources:

  • POST /api/graph/generate for LLM-driven graph generation
  • POST /api/graph/mermaid for Mermaid subset parsing
  • POST /api/graph/apply for externally supplied JSON payloads

The LLM service can call a provider directly or run in mock mode for local work. It is configured through environment variables:

  • POLZA_API_KEY or OPENAI_API_KEY
  • optional POLZA_BASE_URL / OPENAI_BASE_URL
  • optional POLZA_MODEL / OPENAI_MODEL
  • LLM_MODE=mock for deterministic local testing
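
As a rough illustration of how these variables could be resolved, here is a minimal sketch. The `LLMSettings` class and `resolve_llm_settings` function are hypothetical names, not taken from app/llm_service.py, and the rule that POLZA_* wins when both keys are set is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    mode: str                  # "auto" by default, "mock" for local testing
    api_key: Optional[str]
    base_url: Optional[str]
    model: Optional[str]


def resolve_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read provider settings from POLZA_* or OPENAI_* variables.

    Assumption: the first prefix with an API key wins, and POLZA is checked first.
    """
    mode = env.get("LLM_MODE", "auto").strip().lower()
    for prefix in ("POLZA", "OPENAI"):
        api_key = env.get(f"{prefix}_API_KEY")
        if api_key:
            # Take base URL and model from the same prefix so settings never mix.
            return LLMSettings(
                mode=mode,
                api_key=api_key,
                base_url=env.get(f"{prefix}_BASE_URL"),
                model=env.get(f"{prefix}_MODEL"),
            )
    # No credentials found: only mock mode can work without a provider.
    return LLMSettings(mode=mode, api_key=None, base_url=None, model=None)
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real environment.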

3. Realtime transport

The project exposes three WebSocket entry points:

  • WS /ws for combined traffic
  • WS /ws/control for high-rate control only
  • WS /ws/events for graph, status, and error events

This split is especially useful when a renderer must keep control latency low.
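
A client can encode the split as a small lookup from message type to socket path. The sketch below is only an illustration: the paths come from the list above, while `SPLIT_PATHS` and `socket_url_for` are hypothetical helper names, and the default host and port simply mirror the Quick Start command.

```python
# Paths are taken from the endpoint list above; the helper itself is illustrative.
SPLIT_PATHS = {
    "control": "/ws/control",
    "graph": "/ws/events",
    "status": "/ws/events",
    "error": "/ws/events",
}


def socket_url_for(message_type: str, host: str = "127.0.0.1",
                   port: int = 8000, split: bool = True) -> str:
    """Return the WebSocket URL a client would use for one message type."""
    if message_type not in SPLIT_PATHS:
        raise ValueError(f"unknown message type: {message_type!r}")
    # With split=False everything shares the combined /ws socket.
    path = SPLIT_PATHS[message_type] if split else "/ws"
    return f"ws://{host}:{port}{path}"
```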

4. Gesture input

app/mediapipe_gesture_sender.py captures a webcam feed, detects a hand, derives orientation/openness/pinch states, and sends normalized values:

  • rotateX
  • rotateY
  • zoomDelta
  • confidence
  • resetCamera

The sender can use either:

  • mp.solutions when exposed by the installed MediaPipe build
  • the newer Tasks API with models/hand_landmarker.task
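
The choice between the two paths could look roughly like the sketch below. `choose_hand_backend` is a hypothetical helper, not the function used in app/mediapipe_gesture_sender.py, and preferring mp.solutions whenever it exists is an assumption. The MediaPipe module is passed in as an argument so the logic can be exercised without MediaPipe installed.

```python
from pathlib import Path


def choose_hand_backend(mp_module, model_path: Path) -> str:
    """Return "solutions" or "tasks" depending on what the environment offers.

    Assumption: mp.solutions is preferred when the installed build exposes it.
    """
    if hasattr(mp_module, "solutions"):
        return "solutions"
    if model_path.is_file():
        # The Tasks API needs the downloaded hand_landmarker.task bundle.
        return "tasks"
    raise FileNotFoundError(
        f"mp.solutions is unavailable and no model was found at {model_path}"
    )
```

In practice this would be called with the imported `mediapipe` module and `Path("models/hand_landmarker.task")`.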

Message Model

The service uses a versioned message schema. The current version is SCHEMA_VERSION = "1.0.0".

Outgoing messages include:

  • graph
  • status
  • error

Incoming realtime control messages include:

  • control

Example control message:

{
  "schemaVersion": "1.0.0",
  "type": "control",
  "requestId": "ctrl-123",
  "payload": {
    "rotateX": 0.12,
    "rotateY": -0.08,
    "zoomDelta": 0.03,
    "confidence": 0.94,
    "resetCamera": false
  }
}

The canonical contract lives in schemas/messages.schema.json.
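
For clients written in Python, a message with the shape shown above can be assembled as follows. This is a minimal sketch: `build_control_message` is a hypothetical helper, the generated requestId format is an assumption, and no range checks are applied because the valid ranges are defined by the schema file rather than by this README.

```python
import json
import uuid
from typing import Optional


def build_control_message(rotate_x: float, rotate_y: float, zoom_delta: float,
                          confidence: float, reset_camera: bool = False,
                          request_id: Optional[str] = None) -> str:
    """Serialize one control message using the field names from the example."""
    message = {
        "schemaVersion": "1.0.0",
        "type": "control",
        # Assumption: any unique string is acceptable as a request id.
        "requestId": request_id or f"ctrl-{uuid.uuid4().hex[:8]}",
        "payload": {
            "rotateX": rotate_x,
            "rotateY": rotate_y,
            "zoomDelta": zoom_delta,
            "confidence": confidence,
            "resetCamera": reset_camera,
        },
    }
    return json.dumps(message)
```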

Quick Start

Requirements

  • Windows with PowerShell (the primary workflow in this repo)
  • Python 3.12
  • webcam access for gesture control
  • optional LLM API key for prompt-based graph generation

1. Create and activate a virtual environment

py -3.12 -m venv .venv312
.\.venv312\Scripts\Activate.ps1

2. Install dependencies

pip install -r requirements.txt
pip install -r requirements-dev.txt

3. Configure environment variables

Minimal local .env example:

LLM_MODE=mock

Example with an external model provider:

OPENAI_API_KEY=your_key_here
OPENAI_MODEL=gpt-4o-mini

Or:

POLZA_API_KEY=your_key_here
POLZA_MODEL=openai/gpt-4o

4. Start the server

Using the helper script:

.\scripts\run_server.ps1

Or directly:

.\.venv312\Scripts\python.exe -m uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload --env-file .env

5. Open the demo UI

After startup, open:

http://127.0.0.1:8000/

The browser demo lets you:

  • generate a graph from a text prompt
  • parse Mermaid code into a graph
  • load the current graph from the server
  • export and import graph JSON
  • watch control and status diagnostics live

Gesture Control Setup

If your MediaPipe build needs the Tasks API model bundle, download it once:

python scripts\download_hand_landmarker_model.py

That stores the model at:

models/hand_landmarker.task

Then run the gesture sender:

python app\mediapipe_gesture_sender.py --uri ws://127.0.0.1:8000/ws --camera 0 --hz 45 --preview

Optional explicit model path:

python app\mediapipe_gesture_sender.py --model C:\path\to\hand_landmarker.task --preview

Gesture semantics

The demo is designed around these gestures:

  • turn the hand to rotate the graph
  • open or close the hand to zoom
  • pinch to lock movement
  • hold an OK gesture to reset the camera

API Summary

System endpoints

  • GET /health
  • GET /api/system/limits
  • GET /api/system/capabilities
  • GET /api/system/schema/messages

Graph endpoints

  • GET /api/graph/current
  • POST /api/graph/apply
  • POST /api/graph/generate
  • POST /api/graph/mermaid

WebSocket endpoints

  • WS /ws
  • WS /ws/control
  • WS /ws/events

Detailed endpoint examples are documented in docs/API_REFERENCE.md.

Runtime Configuration

The backend exposes several useful tuning variables:

Variable                     Default  Purpose
GENERATE_RATE_LIMIT_PER_MIN  20       per-minute limit for LLM generation
MERMAID_RATE_LIMIT_PER_MIN   30       per-minute limit for Mermaid parsing
APPLY_RATE_LIMIT_PER_MIN     20       per-minute limit for external graph apply
GENERATE_MAX_CONCURRENCY     2        concurrent generation jobs
GENERATE_TIMEOUT_SECONDS     45       timeout for LLM generation
GENERATE_QUEUE_WAIT_SECONDS  2.5      wait time to enter the generation queue
CONTROL_ALPHA                0.28     smoothing factor for control processing
CONTROL_DEAD_ZONE            0.015    ignore tiny control fluctuations
CONTROL_MAX_HZ               30.0     maximum outgoing control event rate
MEDIAPIPE_HAND_MODEL         unset    explicit path to hand_landmarker.task
LLM_MODE                     auto     generation mode, including mock for tests
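
To show how the three CONTROL_* values might interact, here is a minimal sketch of a single-axis filter. It assumes exponential smoothing, a dead zone applied to the raw input, and a simple minimum-interval throttle. The `ControlFilter` class is hypothetical, and the actual logic in app/control_processing.py may differ.

```python
from typing import Optional


class ControlFilter:
    """Dead zone, exponential smoothing, and rate limiting for one control axis."""

    def __init__(self, alpha: float = 0.28, dead_zone: float = 0.015,
                 max_hz: float = 30.0) -> None:
        self.alpha = alpha
        self.dead_zone = dead_zone
        self.min_interval = 1.0 / max_hz
        self._smoothed = 0.0
        self._last_emit: Optional[float] = None

    def process(self, value: float, now: float) -> Optional[float]:
        """Return the smoothed value, or None when the sample is throttled."""
        # Treat tiny inputs as zero so sensor jitter does not move the camera.
        if abs(value) < self.dead_zone:
            value = 0.0
        # Exponential moving average: higher alpha reacts faster.
        self._smoothed = self.alpha * value + (1.0 - self.alpha) * self._smoothed
        # Drop the sample if the previous emit was too recent.
        if self._last_emit is not None and now - self._last_emit < self.min_interval:
            return None
        self._last_emit = now
        return self._smoothed
```

The timestamp is passed in explicitly (for example from `time.monotonic()`) so the throttle can be tested without sleeping.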

Development Workflow

Run tests

.\.venv312\Scripts\python.exe -m pytest

The test suite covers:

  • API health and runtime metadata
  • graph retrieval and apply flow
  • graph generation and validation errors
  • Mermaid generation flow
  • gesture mapping behavior and reset logic

Recommended local mode

For stable local development without external API calls:

LLM_MODE=mock

This keeps generation deterministic and avoids needing provider credentials while working on the transport, schema, UI, or gesture pipeline.

Integration Notes

Browser / desktop / game engine

Any client that supports HTTP, WebSocket, and JSON can integrate with this backend. A practical pattern is:

  1. Call /health, /api/system/capabilities, and /api/system/limits.
  2. Hydrate the latest graph via GET /api/graph/current.
  3. Open WebSocket subscriptions.
  4. Trigger graph generation or apply external graph snapshots.
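
The first two steps can be sketched with the standard library alone. This example assumes every listed endpoint returns a JSON object, which the README does not state explicitly. `http_get_json` and `bootstrap` are hypothetical helper names.

```python
import json
from typing import Callable
from urllib.request import urlopen


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the body as JSON."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def bootstrap(base_url: str,
              get_json: Callable[[str], dict] = http_get_json) -> dict:
    """Collect service metadata and the current graph before opening sockets."""
    endpoints = (
        ("health", "/health"),
        ("capabilities", "/api/system/capabilities"),
        ("limits", "/api/system/limits"),
        ("graph", "/api/graph/current"),
    )
    base = base_url.rstrip("/")
    # Requests run in the order given by the integration pattern above.
    return {name: get_json(base + path) for name, path in endpoints}
```

Against a running server this would be called as `bootstrap("http://127.0.0.1:8000")`. The injectable `get_json` parameter lets a client swap in its own HTTP stack.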

TouchDesigner

For the best latency split:

  • use WS /ws/control for high-rate control data
  • use WS /ws/events for graph, status, and error
  • keep the final camera math and render loop inside TouchDesigner

See docs/TOUCHDESIGNER_INTEGRATION.md.

Demo Frontend

The demo page in static/index.html is intentionally lightweight. It is useful as:

  • a smoke test for backend connectivity
  • a visual debugger for graph payloads
  • a manual QA surface for gestures, import/export, and Mermaid parsing
  • a reference client for external integrations

The page uses CDN-delivered:

  • three.js
  • 3d-force-graph

Known Operational Notes

  • The graph is stored in process memory, so restarting the server discards the current graph.
  • LLM generation can be rate-limited, queued, or timed out depending on configuration.
  • MediaPipe support can differ by build, which is why the project includes a Tasks API fallback path and model download script.
  • The current frontend is a demo surface, not a fully packaged production SPA.
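
To make the queueing and timeout note concrete, the sketch below shows one way a concurrency cap, a queue wait, and a job timeout can be combined with asyncio. It is an assumption about the general technique, not the code in app/main.py. `GenerationGate` is a hypothetical name, and the defaults mirror GENERATE_MAX_CONCURRENCY, GENERATE_QUEUE_WAIT_SECONDS, and GENERATE_TIMEOUT_SECONDS.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class GenerationGate:
    """Limit concurrent jobs, bound the wait for a slot, and time out slow jobs."""

    def __init__(self, max_concurrency: int = 2, queue_wait: float = 2.5,
                 timeout: float = 45.0) -> None:
        self._slots = asyncio.Semaphore(max_concurrency)
        self.queue_wait = queue_wait
        self.timeout = timeout

    async def run(self, job: Callable[[], Awaitable[T]]) -> T:
        try:
            # Give up if no slot frees up within the queue wait window.
            await asyncio.wait_for(self._slots.acquire(), timeout=self.queue_wait)
        except asyncio.TimeoutError:
            raise RuntimeError("generation queue is busy") from None
        try:
            # A job that exceeds the timeout is cancelled and the error propagates.
            return await asyncio.wait_for(job(), timeout=self.timeout)
        finally:
            self._slots.release()
```

A server would typically translate the two failure cases into distinct error responses so clients can tell a busy queue from a slow provider.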

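Documentation Map

  • docs/API_REFERENCE.md
  • docs/MEDIAPIPE_SETUP.md
  • docs/CLIENT_ADAPTERS.md
  • docs/TOUCHDESIGNER_INTEGRATION.md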

Suggested Positioning

This repository works well as:

  • a backend transport layer for interactive graph canvases
  • a prototype platform for AI-generated graph structures
  • a gesture-control bridge for realtime 3D visualization tools
  • a reference implementation for splitting generation, state, and control into separate channels

License

No explicit license file is currently present in the repository. Add one before public redistribution if you want the usage terms to be unambiguous.
