A real-time music generation web application using pose detection and computer vision to turn body movements into musical expressions.
✅ Project Foundation Setup Complete
- Vite + React 18 + TypeScript - Modern development stack
- Tailwind CSS + Headless UI - Responsive UI framework
- ESLint + Prettier + Husky - Code quality and formatting
- Vitest + Testing Library - Comprehensive testing setup
- Zustand - State management
- React Query - Data fetching and caching
- MediaPipe (@mediapipe/holistic, @mediapipe/camera_utils, @mediapipe/drawing_utils)
- Tone.js - Web audio synthesis
- Development Tools - Complete toolchain for modern React development
src/
├── components/ # React UI components
│ ├── common/ # Reusable UI components (Button, LoadingSpinner)
│ ├── camera/ # Camera-related components
│ ├── audio/ # Audio-related components
│ └── dashboard/ # Main control panel components
├── core/ # Core functionality modules
│ ├── vision/ # Computer vision (MediaPipe integration)
│ ├── audio/ # Audio processing (Tone.js integration)
│ ├── mapping/ # Action-to-music mapping engine
│ └── performance/ # Performance monitoring
├── hooks/ # React hooks
├── stores/ # Zustand state management
├── types/ # TypeScript type definitions
├── utils/ # Utility functions and constants
└── styles/ # CSS and styling
- TypeScript - Strict configuration with path aliases
- Tailwind CSS - Custom theme with dark mode support
- Vite - Optimized build configuration
- ESLint/Prettier - Consistent code formatting
- Husky + lint-staged - Pre-commit hooks for code quality
- Vitest - Testing configuration with jsdom environment
- VS Code - Workspace settings and extension recommendations
Complete TypeScript definitions for:
- Common types - Base interfaces and utilities
- Pose detection - MediaPipe integration types
- Action recognition - Gesture and movement types
- Audio synthesis - Tone.js and Web Audio API types
- Mapping system - Action-to-music mapping types
- Performance monitoring - Metrics and optimization types
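As an illustration of how the action-recognition and mapping types might fit together, here is a minimal sketch. Every name in it (`ActionType`, `DetectedAction`, `MappingRule`, `resolveNote`) and every note assignment is hypothetical and not taken from the project's types/ directory.

```typescript
// Hypothetical action names, based on the gestures planned for the next phase.
type ActionType = 'fingerSnap' | 'fist' | 'mouthOpen';

// A recognized action with a detection confidence in [0, 1].
interface DetectedAction {
  type: ActionType;
  confidence: number;
  timestampMs: number;
}

// One mapping rule: the note an action triggers and the minimum confidence required.
interface MappingRule {
  note: string; // scientific pitch notation, e.g. "C4"
  minConfidence: number;
}

type MappingTable = Record<ActionType, MappingRule>;

// Example table; the note choices and thresholds are arbitrary placeholders.
const defaultMapping: MappingTable = {
  fingerSnap: { note: 'C4', minConfidence: 0.6 },
  fist: { note: 'E4', minConfidence: 0.7 },
  mouthOpen: { note: 'G4', minConfidence: 0.5 },
};

// Returns the note to play, or null when the action is below its confidence threshold.
function resolveNote(
  action: DetectedAction,
  table: MappingTable = defaultMapping,
): string | null {
  const rule = table[action.type];
  return action.confidence >= rule.minConfidence ? rule.note : null;
}
```

A non-null result could then be handed to a Tone.js synth, for example via `synth.triggerAttackRelease(note, '8n')`. Using `Record<ActionType, MappingRule>` makes the compiler flag any action that lacks a mapping.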
Zustand stores for:
- App Store - Global application state
- Camera Store - Camera device management
- Audio Store - Audio engine and synthesis state
- Node.js 16+
- Modern browser with WebRTC and Web Audio API support
npm install
npm run dev # Start development server
npm run build # Build for production
npm run test # Run tests
npm run lint # Lint code
npm run format # Format code
npm run test # Run tests
npm run test:ui # Run tests with UI
npm run test:coverage # Run tests with coverage
The foundation is complete. The next phase will implement:
- MediaPipe Integration (45 minutes)
  - Core engine initialization
  - Camera permission handling
  - Holistic model loading
  - Real-time video processing
- Pose Detection System (60 minutes)
  - Basic pose detection
  - Landmark data processing
  - Hand and face keypoint extraction
  - Data smoothing and filtering
- Action Recognition (45 minutes)
  - Finger snap detection
  - Fist gesture recognition
  - Mouth movement detection
  - Action debouncing and validation
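Two of the planned items, data smoothing and action debouncing, can be sketched without any library dependency. The code below is one possible approach, not the project's implementation: it uses an exponential moving average for smoothing and a cooldown-style debounce (the first trigger fires, repeats inside the window are dropped). The class names, the default `alpha` of 0.5, and the 300 ms cooldown are all assumptions chosen for illustration.

```typescript
// A single landmark with x/y/z coordinates, as pose models commonly report them.
interface Landmark {
  x: number;
  y: number;
  z: number;
}

// Exponential moving average over landmark arrays.
// alpha near 1 follows the raw signal closely; alpha near 0 smooths heavily.
class LandmarkSmoother {
  private alpha: number;
  private previous: Landmark[] | null = null;

  constructor(alpha: number = 0.5) {
    this.alpha = alpha;
  }

  smooth(current: Landmark[]): Landmark[] {
    const prev = this.previous;
    // First frame, or the landmark count changed: nothing to blend with.
    if (prev === null || prev.length !== current.length) {
      const copy = current.map((p) => ({ ...p }));
      this.previous = copy;
      return copy;
    }
    const a = this.alpha;
    const next = current.map((p, i) => ({
      x: a * p.x + (1 - a) * prev[i].x,
      y: a * p.y + (1 - a) * prev[i].y,
      z: a * p.z + (1 - a) * prev[i].z,
    }));
    this.previous = next;
    return next;
  }
}

// Suppresses repeated triggers of the same action within a cooldown window.
class ActionDebouncer {
  private cooldownMs: number;
  private lastFired = new Map<string, number>();

  constructor(cooldownMs: number = 300) {
    this.cooldownMs = cooldownMs;
  }

  // Returns true when the action should fire at time nowMs.
  shouldFire(action: string, nowMs: number): boolean {
    const last = this.lastFired.get(action);
    if (last !== undefined && nowMs - last < this.cooldownMs) {
      return false;
    }
    this.lastFired.set(action, nowMs);
    return true;
  }
}
```

In a frame loop, each set of detected landmarks would pass through `smooth` before gesture checks, and a detected gesture would only trigger sound when `shouldFire` returns true. Passing the timestamp in explicitly, rather than reading the clock inside the class, keeps the debouncer easy to unit test.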
For detailed technical specifications and implementation guidelines, see DEVELOPMENT.md.
The project includes a comprehensive testing setup:
- Unit tests for components and utilities
- Integration tests for component interactions
- Mock setup for MediaPipe and Web Audio APIs
- Test coverage reporting
Environment variables are configured in .env:
- MediaPipe model paths and settings
- Audio engine configuration
- Performance monitoring settings
- Debug and development options
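Vite only exposes variables prefixed with `VITE_` to client code through `import.meta.env`, and their values arrive as strings. The sketch below shows one way such values might be parsed into a typed config object with defaults. The variable names (`VITE_MEDIAPIPE_MODEL_PATH`, `VITE_AUDIO_SAMPLE_RATE`, `VITE_DEBUG`), the `AppConfig` shape, and the default values are hypothetical; the project's actual .env keys may differ.

```typescript
// Shape of the parsed configuration; field names are illustrative only.
interface AppConfig {
  modelPath: string;
  sampleRate: number;
  debug: boolean;
}

// Environment values are strings, or undefined when a variable is not set.
type RawEnv = Record<string, string | undefined>;

// Parses a numeric variable, falling back when it is missing, blank, or not a finite number.
function parseNumber(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === '') {
    return fallback;
  }
  const n = Number(value);
  return Number.isFinite(n) ? n : fallback;
}

// Builds a typed config from raw environment values (hypothetical keys and defaults).
function parseConfig(env: RawEnv): AppConfig {
  return {
    modelPath: env.VITE_MEDIAPIPE_MODEL_PATH ?? '/models',
    sampleRate: parseNumber(env.VITE_AUDIO_SAMPLE_RATE, 44100),
    debug: env.VITE_DEBUG === 'true',
  };
}
```

In a Vite app this would typically be called once at startup with `import.meta.env`. Taking the environment as a parameter keeps the parser testable outside the browser.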
- TypeScript strict mode for type safety
- ESLint + Prettier for code consistency
- Conventional commits for clear git history
- Pre-commit hooks for code quality assurance
- Component-first architecture for maintainability
Status: ✅ Stage 1 Complete - Ready for Stage 2 Development
Next: Core Vision System Implementation