A Python application that performs real-time object detection, segmentation, classification, and pose estimation on Twitch streams using YOLO models. The application supports low-latency streaming with audio playback capabilities.
In practice, my laptop with an NVIDIA GeForce GTX 1650 and an Intel i5 handles a real-time stream at 360p/30 FPS with sound, using the YOLO nano model.
- Real-time video processing using YOLO11 (YOLO v11) models
- Multiple detection tasks: object detection, segmentation, classification, and pose estimation
- Configurable confidence threshold
- Low-latency Twitch stream capture with configurable quality
- Optional audio playback
- Multi-threaded processing for improved performance
- CUDA acceleration support
- Configurable model sizes (nano, small, medium, large, xlarge)
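As a rough illustration of how the task and size options could combine, the sketch below maps them to the pretrained weight filenames that Ultralytics publishes for YOLO11 (for example `yolo11n.pt` or `yolo11s-seg.pt`). The helper `model_filename` and the two lookup tables are hypothetical names introduced here; the application may resolve its models differently.

```python
# Hypothetical helper, not taken from this project's code.
# Maps the size and task names listed above to Ultralytics YOLO11 weight filenames.
SIZE_CODES = {"nano": "n", "small": "s", "medium": "m", "large": "l", "xlarge": "x"}
TASK_SUFFIXES = {"detect": "", "segment": "-seg", "classify": "-cls", "pose": "-pose"}


def model_filename(task: str, size: str) -> str:
    """Return the weight filename for a task/size pair, e.g. 'yolo11n-pose.pt'."""
    if task not in TASK_SUFFIXES:
        raise ValueError(f"unknown task: {task!r}")
    if size not in SIZE_CODES:
        raise ValueError(f"unknown size: {size!r}")
    return f"yolo11{SIZE_CODES[size]}{TASK_SUFFIXES[task]}.pt"


if __name__ == "__main__":
    print(model_filename("detect", "nano"))
    print(model_filename("segment", "small"))
```

Larger sizes generally trade speed for accuracy, so the nano variants are the usual starting point on modest GPUs.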
- Python 3.8+
- FFmpeg for video processing. Ensure FFmpeg is installed and accessible in your system's PATH.
- CUDA-compatible GPU (recommended)
- See requirements.txt for Python dependencies
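The requirements above can be checked before the first run with a short script. This is a minimal sketch: the function name `check_environment` is hypothetical, and it assumes PyTorch is among the dependencies in requirements.txt (it reports `None` for CUDA if `torch` cannot be imported).

```python
import shutil
import sys


def check_environment() -> dict:
    """Collect a small report on the prerequisites listed above."""
    report = {
        # Python 3.8 or newer
        "python_ok": sys.version_info >= (3, 8),
        # Full path to the ffmpeg executable, or None if it is not on PATH
        "ffmpeg_path": shutil.which("ffmpeg"),
    }
    try:
        # Assumption: PyTorch is installed via requirements.txt
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["cuda_available"] = None  # PyTorch not installed, CUDA status unknown
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

A missing `ffmpeg_path` means FFmpeg still needs to be installed or added to PATH; a `cuda_available` value of `False` means inference would fall back to the CPU.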
streamer = "username"  # Twitch username
desired_quality = "360p"  # one of '160p', '360p', '480p', '720p60', '1080p60'

# Resolve the low-latency HLS URL for the chosen stream and quality
hls_url = get_low_latency_stream(streamer, quality=desired_quality)

detector = VideoDetector(
    task="detect",  # or "segment", "classify", "pose"
    size="nano",  # or "small", "medium", "large", "xlarge"
    input_path=hls_url,
    quality=desired_quality,
    class_name="person",
    conf=0.7,  # confidence threshold for detection
    enable_audio=True,
)
detector.process_video()
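To make the `class_name` and `conf` arguments concrete, here is a minimal sketch of the kind of filtering they suggest: keep only detections of the requested class whose confidence is at or above the threshold. The `Detection` type and `filter_detections` function are hypothetical and written in plain Python; how `VideoDetector` applies these settings internally is not shown in this document.

```python
from typing import List, NamedTuple, Optional


class Detection(NamedTuple):
    """Hypothetical container for one detected object."""
    label: str
    confidence: float


def filter_detections(
    detections: List[Detection],
    class_name: Optional[str],
    conf: float,
) -> List[Detection]:
    """Keep detections at or above `conf`, optionally restricted to one class."""
    return [
        d
        for d in detections
        if d.confidence >= conf and (class_name is None or d.label == class_name)
    ]


if __name__ == "__main__":
    frame_detections = [
        Detection("person", 0.91),
        Detection("person", 0.55),
        Detection("dog", 0.88),
    ]
    print(filter_detections(frame_detections, class_name="person", conf=0.7))
```

A higher threshold such as 0.7 suppresses uncertain boxes at the cost of missing some true objects; lowering it does the opposite.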