Enterprise-grade hardware-accelerated machine learning inference with IPFS network-based distribution
- Overview
- Installation
- Quick Start
- MCP++ Server
- Architecture
- Supported Hardware
- Supported Models
- Documentation
- IPFS & Distributed Features
- Performance & Optimization
- Troubleshooting
- Testing & Quality
- Contributing
- License
IPFS Accelerate Python combines cutting-edge hardware acceleration, distributed computing, and IPFS network integration to deliver blazing-fast machine learning inference across multiple platforms and devices - from data centers to browsers.
- **8+ Hardware Platforms** - CPU, CUDA, ROCm, OpenVINO, Apple MPS, WebNN, WebGPU, Qualcomm
- **Distributed by Design** - IPFS content addressing, P2P inference, global caching
- **300+ Models** - Full HuggingFace compatibility + custom architectures
- **Canonical MCP++ Server** - The unified `ipfs_accelerate_py.mcp_server` runtime is now the default startup path
- **Browser-Native** - WebNN & WebGPU for client-side acceleration
- **Production Ready** - Real-time monitoring, enterprise security, compliance validation
- **High Performance** - Intelligent caching, batch processing, model optimization
```bash
# 1. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 2. Install IPFS Accelerate
pip install -U pip setuptools wheel
pip install ipfs-accelerate-py

# 3. Verify installation
python -c "from ipfs_accelerate_py import IPFSAccelerator; print('✅ Ready!')"
```

By default, pip may install a CPU-only PyTorch wheel from PyPI (e.g. `torch==...+cpu`) because the CUDA-enabled wheels are published on PyTorch's wheel indexes. If you have an NVIDIA GPU and want to ensure CUDA is available in PyTorch, install PyTorch from the CUDA wheel index:

```bash
python -m pip install -U pip
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt
python -c "import torch; print('torch=', torch.__version__); print('cuda_available=', torch.cuda.is_available()); print('torch_cuda=', torch.version.cuda)"
```

If you're on an NVIDIA GB10 / DGX Spark-class system (CUDA capability 12.1, CUDA 13.0), stable builds may warn that your GPU is unsupported. In that case, use the CUDA 13.0 nightly wheels:
```bash
./scripts/install_torch_cuda_cu130_nightly.sh
```

If you're installing from source/editable mode, you can also run:

```bash
python -m pip install -e . --no-deps
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt
python -m pip install -r requirements.txt
```

Choose the profile that matches your needs:
| Profile | Use Case | Installation |
|---|---|---|
| Core | Basic inference | `pip install ipfs-accelerate-py` |
| Full | Models + API server | `pip install ipfs-accelerate-py[full]` |
| MCP | MCP server extras | `pip install ipfs-accelerate-py[mcp]` |
| Dev | Development setup | `pip install -e .` |
Detailed instructions: Installation Guide | Troubleshooting | Getting Started
```python
from ipfs_accelerate_py import IPFSAccelerator

# Initialize with automatic hardware detection
accelerator = IPFSAccelerator()

# Load any HuggingFace model
model = accelerator.load_model("bert-base-uncased")

# Run inference (automatically optimized for your hardware)
result = model.inference("Hello, world!")
print(result)
```

```bash
# Start the default MCP++ server for automation
ipfs-accelerate mcp start

# Run the canonical FastAPI MCP service directly
python -m ipfs_accelerate_py.mcp_server.fastapi_service

# Run the direct MCP server CLI with p2p/task options
python -m ipfs_accelerate_py.mcp.cli --host 0.0.0.0 --port 9000

# Run inference directly
ipfs-accelerate inference generate \
  --model bert-base-uncased \
  --input "Hello, world!"

# List available models and hardware
ipfs-accelerate models list
ipfs-accelerate hardware status

# Start GitHub Actions autoscaler
ipfs-accelerate github autoscaler
```

If you want a remote machine running the ipfs_accelerate_py MCP server to also pick up libp2p task submissions coming from ipfs_datasets_py, you can start the MCP server CLI with the built-in P2P task worker:
```bash
# Remote machine (runs MCP + worker + libp2p TaskQueue service)
python -m ipfs_accelerate_py.mcp.cli \
  --host 0.0.0.0 --port 9000 \
  --p2p-task-worker --p2p-service --p2p-listen-port 9710 \
  --p2p-queue ~/.cache/ipfs_datasets_py/task_queue.duckdb

# Optional (off-host clients): set the public IP that will be embedded in the announced multiaddr
export IPFS_DATASETS_PY_TASK_P2P_PUBLIC_IP="YOUR_PUBLIC_IP"
```

By default, the libp2p TaskQueue service writes an announce file into your XDG cache dir, and clients will try to use it automatically:

- Default announce file: `~/.cache/ipfs_accelerate_py/task_p2p_announce.json`
- Disable announce file (opt-out): `IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE=0` (or `IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE=0`)

If your client machine can read that announce file (same host/user, or a shared filesystem path you set via `IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE` / `IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE`), you do not need to set any remote multiaddr env vars. Otherwise, the process also prints a `multiaddr=...` line. On the client machine, set:

```bash
export IPFS_DATASETS_PY_TASK_P2P_REMOTE_MULTIADDR="/ip4/.../tcp/9710/p2p/..."
```

Notes:

- This mode requires `ipfs_datasets_py` to be installed on the remote machine (and libp2p installed via `ipfs_datasets_py[p2p]`).
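On a client, the announce-file discovery above can be sketched in a few lines. This is an illustrative helper, not part of the library's API, and the `multiaddr` field name is an assumption - inspect the actual `task_p2p_announce.json` on your machine for the real schema:

```python
import json
import os
from pathlib import Path
from typing import Optional


def read_announce_multiaddr(path: str) -> Optional[str]:
    """Return the announced multiaddr from an announce file, or None.

    NOTE: the `multiaddr` key is a hypothetical field name used for
    illustration; check the real announce file for its actual schema.
    """
    announce = Path(path).expanduser()
    if not announce.is_file():
        return None
    data = json.loads(announce.read_text())
    return data.get("multiaddr")


# Prefer the announce file; fall back to the documented env var
addr = (read_announce_multiaddr("~/.cache/ipfs_accelerate_py/task_p2p_announce.json")
        or os.environ.get("IPFS_DATASETS_PY_TASK_P2P_REMOTE_MULTIADDR"))
```

This mirrors the lookup order described above: a readable announce file wins, otherwise the client falls back to the `IPFS_DATASETS_PY_TASK_P2P_REMOTE_MULTIADDR` environment variable.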
| Example | Description | Complexity |
|---|---|---|
| Basic Usage | Simple inference with BERT | Beginner |
| Hardware Selection | Choose specific accelerator | Intermediate |
| Distributed Inference | P2P model sharing | Advanced |
| Browser Integration | WebNN/WebGPU in browsers | Advanced |
More examples: examples/ | Quick Start Guide
The MCP server in this repository has completed its unification cutover.
- Canonical runtime: `ipfs_accelerate_py/mcp_server`
- Compatibility facade: `ipfs_accelerate_py/mcp`
- Current default: `create_mcp_server()` and the main MCP startup paths now select the unified runtime by default
- Cutover status: approved and frozen with a focused release-candidate matrix of 120 passed
| Entry point | Best for | Notes |
|---|---|---|
| `ipfs-accelerate mcp start` | End-user server startup | Main product CLI for MCP server management and dashboard workflows |
| `python -m ipfs_accelerate_py.mcp.cli` | Direct server/process control | Starts the MCP server and can also host TaskQueue/libp2p worker services |
| `python -m ipfs_accelerate_py.mcp_server.fastapi_service` | Standalone HTTP/FastAPI hosting | Reads `IPFS_MCP_*` env vars and mounts the MCP app at `/mcp` by default |
| `from ipfs_accelerate_py.mcp_server import create_server` | Programmatic embedding | Stable import target for the canonical runtime package |
The unified runtime currently advertises these additive MCP++ profiles:
- `mcp++/profile-a-idl`
- `mcp++/profile-b-cid-artifacts`
- `mcp++/profile-c-ucan`
- `mcp++/profile-d-temporal-policy`
- `mcp++/profile-e-mcp-p2p`

- Meta-tools: `tools_list_categories`, `tools_list_tools`, `tools_get_schema`, `tools_dispatch`, `tools_runtime_metrics`
- Migrated native categories: `ipfs`, `workflow`, `p2p`
- Security and governance: UCAN validation, temporal/deontic policy evaluation, policy audit logging, secrets vault support, and risk scoring/frontier execution
- Observability: runtime metrics, audit-to-metrics bridging, OpenTelemetry hooks, and Prometheus exporter support
- Transport coverage: compatibility-tested process helpers, FastAPI mounting, and MCP+p2p handler parity with mixed-version negotiation hardening
These controls remain available for validation and operational rollback:
- `IPFS_MCP_FORCE_LEGACY_ROLLBACK=1` → force the compatibility facade to stay on the legacy wrapper
- `IPFS_MCP_UNIFIED_CUTOVER_DRY_RUN=1` → validate the unified startup path while keeping legacy runtime behavior active
- `IPFS_MCP_ENABLE_UNIFIED_BRIDGE=1` → explicitly request the unified bridge on compatibility-facade paths
- Canonical MCP server README
- MCP Cutover Checklist
- MCP Server Unification Plan
- MCP++ Conformance Checklist
- MCP++ Spec Gap Matrix
IPFS Accelerate Python is built on a modular, enterprise-grade architecture:
```
┌─────────────────────────────────────────────────────────┐
│                    Application Layer                    │
│     Python API • CLI • MCP Server • Web Dashboard       │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│               Hardware Abstraction Layer                │
│     Unified interface across 8+ hardware platforms      │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│                   Inference Backends                    │
│   CPU • CUDA • ROCm • MPS • OpenVINO • WebNN • WebGPU   │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│                   IPFS Network Layer                    │
│     Content addressing • P2P • Distributed caching      │
└─────────────────────────────────────────────────────────┘
```
- Hardware Abstraction: Unified API across 8+ platforms with automatic selection
- IPFS Integration: Content-addressed storage, P2P distribution, intelligent caching
- Performance Modeling: ML-powered optimization and resource management
- MCP Server: Canonical `ipfs_accelerate_py.mcp_server` MCP++ runtime with compatibility facade and cutover controls
- Monitoring: Real-time metrics, profiling, and analytics

Detailed architecture: docs/architecture/overview.md | CI/CD
Run anywhere - from powerful servers to edge devices and browsers:
| Platform | Status | Acceleration | Requirements | Performance |
|---|---|---|---|---|
| CPU (x86/ARM) | ✅ | SIMD, AVX | Any | Good |
| NVIDIA CUDA | ✅ | GPU + TensorRT | CUDA 11.8+ | Excellent |
| AMD ROCm | ✅ | GPU + HIP | ROCm 5.0+ | Excellent |
| Apple MPS | ✅ | Metal | M1/M2/M3 | Excellent |
| Intel OpenVINO | ✅ | CPU/GPU | Intel HW | Very Good |
| WebNN | ✅ | Browser NPU | Chrome, Edge | Good |
| WebGPU | ✅ | Browser GPU | Modern browsers | Very Good |
| Qualcomm | ✅ | Mobile DSP | Snapdragon | Good |
The framework automatically detects and selects the best available hardware:
```python
# Automatic (recommended)
accelerator = IPFSAccelerator()  # Uses best available

# Manual selection
accelerator = IPFSAccelerator(device="cuda")  # Force CUDA
accelerator = IPFSAccelerator(device="mps")   # Force Apple MPS
```

⚙️ Hardware guides: Hardware Optimization | Platform Support
| Category | Models | Status |
|---|---|---|
| Text | BERT, RoBERTa, DistilBERT, ALBERT, GPT-2/Neo/J, T5, BART, Pegasus, Sentence Transformers | ✅ |
| Vision | ViT, DeiT, BEiT, ResNet, EfficientNet, DETR, YOLO | ✅ |
| Audio | Whisper, Wav2Vec2, WavLM, Audio Transformers | ✅ |
| Multimodal | CLIP, BLIP, LLaVA | ✅ |
| Custom | PyTorch models, ONNX, TensorFlow (converted) | ✅ |
```python
# From HuggingFace Hub
model = accelerator.load_model("bert-base-uncased")

# From IPFS (content-addressed)
model = accelerator.load_model("ipfs://QmXxxx...")

# Local model
model = accelerator.load_model("./my_model/")

# With specific hardware
model = accelerator.load_model("gpt2", device="cuda")
```

Full model list: Supported Models | Custom Models Guide
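The `ipfs://QmXxxx...` form above is content addressing: the identifier is derived from the model bytes themselves, so identical bytes always resolve to the same ID and any tampering changes the ID. A toy illustration of the principle only - real IPFS CIDs use multihash/multibase encoding, not this scheme:

```python
import hashlib


def toy_content_id(data: bytes) -> str:
    """Derive an identifier from content bytes (simplified stand-in for a CID)."""
    return "toy-" + hashlib.sha256(data).hexdigest()[:16]


weights_v1 = b"model weights v1"
weights_v2 = b"model weights v2"

# Same bytes -> same ID (safe to fetch from any peer or cache);
# different bytes -> different ID (tampering is detectable)
assert toy_content_id(weights_v1) == toy_content_id(b"model weights v1")
assert toy_content_id(weights_v1) != toy_content_id(weights_v2)
```

This is why a content-addressed model can be cached and served by untrusted peers: the client re-hashes what it received and rejects anything that does not match the requested ID.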
| Guide | Description | Audience |
|---|---|---|
| Getting Started | Complete beginner tutorial | Everyone |
| Quick Start | Get running in 5 minutes | Everyone |
| Installation | Detailed setup instructions | Users |
| FAQ | Common questions & answers | Everyone |
| API Reference | Complete API documentation | Developers |
| Architecture | System design & components | Architects |
| Hardware Optimization | Platform-specific tuning | Engineers |
| Testing Guide | Testing & benchmarking | QA/DevOps |
| Topic | Resources |
|---|---|
| IPFS & P2P | IPFS Integration • P2P Networking |
| GitHub Actions | Autoscaler • CI/CD |
| Docker & K8s | Container Guide • Deployment |
| MCP Server | Canonical MCP Server README • MCP Setup • Protocol Docs • Cutover Checklist |
| Browser Support | WebNN/WebGPU • Examples |
Our documentation has been professionally audited (January 2026):
- ✅ 200+ files covering all features
- ✅ 93/100 quality score (Excellent)
- ✅ Comprehensive - From beginner to expert
- ✅ Well-organized - Clear structure and navigation
- ✅ Verified - All examples tested and working

Documentation Hub: docs/ | Full Index | Audit Report
IPFS integration provides enterprise-grade distributed computing:
- **Content Addressing** - Cryptographically secure, immutable model distribution
- **Global Network** - Automatic peer discovery and geographic optimization
- **Intelligent Caching** - Multi-level LRU caching across the network
- **Load Balancing** - Automatic distribution across available peers
- **Fault Tolerance** - Robust error handling and fallback mechanisms
The IPFS Backend Router provides a flexible, pluggable backend system with automatic fallback:
Backend Preference Order:
1. ipfs_kit_py - Full distributed storage (preferred)
2. HuggingFace Cache - Local storage with IPFS addressing
3. Kubo CLI - Standard IPFS daemon
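The fallback behavior can be pictured as a chain that probes each backend in preference order and uses the first one that responds. This is an illustrative sketch, not the router's actual code; the probe callables here are hypothetical placeholders:

```python
from typing import Callable, List, Optional, Tuple


def first_available(backends: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Return the name of the first backend whose availability probe succeeds."""
    for name, probe in backends:
        try:
            if probe():
                return name
        except Exception:
            continue  # a failing probe counts as "unavailable"; fall through
    return None


# Hypothetical probes mirroring the preference order above
backends = [
    ("ipfs_kit_py", lambda: False),  # pretend the full backend is not installed
    ("hf_cache",    lambda: True),   # local HuggingFace cache is present
    ("kubo",        lambda: True),   # IPFS daemon is running
]
print(first_available(backends))  # prints "hf_cache"
```

The key property is that probe errors are swallowed rather than propagated, so a broken or missing backend silently yields to the next one in the chain.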
```python
from ipfs_accelerate_py import ipfs_backend_router

# Store model weights to IPFS
cid = ipfs_backend_router.add_path("/path/to/model", pin=True)
print(f"Model CID: {cid}")

# Retrieve from anywhere
ipfs_backend_router.get_to_path(cid, output_path="/cache/model")
```

Configuration:

```bash
# Prefer ipfs_kit_py (default)
export ENABLE_IPFS_KIT=true

# Use HF cache only (good for CI/CD)
export IPFS_BACKEND=hf_cache

# Force Kubo CLI
export IPFS_BACKEND=kubo
```

Full documentation: IPFS Backend Router Guide
```python
# Enable P2P inference
accelerator = IPFSAccelerator(enable_p2p=True)

# Model is automatically shared across peers
model = accelerator.load_model("bert-base-uncased")

# Inference uses best available peer
result = model.inference("Distributed AI!")
```

| Feature | Description | Status |
|---|---|---|
| P2P Workflow Scheduler | Distributed task execution with merkle clocks | ✅ |
| GitHub Actions Cache | Distributed cache for CI/CD | ✅ |
| Autoscaler | Dynamic runner provisioning | ✅ |
| MCP Server | Model Context Protocol (14+ tools) | ✅ |
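A merkle clock, as referenced in the scheduler row above, orders distributed events by having each event hash the events it follows, so peers can merge histories without a central coordinator. This toy sketch illustrates the idea only and is not the project's scheduler implementation:

```python
import hashlib
import json
from typing import List


def event_id(payload: str, parents: List[str]) -> str:
    """Content-address an event together with its parent event hashes."""
    body = json.dumps({"payload": payload, "parents": sorted(parents)})
    return hashlib.sha256(body.encode()).hexdigest()


# Two peers extend a shared root concurrently
root = event_id("init", [])
a = event_id("peer-a task", [root])
b = event_id("peer-b task", [root])

# Merging the two clocks is just a set union of their heads
merged_heads = sorted({a, b})

# The next event references both heads, converging the history
merge_event = event_id("merge", merged_heads)
```

Because every event ID commits to its ancestry, two peers that arrive at the same heads have provably seen the same history, which is what makes the structure safe to gossip over a P2P network.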
Learn more: IPFS Guide | P2P Architecture | Network Setup
```bash
# Run all tests
pytest

# Run specific test suite
pytest test/test_inference.py

# Run with coverage report
pytest --cov=ipfs_accelerate_py --cov-report=html

# Run benchmarks
python data/benchmarks/run_benchmarks.py
```

| Metric | Status | Details |
|---|---|---|
| Test Coverage | ✅ | Comprehensive test suite |
| Documentation | ✅ 93/100 | Audit Report |
| Code Quality | ✅ | Linted, type-checked |
| Security | ✅ | Regular vulnerability scans |
| Performance | ✅ | Benchmarked across platforms |

Testing guide: docs/guides/testing/TESTING_README.md | CI/CD Setup
| Hardware | Model | Throughput | Latency |
|---|---|---|---|
| NVIDIA RTX 3090 | BERT-base | ~2000 samples/sec | <1ms |
| Apple M2 Max | BERT-base | ~800 samples/sec | 2-3ms |
| Intel i9 (CPU) | BERT-base | ~100 samples/sec | 10-15ms |
| WebGPU (Browser) | BERT-base | ~50 samples/sec | 20-30ms |
```python
# Enable mixed precision for 2x speedup
accelerator = IPFSAccelerator(precision="fp16")

# Use batch processing for better throughput
results = model.batch_inference(inputs, batch_size=32)

# Enable model quantization for 4x memory reduction
model = accelerator.load_model("bert-base-uncased", quantize=True)

# Use intelligent caching for repeated queries
accelerator = IPFSAccelerator(enable_cache=True)
```

Performance guide: Hardware Optimization | Benchmarking
| Issue | Solution |
|---|---|
| Import errors | pip install --upgrade ipfs-accelerate-py |
| CUDA not found | Install CUDA Toolkit 11.8+ |
| Slow inference | Check hardware selection, enable caching |
| Memory errors | Use quantization, reduce batch size |
| Connection issues | Check IPFS daemon, firewall settings |
```bash
# Verify installation
python -c "import ipfs_accelerate_py; print(ipfs_accelerate_py.__version__)"

# Check hardware detection
ipfs-accelerate hardware status

# Test basic inference
ipfs-accelerate inference test

# View logs
ipfs-accelerate logs --tail 100
```

Get help: Troubleshooting Guide | FAQ | GitHub Issues
We welcome contributions! Here's how to get started:
- Fork & Clone: Get your own copy of the repository
- Create Branch: `git checkout -b feature/your-feature`
- Make Changes: Follow our coding standards
- Run Tests: `pytest` to ensure everything works
- Submit PR: Open a pull request with a clear description
- **Bug Reports** - Found an issue? Let us know!
- **Documentation** - Help improve guides and examples
- **Testing** - Add tests for edge cases
- **Translations** - Translate docs to other languages
- **Features** - Suggest or implement new features
- **GitHub Discussions** - Ask questions, share ideas
- **Issue Tracker** - Report bugs, request features
- **Security Policy** - Report security vulnerabilities
- **Email**: starworks5@gmail.com

Full guides: CONTRIBUTING.md | Code of Conduct | Security Policy
This project is licensed under the GNU Affero General Public License v3.0 or later (AGPLv3+).
What this means:
- ✅ Free to use, modify, and distribute
- ✅ Commercial use allowed
- ✅ Patent protection included
- ⚠️ Source code must be disclosed for network services
- ⚠️ Modifications must use the same license

Details: LICENSE | AGPL FAQ
Built with amazing open source technologies:
- HuggingFace Transformers - ML model ecosystem
- IPFS - Distributed file system
- PyTorch - Deep learning framework
- FastAPI - Modern web framework
Special thanks to all contributors who make this project possible!
- Changelog - Version history and release notes
- Security Policy - Security reporting and best practices
- Contributing Guide - How to contribute
- License - AGPLv3+ license details
If you find this project useful:
- Star this repository on GitHub
- Share with your network
- Report issues to help improve it
- Contribute features or fixes
- Write about your experience

Made with ❤️ by Benjamin Barber and contributors

Homepage • Documentation • Issues • Discussions