
IPFS Accelerate Python

Enterprise-grade hardware-accelerated machine learning inference with IPFS network-based distribution

PyPI version License: AGPL v3 Python 3.8+ Documentation Tests



🚀 Overview

IPFS Accelerate Python combines hardware acceleration, distributed computing, and IPFS network integration to deliver fast machine learning inference across multiple platforms and devices, from data centers to browsers.

⚡ Key Highlights

  • 🔥 8+ Hardware Platforms - CPU, CUDA, ROCm, OpenVINO, Apple MPS, WebNN, WebGPU, Qualcomm
  • 🌐 Distributed by Design - IPFS content addressing, P2P inference, global caching
  • 🤖 300+ Models - Full HuggingFace compatibility + custom architectures
  • 🧠 Canonical MCP++ Server - The unified ipfs_accelerate_py.mcp_server runtime is now the default startup path
  • 🌍 Browser-Native - WebNN & WebGPU for client-side acceleration
  • 📊 Production Ready - Real-time monitoring, enterprise security, compliance validation
  • ⚡ High Performance - Intelligent caching, batch processing, model optimization

📦 Installation

Quick Start (5 minutes)

# 1. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 2. Install IPFS Accelerate
pip install -U pip setuptools wheel
pip install ipfs-accelerate-py

# 3. Verify installation
python -c "from ipfs_accelerate_py import IPFSAccelerator; print('✅ Ready!')"

NVIDIA CUDA (PyTorch)

By default, pip may install a CPU-only PyTorch wheel from PyPI (e.g. torch==...+cpu) because the CUDA-enabled wheels are published on PyTorch's wheel indexes.

If you have an NVIDIA GPU and want to ensure CUDA is available in PyTorch, install PyTorch from the CUDA wheel index:

python -m pip install -U pip
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt

python -c "import torch; print('torch=', torch.__version__); print('cuda_available=', torch.cuda.is_available()); print('torch_cuda=', torch.version.cuda)"

If you're on an NVIDIA GB10 / DGX Spark-class system (CUDA capability 12.1, CUDA 13.0), stable builds may warn that your GPU is unsupported. In that case, use the CUDA 13.0 nightly wheels:

./scripts/install_torch_cuda_cu130_nightly.sh

If you're installing from source/editable mode, you can also run:

python -m pip install -e . --no-deps
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt
python -m pip install -r requirements.txt

Installation Profiles

Choose the profile that matches your needs:

| Profile | Use Case | Installation |
|---------|----------|--------------|
| Core | Basic inference | pip install ipfs-accelerate-py |
| Full | Models + API server | pip install ipfs-accelerate-py[full] |
| MCP | MCP server extras | pip install ipfs-accelerate-py[mcp] |
| Dev | Development setup | pip install -e . |

📚 Detailed instructions: Installation Guide | Troubleshooting | Getting Started


🎯 Quick Start

Python API

from ipfs_accelerate_py import IPFSAccelerator

# Initialize with automatic hardware detection
accelerator = IPFSAccelerator()

# Load any HuggingFace model
model = accelerator.load_model("bert-base-uncased")

# Run inference (automatically optimized for your hardware)
result = model.inference("Hello, world!")
print(result)

Command Line Interface

# Start the default MCP++ server for automation
ipfs-accelerate mcp start

# Run the canonical FastAPI MCP service directly
python -m ipfs_accelerate_py.mcp_server.fastapi_service

# Run the direct MCP server CLI with p2p/task options
python -m ipfs_accelerate_py.mcp.cli --host 0.0.0.0 --port 9000

# Run inference directly
ipfs-accelerate inference generate \
  --model bert-base-uncased \
  --input "Hello, world!"

# List available models and hardware
ipfs-accelerate models list
ipfs-accelerate hardware status

# Start GitHub Actions autoscaler
ipfs-accelerate github autoscaler

Remote libp2p task pickup (ipfs_datasets_py)

If you want a remote machine running the ipfs_accelerate_py MCP server to also pick up libp2p task submissions coming from ipfs_datasets_py, you can start the MCP server CLI with the built-in P2P task worker:

# Remote machine (runs MCP + worker + libp2p TaskQueue service)
python -m ipfs_accelerate_py.mcp.cli \
  --host 0.0.0.0 --port 9000 \
  --p2p-task-worker --p2p-service --p2p-listen-port 9710 \
  --p2p-queue ~/.cache/ipfs_datasets_py/task_queue.duckdb

# Optional (off-host clients): set the public IP that will be embedded in the announced multiaddr
export IPFS_DATASETS_PY_TASK_P2P_PUBLIC_IP="YOUR_PUBLIC_IP"

By default, the libp2p TaskQueue service writes an announce file into your XDG cache dir and clients will try to use it automatically:

  • Default announce file: ~/.cache/ipfs_accelerate_py/task_p2p_announce.json
  • Disable announce file (opt-out): IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE=0 (or IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE=0)

If your client machine can read that announce file (same host/user, or a shared filesystem path you set via IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE / IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE), you do not need to set any remote multiaddr env vars.

Otherwise, the process also prints a multiaddr=... line. On the client machine, set:

export IPFS_DATASETS_PY_TASK_P2P_REMOTE_MULTIADDR="/ip4/.../tcp/9710/p2p/..."

Notes:

  • This mode requires ipfs_datasets_py to be installed on the remote machine (and libp2p installed via ipfs_datasets_py[p2p]).
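The multiaddr printed by the remote process follows the standard libp2p layout (/ip4/IP/tcp/PORT/p2p/PEER_ID). A minimal sketch of splitting one apart, e.g. to check which host and port a client will dial (the sample address is made up; real libp2p tooling handles many more protocols):

```python
# Illustrative only: split a libp2p-style multiaddr of the form
# /ip4/<ip>/tcp/<port>/p2p/<peer-id> into its components.
def parse_multiaddr(maddr: str) -> dict:
    parts = maddr.strip("/").split("/")
    # parts alternates protocol/value: ["ip4", ip, "tcp", port, "p2p", peer_id]
    fields = dict(zip(parts[0::2], parts[1::2]))
    return {
        "ip": fields.get("ip4") or fields.get("ip6"),
        "port": int(fields["tcp"]) if "tcp" in fields else None,
        "peer_id": fields.get("p2p"),
    }

info = parse_multiaddr("/ip4/203.0.113.7/tcp/9710/p2p/QmPeerExample")
print(info["ip"], info["port"], info["peer_id"])
```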

Real-World Examples

| Example | Description | Complexity |
|---------|-------------|------------|
| Basic Usage | Simple inference with BERT | Beginner |
| Hardware Selection | Choose specific accelerator | Intermediate |
| Distributed Inference | P2P model sharing | Advanced |
| Browser Integration | WebNN/WebGPU in browsers | Advanced |

📖 More examples: examples/ | Quick Start Guide


🧠 MCP++ Server

The MCP server in this repository has completed its unification cutover.

  • Canonical runtime: ipfs_accelerate_py/mcp_server
  • Compatibility facade: ipfs_accelerate_py/mcp
  • Current default: create_mcp_server() and the main MCP startup paths now select the unified runtime by default
  • Cutover status: approved and frozen, with a focused release-candidate matrix of 120 passing tests

Current entrypoints

| Entry point | Best for | Notes |
|-------------|----------|-------|
| ipfs-accelerate mcp start | End-user server startup | Main product CLI for MCP server management and dashboard workflows |
| python -m ipfs_accelerate_py.mcp.cli | Direct server/process control | Starts the MCP server and can also host TaskQueue/libp2p worker services |
| python -m ipfs_accelerate_py.mcp_server.fastapi_service | Standalone HTTP/FastAPI hosting | Reads IPFS_MCP_* env vars and mounts the MCP app at /mcp by default |
| from ipfs_accelerate_py.mcp_server import create_server | Programmatic embedding | Stable import target for the canonical runtime package |

Supported MCP++ profile chapters

The unified runtime currently advertises these additive MCP++ profiles:

  • mcp++/profile-a-idl
  • mcp++/profile-b-cid-artifacts
  • mcp++/profile-c-ucan
  • mcp++/profile-d-temporal-policy
  • mcp++/profile-e-mcp-p2p

Unified control-plane features

  • Meta-tools: tools_list_categories, tools_list_tools, tools_get_schema, tools_dispatch, tools_runtime_metrics
  • Migrated native categories: ipfs, workflow, p2p
  • Security and governance: UCAN validation, temporal/deontic policy evaluation, policy audit logging, secrets vault support, and risk scoring/frontier execution
  • Observability: runtime metrics, audit-to-metrics bridging, OpenTelemetry hooks, and Prometheus exporter support
  • Transport coverage: compatibility-tested process helpers, FastAPI mounting, and MCP+p2p handler parity with mixed-version negotiation hardening
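The meta-tool names above are advertised by the unified runtime; the request shape below (category/tool/arguments) is a hypothetical sketch of how a tools_dispatch call might be assembled, not the actual wire format:

```python
# Tool name "tools_dispatch" comes from the meta-tools list above; the
# payload layout here is an assumption for illustration only.
def make_dispatch_request(category: str, tool: str, arguments: dict) -> dict:
    return {
        "tool": "tools_dispatch",
        "params": {"category": category, "tool": tool, "arguments": arguments},
    }

req = make_dispatch_request("ipfs", "add_path", {"path": "/tmp/model", "pin": True})
print(req["params"]["category"])
```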

Cutover and rollback controls

These controls remain available for validation and operational rollback:

  • IPFS_MCP_FORCE_LEGACY_ROLLBACK=1 β€” force the compatibility facade to stay on the legacy wrapper
  • IPFS_MCP_UNIFIED_CUTOVER_DRY_RUN=1 β€” validate the unified startup path while keeping legacy runtime behavior active
  • IPFS_MCP_ENABLE_UNIFIED_BRIDGE=1 β€” explicitly request the unified bridge on compatibility-facade paths
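The flags above compose into a simple precedence: rollback wins, then dry-run, otherwise the unified runtime is used. The sketch below is illustrative decision logic only; the real selection code lives inside ipfs_accelerate_py and may differ:

```python
import os

# Illustrative precedence for the cutover flags described above
# (assumption: rollback overrides dry-run, which overrides the default).
def select_runtime(env=None) -> str:
    env = os.environ if env is None else env
    if env.get("IPFS_MCP_FORCE_LEGACY_ROLLBACK") == "1":
        return "legacy"
    if env.get("IPFS_MCP_UNIFIED_CUTOVER_DRY_RUN") == "1":
        return "legacy (unified startup validated, not activated)"
    return "unified"

print(select_runtime({}))                                         # default path
print(select_runtime({"IPFS_MCP_FORCE_LEGACY_ROLLBACK": "1"}))    # forced rollback
```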



πŸ—οΈ Architecture

IPFS Accelerate Python is built on a modular, enterprise-grade architecture:

┌─────────────────────────────────────────────────────────┐
│                   Application Layer                     │
│  Python API • CLI • MCP Server • Web Dashboard          │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│              Hardware Abstraction Layer                 │
│  Unified interface across 8+ hardware platforms         │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│                Inference Backends                       │
│  CPU • CUDA • ROCm • MPS • OpenVINO • WebNN • WebGPU    │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────┐
│              IPFS Network Layer                         │
│  Content addressing • P2P • Distributed caching         │
└─────────────────────────────────────────────────────────┘

Core Components

  • Hardware Abstraction: Unified API across 8+ platforms with automatic selection
  • IPFS Integration: Content-addressed storage, P2P distribution, intelligent caching
  • Performance Modeling: ML-powered optimization and resource management
  • MCP Server: Canonical ipfs_accelerate_py.mcp_server MCP++ runtime with compatibility facade and cutover controls
  • Monitoring: Real-time metrics, profiling, and analytics

πŸ“ Detailed architecture: docs/architecture/overview.md | CI/CD


🔧 Supported Hardware

Run anywhere - from powerful servers to edge devices and browsers:

| Platform | Status | Acceleration | Requirements | Performance |
|----------|--------|--------------|--------------|-------------|
| CPU (x86/ARM) | ✅ | SIMD, AVX | Any | Good |
| NVIDIA CUDA | ✅ | GPU + TensorRT | CUDA 11.8+ | Excellent |
| AMD ROCm | ✅ | GPU + HIP | ROCm 5.0+ | Excellent |
| Apple MPS | ✅ | Metal | M1/M2/M3 | Excellent |
| Intel OpenVINO | ✅ | CPU/GPU | Intel HW | Very Good |
| WebNN | ✅ | Browser NPU | Chrome, Edge | Good |
| WebGPU | ✅ | Browser GPU | Modern browsers | Very Good |
| Qualcomm | ✅ | Mobile DSP | Snapdragon | Good |

Hardware Selection

The framework automatically detects and selects the best available hardware:

# Automatic (recommended)
accelerator = IPFSAccelerator()  # Uses best available

# Manual selection
accelerator = IPFSAccelerator(device="cuda")  # Force CUDA
accelerator = IPFSAccelerator(device="mps")   # Force Apple MPS

βš™οΈ Hardware guides: Hardware Optimization | Platform Support


🤖 Supported Models

Pre-trained Models (300+)

| Category | Models | Status |
|----------|--------|--------|
| Text | BERT, RoBERTa, DistilBERT, ALBERT, GPT-2/Neo/J, T5, BART, Pegasus, Sentence Transformers | ✅ |
| Vision | ViT, DeiT, BEiT, ResNet, EfficientNet, DETR, YOLO | ✅ |
| Audio | Whisper, Wav2Vec2, WavLM, Audio Transformers | ✅ |
| Multimodal | CLIP, BLIP, LLaVA | ✅ |
| Custom | PyTorch models, ONNX, TensorFlow (converted) | ✅ |

Model Loading

# From HuggingFace Hub
model = accelerator.load_model("bert-base-uncased")

# From IPFS (content-addressed)
model = accelerator.load_model("ipfs://QmXxxx...")

# Local model
model = accelerator.load_model("./my_model/")

# With specific hardware
model = accelerator.load_model("gpt2", device="cuda")

🤖 Full model list: Supported Models | Custom Models Guide


📚 Documentation

📖 Essential Guides

| Guide | Description | Audience |
|-------|-------------|----------|
| Getting Started | Complete beginner tutorial | Everyone |
| Quick Start | Get running in 5 minutes | Everyone |
| Installation | Detailed setup instructions | Users |
| FAQ | Common questions & answers | Everyone |
| API Reference | Complete API documentation | Developers |
| Architecture | System design & components | Architects |
| Hardware Optimization | Platform-specific tuning | Engineers |
| Testing Guide | Testing & benchmarking | QA/DevOps |

🎯 Specialized Topics

| Topic | Resources |
|-------|-----------|
| IPFS & P2P | IPFS Integration • P2P Networking |
| GitHub Actions | Autoscaler • CI/CD |
| Docker & K8s | Container Guide • Deployment |
| MCP Server | Canonical MCP Server README • MCP Setup • Protocol Docs • Cutover Checklist |
| Browser Support | WebNN/WebGPU • Examples |

📊 Documentation Quality

Our documentation has been professionally audited (January 2026):

  • ✅ 200+ files covering all features
  • ✅ 93/100 quality score (Excellent)
  • ✅ Comprehensive - From beginner to expert
  • ✅ Well-organized - Clear structure and navigation
  • ✅ Verified - All examples tested and working

📋 Documentation Hub: docs/ | Full Index | Audit Report


🌐 IPFS & Distributed Features

Why IPFS?

IPFS integration provides enterprise-grade distributed computing:

  • πŸ” Content Addressing - Cryptographically secure, immutable model distribution
  • 🌍 Global Network - Automatic peer discovery and geographic optimization
  • ⚑ Intelligent Caching - Multi-level LRU caching across the network
  • πŸ”„ Load Balancing - Automatic distribution across available peers
  • πŸ›‘οΈ Fault Tolerance - Robust error handling and fallback mechanisms
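Content addressing is the property that makes the rest possible: an object's address is derived from its bytes, so identical model weights resolve to the same ID no matter which peer serves them. A simplified stand-in (real IPFS CIDs use multihash and multibase encoding, not bare sha256 hex):

```python
import hashlib

# Simplified stand-in for a CID: address = hash of content.
# Identical bytes always yield the identical address, wherever stored.
def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

a = content_address(b"model-weights-v1")
b = content_address(b"model-weights-v1")
assert a == b                       # same content -> same address
assert a != content_address(b"model-weights-v2")  # any change -> new address
```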

IPFS Backend Router (New! ⭐)

The IPFS Backend Router provides a flexible, pluggable backend system with automatic fallback:

Backend Preference Order:

  1. ipfs_kit_py - Full distributed storage (preferred)
  2. HuggingFace Cache - Local storage with IPFS addressing
  3. Kubo CLI - Standard IPFS daemon
from ipfs_accelerate_py import ipfs_backend_router

# Store model weights to IPFS
cid = ipfs_backend_router.add_path("/path/to/model", pin=True)
print(f"Model CID: {cid}")

# Retrieve from anywhere
ipfs_backend_router.get_to_path(cid, output_path="/cache/model")

Configuration:

# Prefer ipfs_kit_py (default)
export ENABLE_IPFS_KIT=true

# Use HF cache only (good for CI/CD)
export IPFS_BACKEND=hf_cache

# Force Kubo CLI
export IPFS_BACKEND=kubo
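The preference order and env vars above boil down to: an explicit IPFS_BACKEND wins, otherwise fall through the candidates in order. This is a hypothetical sketch of that logic, not the router's actual probing code:

```python
import os

# Hypothetical sketch of the fallback order described above; the router's
# real availability probing is internal to ipfs_accelerate_py.
def pick_backend(available: set, env=None) -> str:
    env = os.environ if env is None else env
    forced = env.get("IPFS_BACKEND")
    if forced:
        return forced  # explicit override, e.g. "hf_cache" or "kubo"
    for candidate in ("ipfs_kit_py", "hf_cache", "kubo"):
        if candidate in available:
            return candidate
    raise RuntimeError("no IPFS backend available")

print(pick_backend({"hf_cache", "kubo"}, env={}))  # ipfs_kit_py absent -> hf_cache
```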

📚 Full documentation: IPFS Backend Router Guide

Distributed Inference

# Enable P2P inference
accelerator = IPFSAccelerator(enable_p2p=True)

# Model is automatically shared across peers
model = accelerator.load_model("bert-base-uncased")

# Inference uses best available peer
result = model.inference("Distributed AI!")

Advanced Features

| Feature | Description | Status |
|---------|-------------|--------|
| P2P Workflow Scheduler | Distributed task execution with merkle clocks | ✅ |
| GitHub Actions Cache | Distributed cache for CI/CD | ✅ |
| Autoscaler | Dynamic runner provisioning | ✅ |
| MCP Server | Model Context Protocol (14+ tools) | ✅ |
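The merkle clocks mentioned in the table give peers a causally ordered, content-addressed event log: each event's ID hashes its payload together with the IDs of the current heads, so replicas that merge the same events converge on the same head set. A toy sketch of the idea (illustrative only; the scheduler's real implementation is more involved):

```python
import hashlib
import json

# Toy merkle-clock event: ID = hash(payload + sorted parent head IDs).
# Deterministic, so any replica with the same history derives the same ID.
def event_id(payload: str, heads: list) -> str:
    blob = json.dumps({"payload": payload, "heads": sorted(heads)}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

h0 = event_id("task-1", [])        # genesis event
h1 = event_id("task-2", [h0])      # child event referencing the current head
assert event_id("task-2", [h0]) == h1  # same history -> same ID
```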

🌐 Learn more: IPFS Guide | P2P Architecture | Network Setup


🧪 Testing & Quality

# Run all tests
pytest

# Run specific test suite
pytest test/test_inference.py

# Run with coverage report
pytest --cov=ipfs_accelerate_py --cov-report=html

# Run benchmarks
python data/benchmarks/run_benchmarks.py

Quality Metrics

| Metric | Status | Details |
|--------|--------|---------|
| Test Coverage | ✅ | Comprehensive test suite |
| Documentation | ✅ 93/100 | Audit Report |
| Code Quality | ✅ | Linted, type-checked |
| Security | ✅ | Regular vulnerability scans |
| Performance | ✅ | Benchmarked across platforms |

🧪 Testing guide: docs/guides/testing/TESTING_README.md | CI/CD Setup


⚡ Performance & Optimization

Benchmarks

| Hardware | Model | Throughput | Latency |
|----------|-------|------------|---------|
| NVIDIA RTX 3090 | BERT-base | ~2000 samples/sec | <1ms |
| Apple M2 Max | BERT-base | ~800 samples/sec | 2-3ms |
| Intel i9 (CPU) | BERT-base | ~100 samples/sec | 10-15ms |
| WebGPU (Browser) | BERT-base | ~50 samples/sec | 20-30ms |
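A throughput figure above can coexist with a sub-millisecond "latency" column because batched execution amortizes cost: throughput ≈ batch_size / batch_latency, and per-sample latency is batch_latency / batch_size. A back-of-envelope check (batch size and timing here are illustrative, not measured):

```python
# Batched-inference arithmetic: a 32-sample batch completing in 16 ms
# yields 2000 samples/sec and 0.5 ms amortized per-sample latency.
def throughput(batch_size: int, batch_latency_s: float) -> float:
    return batch_size / batch_latency_s

print(throughput(32, 0.016))            # 2000.0 samples/sec
print(0.016 / 32 * 1000)                # 0.5 ms per sample
```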

Optimization Tips

# Enable mixed precision for 2x speedup
accelerator = IPFSAccelerator(precision="fp16")

# Use batch processing for better throughput
results = model.batch_inference(inputs, batch_size=32)

# Enable model quantization for 4x memory reduction
model = accelerator.load_model("bert-base-uncased", quantize=True)

# Use intelligent caching for repeated queries
accelerator = IPFSAccelerator(enable_cache=True)

📊 Performance guide: Hardware Optimization | Benchmarking


🔧 Troubleshooting

Common Issues

| Issue | Solution |
|-------|----------|
| Import errors | pip install --upgrade ipfs-accelerate-py |
| CUDA not found | Install CUDA Toolkit 11.8+ |
| Slow inference | Check hardware selection, enable caching |
| Memory errors | Use quantization, reduce batch size |
| Connection issues | Check IPFS daemon, firewall settings |

Quick Fixes

# Verify installation
python -c "import ipfs_accelerate_py; print(ipfs_accelerate_py.__version__)"

# Check hardware detection
ipfs-accelerate hardware status

# Test basic inference
ipfs-accelerate inference test

# View logs
ipfs-accelerate logs --tail 100

🆘 Get help: Troubleshooting Guide | FAQ | GitHub Issues


🤝 Contributing

We welcome contributions! Here's how to get started:

Quick Contribution Guide

  1. Fork & Clone: Get your own copy of the repository
  2. Create Branch: git checkout -b feature/your-feature
  3. Make Changes: Follow our coding standards
  4. Run Tests: pytest to ensure everything works
  5. Submit PR: Open a pull request with clear description

Areas We Need Help

  • πŸ› Bug Reports - Found an issue? Let us know!
  • πŸ“š Documentation - Help improve guides and examples
  • πŸ§ͺ Testing - Add tests for edge cases
  • 🌍 Translations - Translate docs to other languages
  • πŸ’‘ Features - Suggest or implement new features

Community & Guidelines

📖 Full guides: CONTRIBUTING.md | Code of Conduct | Security Policy


📄 License

This project is licensed under the GNU Affero General Public License v3.0 or later (AGPLv3+).

What this means:

  • ✅ Free to use, modify, and distribute
  • ✅ Commercial use allowed
  • ✅ Patent protection included
  • ⚠️ Source code must be disclosed for network services
  • ⚠️ Modifications must use same license

📋 Details: LICENSE | AGPL FAQ


πŸ™ Acknowledgments

Built with amazing open source technologies:

Special thanks to all contributors who make this project possible! 🌟



🌟 Show Your Support

If you find this project useful:

  • ⭐ Star this repository on GitHub
  • 📢 Share with your network
  • 🐛 Report issues to help improve it
  • 💡 Contribute features or fixes
  • 📝 Write about your experience
