Turn your GPU into an AI inference endpoint and join the Swan Chain decentralized computing network.
No wallet needed. No blockchain registration. No public IP required.
```bash
# 0. Install build tools (skip if already installed)
sudo apt-get update && sudo apt-get install -y git make
wget https://go.dev/dl/go1.22.0.linux-amd64.tar.gz
sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.22.0.linux-amd64.tar.gz
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc && source ~/.bashrc

# 1. Clone and build
git clone https://github.com/swanchain/computing-provider.git
cd computing-provider
make clean && make testnet && sudo make install

# 2. Download model weights from HuggingFace (e.g., Qwen 2.5 7B)
computing-provider models download Qwen/Qwen2.5-7B-Instruct

# 3. Start SGLang with the downloaded model
docker run -d --gpus all -p 30000:30000 --ipc=host --name sglang \
  -v ~/.swan/models/Qwen/Qwen2.5-7B-Instruct:/models \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server --model-path /models \
  --host 0.0.0.0 --port 30000 \
  --served-model-name Qwen/Qwen2.5-7B-Instruct

# 4. Run the setup wizard (handles auth, config, and model discovery)
computing-provider setup

# 5. Run the provider
computing-provider run

# 6. Verify your provider is connected
computing-provider inference status
```

The `models download` command fetches model weights directly from HuggingFace. Large weight files (LFS) are verified with SHA-256 hashes. The setup wizard will:
- Check prerequisites (Docker, GPU)
- Create/login to your Swan Inference account
- Auto-discover your running model servers
- Auto-match local models to Swan Inference model IDs
- Generate `config.toml` and `models.json`
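The SHA-256 verification step for downloaded weight files can be sketched as follows. This is a minimal illustration of the idea, not the provider's actual code; the chunked read just keeps memory flat for multi-gigabyte files.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a large weight file in 1 MiB chunks so memory use stays flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_weight_file(path: Path, expected: str) -> bool:
    """Compare the computed digest against the expected hash (e.g., from an LFS pointer)."""
    return sha256_of(path) == expected.lower()
```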
```bash
# 1. Install Ollama and pull a model
brew install ollama
ollama serve &
ollama pull qwen2.5:7b

# 2. Install Computing Provider
brew install go
git clone https://github.com/swanchain/computing-provider.git
cd computing-provider
make clean && make testnet && sudo make install

# 3. Run the setup wizard
computing-provider setup

# 4. Run the provider
computing-provider run

# 5. Verify your provider is connected
computing-provider inference status
```

The setup wizard auto-discovers Ollama models and matches them to Swan Inference model IDs (e.g., `qwen2.5:7b` → `qwen-2.5-7b`).
Once your provider is running, it goes through these stages automatically. Most providers are fully active within a day.
```
Connect ──▶ Benchmark ──▶ Approval ──▶ Collateral ──▶ Active
(instant)   (automatic)   (< 24 hrs)  (USD or crypto)  (earning)
```
| Stage | What happens | Time |
|---|---|---|
| Connect | Provider connects to the network and registers its models | Immediate |
| Benchmark | Automated benchmarks verify your GPU can serve the registered models | Minutes (automatic) |
| Approval | Admin reviews your provider | < 24 hours |
| Collateral | Deposit collateral to secure your position and unlock earnings (Stripe/PayPal or USDC/USDT on-chain) | Instant |
| Active | Start receiving inference requests and earning rewards | Ongoing |
Grace period: New providers get a 7-day grace period after activation. During this period, benchmark failures and low uptime won't affect your routing priority, giving you time to stabilize your setup.
Check your current stage at any time:
```bash
computing-provider inference status
```

```
    Swan Inference (Cloud)
             │
             │ WebSocket (outbound connection - works behind NAT)
             ▼
┌───────────────────────┐
│   Computing Provider  │
│  ┌─────────────────┐  │
│  │ Your GPU Server │  │
│  │ (SGLang/Ollama) │  │
│  └─────────────────┘  │
└───────────────────────┘
```
- Provider connects outbound to Swan Inference (no inbound ports needed)
- Registers available models
- Receives inference requests via WebSocket
- Forwards to local model server, returns response
- Earn rewards for completed requests (optional wallet setup)
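The forwarding step above boils down to rewriting the incoming request for your local server. The sketch below shows the idea under stated assumptions: the request/field names are illustrative (not Swan's actual wire protocol), and the `/v1/chat/completions` path assumes an OpenAI-compatible local server.

```python
def build_local_request(request: dict, models: dict) -> tuple[str, dict]:
    """Translate an incoming inference request into a call to the local server.

    `models` mirrors models.json; `local_model` (if present) handles servers
    like Ollama that use different model names than Swan Inference.
    Field names here are illustrative, not the documented protocol.
    """
    model_id = request["model"]
    entry = models[model_id]
    body = dict(request)  # copy so the original request is untouched
    body["model"] = entry.get("local_model", model_id)
    url = entry["endpoint"].rstrip("/") + "/v1/chat/completions"
    return url, body
```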
| Category | Requirement |
|---|---|
| GPU | NVIDIA RTX 3090, 4090, A100, H100, or equivalent |
| VRAM | Minimum 16GB (24GB+ recommended) |
| RAM | Minimum 32GB system memory |
| Storage | 500GB+ SSD for model weights |
| OS | Ubuntu 22.04+ or Debian 11+ |
| NVIDIA Driver | 535.x or newer |
| CUDA | 12.1 or newer |
| Docker | 24.0+ with NVIDIA Container Toolkit |
| Network | 100 Mbps minimum (1 Gbps recommended), stable connection with low latency |
| Category | Requirement |
|---|---|
| Chip | Apple Silicon M1, M2, M3, or M4 |
| Memory | 16GB+ unified memory (32GB+ recommended) |
| Storage | 500GB+ SSD for model weights |
| OS | macOS 13 Ventura or newer |
| Software | Ollama (latest version) |
| Network | 100 Mbps minimum, stable connection with low latency |
Ports: Only outbound WebSocket connections are needed — no port forwarding or public IP required.
```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```

Map Swan Inference model IDs to your local inference endpoints:
```json
{
  "qwen-2.5-7b": {
    "endpoint": "http://localhost:30000",
    "gpu_memory": 16000,
    "category": "text-generation"
  }
}
```

| Field | Description |
|---|---|
| `endpoint` | URL of your local inference server |
| `gpu_memory` | GPU memory required (MB) |
| `category` | Model category (`text-generation`, `image-generation`, etc.) |
| `local_model` | (Optional) Actual model name for the local server (e.g., the Ollama model name) |
Note: The `local_model` field is used when your local server uses different model names than Swan Inference. For example, Ollama uses `qwen2.5:7b` while Swan Inference expects `qwen-2.5-7b`. The setup wizard handles this mapping automatically.
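If you hand-edit models.json, a quick sanity check can be sketched like this. The required-field set follows the table above; this is an illustrative helper, not the provider's own validator.

```python
import json

REQUIRED = {"endpoint", "gpu_memory", "category"}

def validate_models(raw: str) -> list[str]:
    """Return a list of problems found in a models.json string; empty means it looks sane."""
    problems = []
    data = json.loads(raw)
    for model_id, entry in data.items():
        missing = REQUIRED - entry.keys()
        if missing:
            problems.append(f"{model_id}: missing {sorted(missing)}")
        if not str(entry.get("endpoint", "")).startswith("http"):
            problems.append(f"{model_id}: endpoint must be an http(s) URL")
    return problems
```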
Located at ~/.swan/computing/config.toml:
```toml
[API]
Port = 8085
NodeName = "my-provider"

[Inference]
Enable = true
WebSocketURL = "ws://inference-ws-dev.swanchain.io"
ServiceURL = "https://api-dev.swanchain.io"
ApiKey = "sk-prov-xxxxxxxxxxxxxxxxxxxx"  # Required - get from https://inference-dev.swanchain.io
Models = ["qwen-2.5-7b"]
```

```bash
computing-provider dashboard
# Open http://localhost:3005
```

Features: Real-time metrics, GPU status, model management, request controls.
```bash
# View metrics
curl http://localhost:8085/api/v1/computing/inference/metrics

# List models
curl http://localhost:8085/api/v1/computing/inference/models

# Check health
curl http://localhost:8085/api/v1/computing/inference/health
```

| Endpoint | Description |
|---|---|
| `GET /inference/metrics` | Request counts, latency, GPU stats |
| `GET /inference/metrics/prometheus` | Prometheus format for Grafana |
| `GET /inference/models` | List all models with status |
| `POST /inference/models/:id/enable` | Enable a model |
| `POST /inference/models/:id/disable` | Disable a model |
| `POST /inference/models/reload` | Hot-reload models.json |
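If you script your own monitoring against these endpoints, a tiny health-gate helper might look like the sketch below. The JSON field names (`status`, `enabled_models`) are assumptions for illustration, not the documented response schema; adjust them to match what the health endpoint actually returns.

```python
def is_healthy(health: dict) -> bool:
    """Treat the provider as healthy only if the server reports 'ok'
    and at least one model is enabled. Field names are assumed."""
    return health.get("status") == "ok" and health.get("enabled_models", 0) > 0
```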
To receive SWAN token rewards for completed inference requests, set your beneficiary wallet:
```bash
# Set the wallet address where rewards will be sent
computing-provider inference set-beneficiary 0xYourWalletAddress
```

For on-chain collateral (optional, increases reward tier):

```bash
# 1. Create a wallet
computing-provider wallet new

# 2. Add collateral
computing-provider collateral add --ecp --from <your-wallet> <amount>
```

Note: You can run the provider without a wallet - it will still serve inference requests, but you won't receive on-chain rewards.
```bash
computing-provider setup                             # Interactive setup wizard (recommended)
computing-provider run                               # Start provider
computing-provider inference status                  # Check status on Swan Inference
computing-provider inference config                  # Show inference config
computing-provider inference deposit                 # Get collateral deposit instructions
computing-provider inference deposit --check         # Check current collateral status
computing-provider inference set-beneficiary 0x...   # Set reward wallet
computing-provider dashboard                         # Web UI (port 3005)
computing-provider task list --ecp                   # List tasks
```

The setup wizard is the recommended way to configure a new provider:

```bash
computing-provider setup                        # Full interactive setup
computing-provider setup --skip-discovery       # Skip model discovery
computing-provider setup --api-key=sk-prov-xxx  # Use existing API key

# Subcommands
computing-provider setup discover               # Just discover model servers
computing-provider setup login                  # Login to existing account
computing-provider setup signup                 # Create new account
```

```bash
computing-provider wallet new            # Create wallet
computing-provider wallet list           # List wallets
computing-provider wallet import <file>  # Import private key
```

```bash
computing-provider research hardware       # All hardware info
computing-provider research gpu-info       # GPU details
computing-provider research gpu-benchmark  # Run benchmark
```

| Error | Solution |
|---|---|
| `go: command not found` | Install Go 1.22+: see go.dev/dl |
| `permission denied ... docker.sock` | Add your user to the docker group: `sudo usermod -aG docker $USER` |
| `could not select device driver "nvidia"` | Install the NVIDIA Container Toolkit |
| `container "/resource-exporter" already in use` | Run `docker rm -f resource-exporter` |
| `authentication required` | Set `ApiKey` in config.toml or the `INFERENCE_API_KEY` env var |
| `invalid provider API key` | Verify the key starts with `sk-prov-` and is not revoked |
| `WebSocket connection failed` | Check `WebSocketURL` and network connectivity |
| Provider not receiving requests | Check that models.json matches your inference server |
| `cuda>=12.x unsatisfied condition` | Use an older SGLang tag: `lmsysorg/sglang:v0.4.7.post1-cu124` |
```bash
# Provider logs
tail -f cp.log

# Inference server logs
docker logs sglang
```

Q: `make mainnet` fails with `go: command not found`

Install Go 1.22+. On Linux: download from go.dev/dl and add it to your PATH. On macOS: `brew install go`. Make sure to restart your shell or `source ~/.bashrc` after installing.
Q: SGLang container fails with cuda>=12.x unsatisfied condition
Your NVIDIA driver is too old for the latest SGLang image. Either update your driver (sudo apt install nvidia-driver-550) or use an older SGLang tag:
```bash
docker run -d --gpus all -p 30000:30000 --ipc=host --name sglang \
  -v ~/.swan/models/Qwen/Qwen2.5-7B-Instruct:/models \
  lmsysorg/sglang:v0.4.7.post1-cu124 \
  python3 -m sglang.launch_server --model-path /models \
  --host 0.0.0.0 --port 30000 \
  --served-model-name Qwen/Qwen2.5-7B-Instruct
```

Q: `docker: Error response from daemon: could not select device driver "nvidia"`

The NVIDIA Container Toolkit is not installed. Follow the NVIDIA Container Toolkit section, then restart Docker.

Q: `computing-provider setup` doesn't detect my running model server

The setup wizard scans common ports (30000, 8080, 11434). Make sure your model server is running before you start the wizard. You can verify manually:

```bash
curl http://localhost:30000/v1/models  # SGLang/vLLM
curl http://localhost:11434/api/tags   # Ollama
```

If your server uses a non-standard port, the wizard may not find it — you can manually edit ~/.swan/computing/models.json afterward.
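The wizard's port scan can be approximated in a few lines of stdlib Python. This is a rough sketch of the idea (a plain TCP connect probe); the wizard's actual discovery logic may differ.

```python
import socket

COMMON_PORTS = [30000, 8080, 11434]  # SGLang/vLLM, generic HTTP, Ollama

def probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def discover(host: str = "127.0.0.1", ports=COMMON_PORTS) -> list[int]:
    """List the common ports that have a listener."""
    return [p for p in ports if probe(host, p)]
```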
Q: My provider is online but not receiving any inference requests
The most common cause is a model name mismatch. The --served-model-name in your SGLang/vLLM command must exactly match the key in models.json, and that key must match a model ID registered on Swan Inference. Run computing-provider models catalog to see valid model IDs.
Q: SGLang container starts but immediately exits
Check logs with `docker logs sglang`. Common causes:
- Out of VRAM: The model is too large for your GPU. Try a smaller model or a quantized version.
- Shared memory: Add `--shm-size 4g` to your `docker run` command.
- Port conflict: Port 30000 is already in use. Check with `docker ps` or `lsof -i :30000`.
Q: models download fails for Llama or other gated models
Some HuggingFace models require accepting a license agreement. Visit the model page on huggingface.co, accept the terms, then set your HuggingFace token:
```bash
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxx
computing-provider models download meta-llama/Llama-3.3-70B-Instruct
```

Q: WebSocket connection failed or provider can't connect

- Verify the WebSocket URL in ~/.swan/computing/config.toml is `wss://inference-ws.swanchain.io` (not `http://` or `https://`)
- Check that outbound port 443 isn't blocked by your firewall or cloud security group
- If behind a corporate proxy, WebSocket connections may be blocked — check with your network admin

Q: `invalid provider API key` or `authentication required`

- Your API key must start with `sk-prov-`. Consumer keys (`sk-swan-*`) won't work.
- Verify your key in ~/.swan/computing/config.toml under `[Inference].ApiKey`
- You can also set it via environment variable: `export INFERENCE_API_KEY=sk-prov-xxx`
Q: Provider is stuck in pending status
Providers are auto-activated when all conditions are met: collateral deposited, GPU meets the minimum tier, and the registration benchmark passes. Check your status:

```bash
computing-provider inference status
```

If you're just testing, ask the Swan team on Discord about dev mode access, which skips these requirements.
Q: How do I earn rewards?

You earn per successful inference request. Earnings are calculated from token usage (input + output tokens) multiplied by the model's per-token price. You can check your balance and earnings breakdown anytime:

```bash
computing-provider inference status  # Shows current stage and earnings summary
```

Payouts are processed when your balance reaches the minimum threshold ($50). Set a beneficiary wallet to receive payouts:
```bash
computing-provider inference set-beneficiary 0xYourWalletAddress
```

Q: What are the collateral deposit options?

After your provider is approved, you can deposit collateral via:
- USD (off-chain): Stripe or PayPal — pay through the Provider Dashboard
- Stablecoin (on-chain): USDC or USDT on supported chains (Base, Ethereum)
Run computing-provider inference deposit to see supported chains, contract addresses, and minimum amounts. Deposit via the Provider Dashboard or directly to the contract from your wallet.
Q: What happens if I fail benchmarks? The system runs periodic benchmarks (math, code, reasoning, latency) to verify provider quality. Passing resets your failure counter. Consecutive failures may result in collateral slashing (default: 10% after 2 consecutive failures).
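For intuition, the pricing and slashing rules above can be sketched numerically. This is a toy model with made-up numbers: actual per-token prices, payout thresholds, and slashing parameters come from Swan Inference, not this sketch.

```python
def request_earnings(input_tokens: int, output_tokens: int,
                     price_per_token_usd: float) -> float:
    """One request's earnings: total tokens times the model's per-token price."""
    return (input_tokens + output_tokens) * price_per_token_usd

def slash_amount(collateral: float, consecutive_failures: int,
                 threshold: int = 2, rate: float = 0.10) -> float:
    """Illustrative default: slash 10% of collateral once consecutive
    benchmark failures reach the threshold; nothing before that."""
    return collateral * rate if consecutive_failures >= threshold else 0.0
```

For example, a request with 500 input and 300 output tokens at a hypothetical $0.0000002 per token earns 800 times that price, and two consecutive benchmark failures on $1,000 of collateral would cost $100 under the default rate.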
Q: I edited config.toml but nothing changed
Make sure you're editing the right file. The provider reads config from ~/.swan/computing/config.toml (or wherever $CP_PATH points), not the config.toml in the git repo directory.
Q: How do I change models without restarting?
Edit ~/.swan/computing/models.json — the provider watches this file and hot-reloads automatically. You can also reload via the API:
```bash
curl -X POST http://localhost:8085/api/v1/computing/inference/models/reload
```

Q: Port 8085 or 30000 is already in use

Find and stop the conflicting process:

```bash
lsof -i :30000       # Find what's using the port
docker ps            # Check for leftover containers
docker rm -f sglang  # Remove old SGLang container
```

- Discord - Community support
- GitHub Issues - Bug reports
- Documentation - Full docs
Apache 2.0