Small distributed computing demo for a Computer Networks final project. This repo runs a Ray-based distributed log analytics workload across a 3-node virtual cluster.
- Virtual LAN setup with static IPs
- Hostname-based node communication
- A simple Ray cluster (head + workers)
- A non-GUI distributed workload that produces a final artifact: `final_wordcount.json`
VMs (Ubuntu Server):

- `head` — Ray coordinator (no compute contribution)
- `worker1` — compute node
- `worker2` — compute node
Internal cluster subnet: `192.168.56.0/24`

Static IPs:

- `head` → 192.168.56.9
- `worker2` → 192.168.56.10
- `worker1` → 192.168.56.11
Each VM uses two NICs:
Adapter 1
- Attached to: Internal Network
- Name: `cluster-net`
Adapter 2
- Attached to: NAT
This keeps cluster networking stable while allowing internet access for installs.
Ensure each VM has the same `/etc/hosts` entries (example):

```
127.0.0.1 localhost
127.0.1.1 <this-vm-hostname>
192.168.56.9 head
192.168.56.11 worker1
192.168.56.10 worker2
```

Test:

```
ping -c 2 head
ping -c 2 worker1
ping -c 2 worker2
```

Run on each VM:
```
sudo apt update
sudo apt install -y curl
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc
uv --version
```

Ray requires consistent Python versions across all nodes. This project uses Python 3.12.3.
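A mismatched interpreter will prevent a node from joining the cluster, so each node can run a quick pre-flight check before starting Ray. A minimal sketch (the only assumption is the pinned version string, which must match the `uv python pin` value below):

```python
import platform

# The version every node must run; keep in sync with `uv python pin`.
EXPECTED = "3.12.3"

def version_matches(expected: str = EXPECTED) -> bool:
    """Return True if this interpreter matches the pinned version."""
    return platform.python_version() == expected

if __name__ == "__main__":
    actual = platform.python_version()
    if not version_matches():
        raise SystemExit(f"Python {actual} != pinned {EXPECTED}; run `uv sync` first")
    print(f"OK: Python {actual}")
```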
On each VM, inside the repo directory:

```
cd ~/rayproj
uv python pin 3.12.3
uv python install 3.12.3
uv sync
```

Run on head:

```
cd ~/rayproj
uv run ray stop || true
uv run ray start --head --num-cpus=0 --port=6379 --node-ip-address=192.168.56.9
```

Run on worker1:
```
cd ~/rayproj
uv run ray stop || true
uv run ray start --address=192.168.56.9:6379
```

Run on worker2:
```
cd ~/rayproj
uv run ray stop || true
uv run ray start --address=192.168.56.9:6379
```

On head:
```
cd ~/rayproj
uv run ray status
```

Expected:
- 3 active nodes
- Total CPU reflects workers (head is coordinator-only)
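The same check can also be done programmatically. A hedged sketch, assuming the node records that Ray's `ray.nodes()` returns (each is a dict with an `Alive` flag and a `Resources` dict containing a `CPU` key); the helper is kept pure so it can be shown without a live cluster:

```python
# Summarize a Ray cluster from the list of node dicts that ray.nodes() returns.
# On the head node you would first run:
#   import ray; ray.init(address="auto"); print(summarize(ray.nodes()))

def summarize(nodes):
    alive = [n for n in nodes if n.get("Alive")]
    total_cpu = sum(n.get("Resources", {}).get("CPU", 0) for n in alive)
    return {"alive_nodes": len(alive), "total_cpu": total_cpu}

# Synthetic node records shaped like this cluster: head contributes
# 0 CPUs (--num-cpus=0), each worker contributes its own.
fake_nodes = [
    {"Alive": True, "Resources": {"CPU": 0.0}},  # head, coordinator-only
    {"Alive": True, "Resources": {"CPU": 2.0}},  # worker1
    {"Alive": True, "Resources": {"CPU": 2.0}},  # worker2
]

if __name__ == "__main__":
    print(summarize(fake_nodes))
```

With `--num-cpus=0` on the head, `total_cpu` reflects the workers alone, matching the expectation above.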
On head:

```
cd ~/rayproj
uv run python main.py --generate --lines 80000 --chunk-size 400 --top 20
```

Output:

- `telemetry_logs.txt`
- `final_wordcount.json`
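The workload is a classic map/reduce wordcount. A minimal local sketch of the pattern (hypothetical simplification; the real `main.py` may chunk and tokenize differently), with comments marking where Ray distributes the work:

```python
import json
from collections import Counter

# With Ray, this function would be decorated with @ray.remote and each
# chunk dispatched to a worker via count_chunk.remote(chunk).
def count_chunk(lines):
    c = Counter()
    for line in lines:
        c.update(line.split())
    return c

def wordcount(lines, chunk_size=400, top=20):
    # Map: count each chunk independently (the distributable step).
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    partials = [count_chunk(ch) for ch in chunks]  # ray.get([...]) in the real run
    # Reduce: merge partial counts on the driver, keep the top-N words.
    total = Counter()
    for p in partials:
        total.update(p)
    return dict(total.most_common(top))

if __name__ == "__main__":
    logs = ["error disk full", "warn disk slow", "error net down"]
    result = wordcount(logs, chunk_size=2, top=3)
    with open("final_wordcount.json", "w") as f:
        json.dump(result, f, indent=2)
```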
Confirm:

```
ls -l final_wordcount.json
```

Stop Ray on any node:

```
cd ~/rayproj
uv run ray stop
```