feat(fizz): add --parallel N for multi-process simulation#359

Merged: jp-fizzbee merged 1 commit into main from user/jp/feat-parallel-simulation on May 19, 2026

@jp-fizzbee
Collaborator

Each worker is its own fizzbee subprocess, which sidesteps the
in-process shared-state hazards (roleRefs, nextChannelId, and others
not yet audited) that block goroutine-based parallelism. Tradeoff: N
process-startup overheads per run (~50ms each, negligible vs typical
sim wall time) and per-worker output dirs instead of interleaved
stdout.

Activates only when -x is set, no fixed --seed is given (a seed forces
a single run anyway), and --parallel > 1. Otherwise falls through to
the existing single-process invocation unchanged.
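
The three-part gate above could be sketched roughly as follows. This is an illustration only: the function name `use_parallel` and its argument convention (a 0/1 flag for -x, the seed string, the worker count) are hypothetical and not taken from the actual script.

```shell
#!/usr/bin/env bash
# Sketch only: use_parallel and its argument layout are hypothetical.
# Returns 0 when the multi-process path should be taken, 1 otherwise.
#   $1 = 1 if -x was given, $2 = value of --seed (empty if absent),
#   $3 = value of --parallel
use_parallel() {
  local simulate="$1" seed="$2" parallel="$3"
  [ "$simulate" = "1" ] || return 1               # -x not set
  [ -z "$seed" ] || return 1                      # a fixed seed means a single run
  [ "$parallel" -gt 1 ] 2>/dev/null || return 1   # fewer than two workers (or not a number)
  return 0
}
```

A caller would branch on it, for example `if use_parallel "$simulate" "$seed" "$parallel"; then ...; fi`, and otherwise run the existing single-process command.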

Behavior:

- Splits --max_runs evenly across workers (0 -> unlimited per worker).
- Each worker writes to <base>/parallel_<ts>/worker_<N>/ + worker_<N>.log.
- Polling loop checks every 500ms: if a worker writes the failure
  sentinel, kill surviving workers; otherwise wait for all to finish.
- Ctrl-C kills all live workers and exits 130.
- On success: aggregate "PASSED: N total runs across W workers".
- On failure: print the first failing worker's full log + output dir.
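
A minimal sketch of the split, polling, and Ctrl-C behavior listed above, restricted to bash 3.2 features. Everything named here is an assumption for illustration: the function names, the `FAILED` marker standing in for the failure sentinel, the worker command passed as arguments, the log placement next to the worker dirs, and the choice to round up when --max_runs does not divide evenly.

```shell
#!/usr/bin/env bash
# Sketch only. Hypothetical: function names, the "FAILED" marker, and the
# worker command passed in "$@".

WORKER_PIDS=()

# Even split of the run budget; 0 stays 0 (unlimited per worker).
# Rounding up on a remainder is this sketch's choice, not the PR's.
runs_per_worker() {
  local max_runs="$1" workers="$2"
  if [ "$max_runs" -eq 0 ]; then
    echo 0
  else
    echo $(( (max_runs + workers - 1) / workers ))
  fi
}

# run_workers BASE W CMD...: start W copies of CMD, each given its output
# dir as the final argument, then poll every 500ms.
run_workers() {
  local base="$1" workers="$2" i alive failed=""
  shift 2
  WORKER_PIDS=()
  # Ctrl-C: kill whatever is still running and exit 130.
  trap 'kill "${WORKER_PIDS[@]}" 2>/dev/null; exit 130' INT
  for (( i = 0; i < workers; i++ )); do
    mkdir -p "$base/worker_$i"
    "$@" "$base/worker_$i" > "$base/worker_$i.log" 2>&1 &
    WORKER_PIDS[$i]=$!
  done
  while :; do
    alive=0
    # Liveness first, logs second: if nothing was alive, the logs are final.
    for (( i = 0; i < workers; i++ )); do
      if kill -0 "${WORKER_PIDS[$i]}" 2>/dev/null; then alive=1; fi
    done
    for (( i = 0; i < workers; i++ )); do
      if grep -q "FAILED" "$base/worker_$i.log" 2>/dev/null; then
        failed=$i
        break
      fi
    done
    if [ -n "$failed" ] || [ "$alive" -eq 0 ]; then break; fi
    sleep 0.5
  done
  trap - INT
  if [ -n "$failed" ]; then
    kill "${WORKER_PIDS[@]}" 2>/dev/null   # some may already be gone
    wait 2>/dev/null
    cat "$base/worker_$failed.log"
    echo "FAILED: worker $failed, output in $base/worker_$failed"
    return 1
  fi
  wait 2>/dev/null
  echo "PASSED: all $workers workers finished"
}
```

Checking liveness before scanning the logs avoids a race in which a worker writes the marker and exits between the two checks, which would otherwise be reported as a pass.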

Measured: 13-02-01 two-phase-commit, 1000 sims, sequential 52.7s vs
parallel=4 19.8s on macOS arm64 (2.65x speedup).
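
For reference, the speedup and the implied per-worker efficiency can be recomputed from the rounded timings quoted above. The helper names are hypothetical; awk is used because bash arithmetic is integer-only.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.
# speedup SEQ PAR         -> sequential time / parallel time
# efficiency SEQ PAR N    -> speedup / N, as a percentage
speedup()    { awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'; }
efficiency() { awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN { printf "%.1f\n", 100 * s / (p * n) }'; }

speedup 52.7 19.8        # prints 2.66
efficiency 52.7 19.8 4   # prints 66.5
```

The rounded inputs give about 2.66x, which agrees with the reported 2.65x up to rounding of the timings, and works out to roughly two thirds of ideal linear scaling on 4 workers.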

Compatible with macOS bash 3.2: uses only basic array, arithmetic, and
process-control features. set -e is disabled inside the block so
expected non-zero exits (killing an already-dead worker, a grep miss)
don't abort the script; we exit explicitly at the end.
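
The set -e point can be illustrated with a small sketch. The function name `check_workers` and the `FAILED` marker are hypothetical; the `sleep 0` job stands in for a worker that has already exited.

```shell
#!/usr/bin/env bash
# Sketch only: check_workers and the "FAILED" marker are hypothetical.
check_workers() {
  local log="$1" pid found rc=0
  set +e                      # from here on, non-zero statuses are expected
  sleep 0 &                   # stand-in for a worker that has already exited
  pid=$!
  wait "$pid"
  kill "$pid" 2>/dev/null     # fails because the process is gone; must not abort
  grep -q "FAILED" "$log"     # status 1 on a miss; must not abort either
  found=$?
  if [ "$found" -eq 0 ]; then rc=1; fi
  return "$rc"                # the caller exits explicitly with this status
}
```

With set -e still active, the failing `kill` would terminate the script before the log was ever inspected. Because `set +e` changes a shell-wide option, it stays off after the function returns, which matches the explicit exit at the end described above.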

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jp-fizzbee jp-fizzbee merged commit 8748f56 into main May 19, 2026
1 check passed
@jp-fizzbee jp-fizzbee deleted the user/jp/feat-parallel-simulation branch May 19, 2026 22:47
@jp-fizzbee jp-fizzbee restored the user/jp/feat-parallel-simulation branch May 20, 2026 21:57
@jp-fizzbee jp-fizzbee deleted the user/jp/feat-parallel-simulation branch May 20, 2026 21:58