This document is Silicon / Sigil's 0.1 performance baseline. The numbers below are the floor we promise not to regress past; they are not a benchmark contest against Zig, Rust, or Go. Those are different languages with different design points; comparison is misleading.
The bench harness is scripts/bench.ts. Re-run it locally on your own
hardware before drawing conclusions — the absolute numbers are
machine-dependent; only the ratios and trends matter.

    bun run scripts/bench.ts          # human-readable table
    bun run scripts/bench.ts --json   # machine-readable JSON
    SIGIL_BENCH_RUNS=21 bun run scripts/bench.ts

Environment knobs:

| Variable | Default | Purpose |
|---|---|---|
| `SIGIL_BENCH_RUNS` | 11 | Iterations per measurement |
| `SIGIL_BENCH_WARMUP` | 2 | Warmup iterations (excluded from stats) |
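
The two knobs above can be pictured with a minimal sketch like the following. It is not taken from `scripts/bench.ts`: the helper names `readCount`, `median`, and `measure` are hypothetical, and the fallback-on-invalid-value behaviour is an assumption. It only illustrates the documented idea of discarding warmup iterations and reporting the median of the remaining runs.

```typescript
// Hypothetical helpers; the real harness in scripts/bench.ts may differ.

type Env = Record<string, string | undefined>;

// Read a non-negative integer knob, falling back to the documented default
// when the variable is unset or invalid (the fallback rule is an assumption).
function readCount(name: string, fallback: number, min: number, env: Env): number {
  const raw = env[name];
  const n = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(n) && n >= min ? n : fallback;
}

// Median of a non-empty sample list (mean of the two middle values when even).
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run `fn` warmup + runs times; only the last `runs` timings feed the median.
function measure(fn: () => void, runs: number, warmup: number): number {
  for (let i = 0; i < warmup; i++) fn(); // warmup: executed but not timed
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    fn();
    samples.push(performance.now() - t0);
  }
  return median(samples);
}

// Example with an explicit environment object instead of the process environment.
const exampleEnv: Env = { SIGIL_BENCH_RUNS: "21" };
const runs = readCount("SIGIL_BENCH_RUNS", 11, 1, exampleEnv);    // 21
const warmup = readCount("SIGIL_BENCH_WARMUP", 2, 0, exampleEnv); // 2 (default)
console.log(`runs=${runs} warmup=${warmup}`);
```

An odd default such as 11 runs means the median is an actual observed sample rather than an average of two.
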
Three sizes, all live under tests/bench/fixtures/:
| Fixture | LOC | Source bytes | Shape |
|---|---|---|---|
| `small.si` | 51 | 942 | Hand-written; exercises @struct, @type, @match, @if, @loop, @fn |
| `medium.si` | 400 | 7,874 | Generated; 50 @fns with varied bodies |
| `large.si` | 3,382 | 62,139 | Generated; 500 @fns |
medium.si and large.si are generated by scripts/bench-gen.ts —
re-run that script if you change the generator shape; the resulting
files are committed so CI sees a stable corpus.
Recorded on the CI reference platform — Linux x86_64, Bun 1.3.x. Eleven runs, two warmups, median reported.
| Fixture | Stage | LOC | Source B | WAT B | Median ms |
|---|---|---|---|---|---|
| small | parse | 51 | 942 | — | 3.9 |
| small | compile | 51 | 942 | 19,888 | 110.4 |
| medium | parse | 400 | 7,874 | — | 35.2 |
| medium | compile | 400 | 7,874 | 30,775 | 156.0 |
| large | parse | 3,382 | 62,139 | — | 1024 |
| large | compile | 3,382 | 62,139 | 109,136 | 1164 |
Notes on shape:
- The `small` compile baseline is dominated by strata initialisation (loading `src/strata/*.si` and building the elaborator registry). That cost is paid once per `compile()` call; `parse()` doesn't pay it. This is the "small program" floor: you cannot do a real Silicon compile faster than this on a cold call.
- `medium` compile shows the per-LOC cost: ~0.4 ms / LOC at this size.
- The hand-written recursive-descent parser is the long pole on large inputs. On `large`, parse takes 1024 ms of the 1164 ms compile median (about 88%), and parse throughput falls from roughly 240 kB/s on `small` to roughly 60 kB/s on `large`, so parse time grows faster than linearly with input size. Backend lowering scales linearly and stays below 200 ms even on the 3,382-LOC fixture (compile minus parse is 140 ms there).
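
The ratios quoted in the notes can be re-derived from the baseline table. The sketch below hard-codes the table's medians and computes per-LOC compile cost, the parse share of compile time, the time left after parsing, and parse throughput. The `Row` shape and the `derive` helper are illustrative names, not part of the bench harness.

```typescript
// Values copied from the baseline table; Row and derive are hypothetical names.
interface Row {
  fixture: string;
  loc: number;
  bytes: number;
  parseMs: number;   // median parse time
  compileMs: number; // median full-compile time
}

const rows: Row[] = [
  { fixture: "small", loc: 51, bytes: 942, parseMs: 3.9, compileMs: 110.4 },
  { fixture: "medium", loc: 400, bytes: 7874, parseMs: 35.2, compileMs: 156.0 },
  { fixture: "large", loc: 3382, bytes: 62139, parseMs: 1024, compileMs: 1164 },
];

function derive(r: Row) {
  return {
    fixture: r.fixture,
    compileMsPerLoc: r.compileMs / r.loc,  // per-LOC compile cost
    parseShare: r.parseMs / r.compileMs,   // fraction of compile spent parsing
    restMs: r.compileMs - r.parseMs,       // everything after the parser
    parseKBps: r.bytes / r.parseMs,        // bytes per ms equals kB per second
  };
}

for (const d of rows.map(derive)) {
  console.log(
    `${d.fixture}: ${d.compileMsPerLoc.toFixed(2)} ms/LOC, ` +
      `parse ${(d.parseShare * 100).toFixed(0)}% of compile, ` +
      `rest ${d.restMs.toFixed(1)} ms, parse ${d.parseKBps.toFixed(0)} kB/s`,
  );
}
```

For `medium` this gives 0.39 ms/LOC, and for `large` it gives a parse share of about 88% with 140 ms left for the remaining stages.
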
| Platform | sgl binary size (release) |
|---|---|
| linux-x86_64 | TBD |
| linux-aarch64 | TBD |
| macos-aarch64 | TBD |
| macos-x86_64 | TBD |
These numbers populate from the release workflow
(.github/workflows/release.yml) the first time v1.0.0 is tagged.
Until then, the local build (bun run build:sigilc) is the reference:

    bun run build:sigilc
    ls -la dist/sigilc

On Linux x86_64 the local-built sgl lands around 60 MB (it embeds the
Bun runtime). Smaller builds are a v1.x story (the obvious lever is
shipping a Bun bytecode bundle instead of a full single-file
executable; ADR pending if there's demand).
The test suite is the canonical regression test; its own wall-clock is a perf signal.

    time bun test   # the full suite

At 0.1 the full bun test takes roughly a few hundred times as long as
the large compile; the property and fuzz suites dominate. The headline
fast suite (bun test src excluding tests/) runs in seconds and is
what CI gates on for every PR.
The bench job runs on every PR and posts its results as a comment. It
is warn-only with one exception: the build fails on a >2× regression
on a headline metric (small.compile.median, medium.compile.median,
large.compile.median, or any WAT B column). Smaller regressions
are surfaced in the comment for human judgement, because bench noise on
shared CI runners is real and tight thresholds give false positives.
Current implementation: scripts/bench.ts --json writes a snapshot;
the workflow at .github/workflows/bench.yml compares against the
last-recorded baseline on main and emits the warning/failure
accordingly.
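
The warn/fail policy described above can be sketched as a small classifier. This is an illustration under stated assumptions, not the logic in `.github/workflows/bench.yml`: the `classify` function, the `.watBytes` metric suffix for the WAT B column, and the 1.1× warning threshold are all hypothetical. Only the >2× failure rule on headline metrics comes from this document.

```typescript
type Verdict = "ok" | "warn" | "fail";

// Headline timing metrics named in this document.
const HEADLINE_TIMINGS = new Set([
  "small.compile.median",
  "medium.compile.median",
  "large.compile.median",
]);

// Assumption: WAT size metrics end in ".watBytes"; the real key name is unknown.
function isHeadline(metric: string): boolean {
  return HEADLINE_TIMINGS.has(metric) || metric.endsWith(".watBytes");
}

const FAIL_RATIO = 2.0; // from the document: fail only on a >2x regression
const WARN_RATIO = 1.1; // assumption: the document gives no warning threshold

function classify(metric: string, baseline: number, current: number): Verdict {
  if (!(baseline > 0)) return "ok"; // no usable baseline: nothing to compare
  const ratio = current / baseline;
  if (ratio > FAIL_RATIO && isHeadline(metric)) return "fail";
  if (ratio > WARN_RATIO) return "warn"; // surfaced for human judgement
  return "ok";
}

// Example: compare a PR snapshot against the last baseline recorded on main.
const baseline: Record<string, number> = { "small.compile.median": 110.4, "small.parse.median": 3.9 };
const current: Record<string, number> = { "small.compile.median": 240.0, "small.parse.median": 9.0 };

for (const [metric, value] of Object.entries(current)) {
  const base = baseline[metric];
  if (base !== undefined) console.log(metric, classify(metric, base, value));
}
```

In this example `small.compile.median` more than doubles and is a headline metric, so it fails; `small.parse.median` also more than doubles but is not a headline metric, so it only warns.
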
- No JIT warm-up gaming. The two warmup iterations exist to let Bun's JavaScriptCore JIT reach steady state on hot paths, then are discarded.
- Source bytes vs LOC are both reported because LOC is the user's mental model and bytes is what the parser actually pays for.
- The `parse` stage isolates the hand-written parser from elaboration / typecheck / lowering, so a regression in either layer surfaces in the right column.
- Deterministic fixtures. `bench-gen.ts` produces the same output bit-for-bit on every run. A change in the generated fixture content (intentional or otherwise) is itself a bench-result change.
- No native-compile bench yet. The QBE backend is in place (Phase 9), but the bench harness measures the WAT pipeline only at 0.1. Adding `compile --native` is a later bench extension; the current numbers are the WAT compile floor.
| Date | Change |
|---|---|
| 2026-05-28 | Initial document — 0.1 perf baseline (10b-5). |