Performance

This document is Silicon / Sigil's 0.1 performance baseline. The numbers below are the floor we promise not to regress past; they are not a benchmark contest against Zig, Rust, or Go. Those are different languages with different design points; comparison is misleading.

The bench harness is scripts/bench.ts. Re-run it locally on your own hardware before drawing conclusions — the absolute numbers are machine-dependent; only the ratios and trends matter.

How to run

bun run scripts/bench.ts                  # human-readable table
bun run scripts/bench.ts --json           # machine-readable JSON
SIGIL_BENCH_RUNS=21 bun run scripts/bench.ts

Environment knobs:

Variable	Default	Purpose
`SIGIL_BENCH_RUNS`	`11`	Iterations per measurement
`SIGIL_BENCH_WARMUP`	`2`	Warmup iterations (excluded from stats)
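
As an illustration of how these two knobs could be read with their documented defaults, here is a minimal sketch. It is not taken from scripts/bench.ts; the names `BenchConfig`, `readCount`, and `readBenchConfig` are hypothetical, and the validation rules (integers only, at least one run) are assumptions.

```typescript
// Hypothetical helper: NOT the actual code in scripts/bench.ts.
interface BenchConfig {
  runs: number;   // measured iterations per fixture/stage
  warmup: number; // iterations run first and excluded from stats
}

type Env = Record<string, string | undefined>;

// Read a non-negative integer knob, falling back to the documented default.
function readCount(env: Env, name: string, fallback: number, min: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n < min) {
    // Assumption: bad values are rejected rather than silently ignored.
    throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
  }
  return n;
}

function readBenchConfig(env: Env): BenchConfig {
  return {
    runs: readCount(env, "SIGIL_BENCH_RUNS", 11, 1),
    warmup: readCount(env, "SIGIL_BENCH_WARMUP", 2, 0),
  };
}
```

Under Bun or Node the function would typically be called with `process.env`; taking the environment as a parameter keeps the sketch easy to test.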

Fixtures

Three sizes, all live under tests/bench/fixtures/:

Fixture	LOC	Source bytes	Shape
`small.si`	51	942	Hand-written; exercises `@struct`, `@type`, `@match`, `@if`, `@loop`, `@fn`
`medium.si`	400	7,874	Generated; 50 `@fn`s with varied bodies
`large.si`	3,382	62,139	Generated; 500 `@fn`s

medium.si and large.si are generated by scripts/bench-gen.ts — re-run that script if you change the generator shape; the resulting files are committed so CI sees a stable corpus.

Headline baseline (0.1)

Recorded on the CI reference platform — Linux x86_64, Bun 1.3.x. Eleven runs, two warmups, median reported.

Fixture	Stage	LOC	Source B	WAT B	Median ms
small	parse	51	942	—	3.9
small	compile	51	942	19,888	110.4
medium	parse	400	7,874	—	35.2
medium	compile	400	7,874	30,775	156.0
large	parse	3,382	62,139	—	1,024
large	compile	3,382	62,139	109,136	1,164

Notes on shape:

The small compile baseline is dominated by strata initialisation (loading src/strata/*.si and building the elaborator registry). That cost is paid once per compile() call; parse() doesn't pay it. This is the "small program" floor — you cannot do a real Silicon compile faster than this on a cold call.
medium compile shows the per-LOC cost: ~0.4 ms / LOC at this size.
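On the large fixture the hand-written recursive-descent parser is the long pole: parse takes 1,024 ms of the 1,164 ms compile, about 60 KB/s. Parse time also grows faster than linearly between medium and large (about 29× the time for about 8× the bytes). Elaboration and backend lowering (compile minus parse, about 140 ms) stay below 200 ms even on the 3,382-LOC fixture.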
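
The ratios in these notes can be recomputed directly from the headline table. The sketch below does that arithmetic; the `Row` type and `derive` function are hypothetical names, and treating "compile minus parse" as the cost of everything after parsing assumes that compile() performs the same parse work that the parse stage measures.

```typescript
// Values copied from the headline baseline table (median ms).
interface Row {
  fixture: string;
  loc: number;
  bytes: number;
  parseMs: number;
  compileMs: number;
}

const baseline: Row[] = [
  { fixture: "small", loc: 51, bytes: 942, parseMs: 3.9, compileMs: 110.4 },
  { fixture: "medium", loc: 400, bytes: 7874, parseMs: 35.2, compileMs: 156.0 },
  { fixture: "large", loc: 3382, bytes: 62139, parseMs: 1024, compileMs: 1164 },
];

function derive(r: Row) {
  return {
    fixture: r.fixture,
    // Whole-compile cost divided by lines of code.
    compileMsPerLoc: r.compileMs / r.loc,
    // Parser throughput in KiB per second.
    parseKiBPerSec: r.bytes / 1024 / (r.parseMs / 1000),
    // Assumption: compile includes the parse stage, so the difference
    // approximates strata init + elaboration + lowering.
    postParseMs: r.compileMs - r.parseMs,
    // Fraction of compile time spent parsing, under the same assumption.
    parseShare: r.parseMs / r.compileMs,
  };
}

for (const d of baseline.map(derive)) {
  console.log(
    d.fixture,
    d.compileMsPerLoc.toFixed(2) + " ms/LOC",
    d.parseKiBPerSec.toFixed(1) + " KiB/s parse",
    d.postParseMs.toFixed(1) + " ms after parse",
    (d.parseShare * 100).toFixed(0) + "% parse",
  );
}
```

Because absolute numbers are machine-dependent, rerunning this on locally recorded medians is more informative than reading the values off the reference platform.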

Binary size

Platform	`sgl` binary size (release)
linux-x86_64	TBD
linux-aarch64	TBD
macos-aarch64	TBD
macos-x86_64	TBD

These numbers will be populated by the release workflow (.github/workflows/release.yml) the first time v1.0.0 is tagged. Until then, the local build (bun run build:sigilc) is the reference:

bun run build:sigilc
ls -la dist/sigilc

On Linux x86_64 the local-built sgl lands around 60 MB (it embeds the Bun runtime). Smaller builds are a v1.x story (the obvious lever is shipping a Bun bytecode bundle instead of a full single-file executable; ADR pending if there's demand).

Test-suite wall clock

The test suite is the canonical regression test; its own wall-clock is a perf signal.

time bun test                  # the full suite

At 0.1 the full bun test runs in roughly the time of large compile × a few hundred — the property and fuzz suites dominate. The headline fast suite (bun test src excluding tests/) runs in seconds and is what CI gates on for every PR.

CI signal

The bench job runs on every PR and posts its results as a comment. It is warn-only with one exception: the build fails on a >2× regression on a headline metric (small.compile.median, medium.compile.median, large.compile.median, or any WAT B value). Smaller regressions are surfaced in the comment for human judgement, because bench noise on shared CI runners is real and tight thresholds give false positives.

Current implementation: scripts/bench.ts --json writes a snapshot; the workflow at .github/workflows/bench.yml compares against the last-recorded baseline on main and emits the warning/failure accordingly.
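
The gating rule can be sketched as a comparison of two metric maps. This is an illustration only, not the logic in .github/workflows/bench.yml: the flat `Snapshot` shape, the `classify` and `compareSnapshots` names, and the choice to warn on any ratio above 1.0 are all assumptions, since the document specifies only the >2× failure threshold.

```typescript
type Verdict = "ok" | "warn" | "fail";

// Assumed snapshot shape: metric name -> value (median ms or WAT bytes).
type Snapshot = Record<string, number>;

const FAIL_RATIO = 2.0; // the ">2x regression" threshold from the text

function classify(baseline: number, current: number): Verdict {
  if (baseline <= 0) return "warn"; // no meaningful ratio; leave it to a human
  const ratio = current / baseline;
  if (ratio > FAIL_RATIO) return "fail";
  if (ratio > 1) return "warn"; // assumption: any slowdown is worth a comment
  return "ok";
}

function compareSnapshots(base: Snapshot, cur: Snapshot): Record<string, Verdict> {
  const out: Record<string, Verdict> = {};
  for (const [metric, b] of Object.entries(base)) {
    const c = cur[metric];
    // A metric missing from the new snapshot is flagged rather than ignored.
    out[metric] = c === undefined ? "warn" : classify(b, c);
  }
  return out;
}

// Example with made-up "current" values against the recorded medians.
console.log(
  compareSnapshots(
    { "small.compile.median": 110.4, "medium.compile.median": 156.0 },
    { "small.compile.median": 118.0, "medium.compile.median": 330.0 },
  ),
);
```

A real gate would likely add a noise floor below which differences are not reported at all; the document leaves that choice open.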

Methodology notes

No JIT warm-up gaming. The two warmup iterations exist to let Bun's JavaScriptCore JIT reach steady state on hot paths; they are then discarded.
Source bytes vs LOC are both reported because LOC is the user's mental model and bytes is what the parser actually pays for.
parse stage isolates the hand-written parser from elaboration / typecheck / lowering, so a regression in either layer surfaces in the right column.
Deterministic fixtures — bench-gen.ts produces the same output bit-for-bit on every run. A change in the generated fixture content (intentional or otherwise) is itself a bench-result change.
No native-compile bench yet. The QBE backend is in place (Phase 9), but the bench harness measures the WAT pipeline only at 0.1. Adding compile --native is a later bench extension; the current numbers are the WAT compile floor.
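
The warmup-then-median reduction described above can be sketched as follows. The function names are hypothetical and this is not the code in scripts/bench.ts; it assumes the timings array holds the warmup iterations first, followed by the measured runs.

```typescript
// Median of a non-empty list of samples (does not mutate the input).
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("median of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// Drop the first `warmup` timings, then report the median of the rest.
function medianAfterWarmup(timingsMs: number[], warmup: number): number {
  return median(timingsMs.slice(warmup));
}
```

With the default of 11 measured runs the median is simply the sixth-fastest timing, which makes the reported figure robust to a few slow outliers on a noisy runner.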

Change log

Date	Change
2026-05-28	Initial document — 0.1 perf baseline (10b-5).
