Performance

This document is Silicon / Sigil's 0.1 performance baseline. The numbers below are the floor we promise not to regress past; they are not a benchmark contest against Zig, Rust, or Go. Those are different languages with different design points; comparison is misleading.

The bench harness is scripts/bench.ts. Re-run it locally on your own hardware before drawing conclusions — the absolute numbers are machine-dependent; only the ratios and trends matter.

How to run

bun run scripts/bench.ts                  # human-readable table
bun run scripts/bench.ts --json           # machine-readable JSON
SIGIL_BENCH_RUNS=21 bun run scripts/bench.ts

Environment knobs:

| Variable | Default | Purpose |
| --- | --- | --- |
| SIGIL_BENCH_RUNS | 11 | Iterations per measurement |
| SIGIL_BENCH_WARMUP | 2 | Warmup iterations (excluded from stats) |
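
As a rough illustration of how the two knobs and their defaults could be read, here is a minimal sketch. The names `SigilBenchConfig`, `parseCount`, and `readBenchConfig` are hypothetical and are not taken from scripts/bench.ts; only the variable names and defaults come from the table above.

```typescript
// Hypothetical helper: NOT the actual code in scripts/bench.ts.
interface SigilBenchConfig {
  runs: number;   // measured iterations per fixture/stage
  warmup: number; // iterations executed first and excluded from stats
}

// Parse a non-negative integer knob, falling back to the documented default
// when the variable is unset, empty, or not an integer >= min.
function parseCount(raw: string | undefined, fallback: number, min: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n >= min ? n : fallback;
}

// The environment is passed in explicitly so the function stays pure and testable.
function readBenchConfig(env: Record<string, string | undefined>): SigilBenchConfig {
  return {
    runs: parseCount(env["SIGIL_BENCH_RUNS"], 11, 1),
    warmup: parseCount(env["SIGIL_BENCH_WARMUP"], 2, 0),
  };
}
```

A harness would presumably call this as `readBenchConfig(process.env)`. Silently falling back on invalid values is a choice of this sketch; a stricter harness might reject them instead.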

Fixtures

Three sizes, all live under tests/bench/fixtures/:

| Fixture | LOC | Source bytes | Shape |
| --- | --- | --- | --- |
| small.si | 51 | 942 | Hand-written; exercises @struct, @type, @match, @if, @loop, @fn |
| medium.si | 400 | 7,874 | Generated; 50 @fns with varied bodies |
| large.si | 3,382 | 62,139 | Generated; 500 @fns |

medium.si and large.si are generated by scripts/bench-gen.ts — re-run that script if you change the generator shape; the resulting files are committed so CI sees a stable corpus.

Headline baseline (0.1)

Recorded on the CI reference platform — Linux x86_64, Bun 1.3.x. Eleven runs, two warmups, median reported.

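| Fixture | Stage | LOC | Source B | WAT B | Median ms |
| --- | --- | --- | --- | --- | --- |
| small | parse | 51 | 942 | | 3.9 |
| small | compile | 51 | 942 | 19,888 | 110.4 |
| medium | parse | 400 | 7,874 | | 35.2 |
| medium | compile | 400 | 7,874 | 30,775 | 156.0 |
| large | parse | 3,382 | 62,139 | | 1024 |
| large | compile | 3,382 | 62,139 | 109,136 | 1164 |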
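
The "eleven runs, two warmups, median reported" procedure can be sketched roughly as follows. This is an illustration under assumptions, not the harness itself: `medianOf`, `measureMedianMs`, and the injectable `now` clock are hypothetical names, and `Date.now` is used only to keep the sketch portable (a real harness would likely use a higher-resolution clock).

```typescript
// Sketch only: function and parameter names are illustrative.
function medianOf(samples: number[]): number {
  if (samples.length === 0) throw new Error("median of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort on a copy
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function measureMedianMs(
  task: () => void,
  runs: number = 11,
  warmup: number = 2,
  now: () => number = () => Date.now(), // assumption: millisecond clock
): number {
  // Warmup iterations are executed but never timed.
  for (let i = 0; i < warmup; i++) task();

  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    task();
    samples.push(now() - start);
  }
  return medianOf(samples);
}
```

Reporting the median rather than the mean keeps a single slow run on a noisy machine from moving the headline number.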

Notes on shape:

  • The small compile baseline is dominated by strata initialisation (loading src/strata/*.si and building the elaborator registry). That cost is paid once per compile() call; parse() doesn't pay it. This is the "small program" floor — you cannot do a real Silicon compile faster than this on a cold call.
  • medium compile shows the per-LOC cost: ~0.4 ms / LOC at this size.
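  • The hand-written recursive-descent parser is the long pole on large: parse takes 1024 ms of the 1164 ms compile (about 88%). Parse throughput drops from roughly 220 KiB/s on medium to roughly 60 KiB/s on large, so parse cost grows faster than linearly with source size. Everything after parse (strata initialisation, elaboration, and backend lowering) accounts for the remaining ~140 ms and stays below 200 ms even on the 3,382-LOC fixture.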
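
The ratios behind these notes can be re-derived from the headline table with plain arithmetic. The sketch below copies the table's figures into a small array; `FixtureRow` and `deriveRatios` are hypothetical names introduced for illustration only.

```typescript
// Figures copied from the headline table; everything derived is plain arithmetic.
interface FixtureRow {
  fixture: string;
  loc: number;
  bytes: number;
  parseMs: number;
  compileMs: number;
}

const headlineRows: FixtureRow[] = [
  { fixture: "small", loc: 51, bytes: 942, parseMs: 3.9, compileMs: 110.4 },
  { fixture: "medium", loc: 400, bytes: 7874, parseMs: 35.2, compileMs: 156.0 },
  { fixture: "large", loc: 3382, bytes: 62139, parseMs: 1024, compileMs: 1164 },
];

function deriveRatios(r: FixtureRow) {
  return {
    fixture: r.fixture,
    compileMsPerLoc: r.compileMs / r.loc,                 // whole-compile cost per line
    parseKiBPerSec: r.bytes / 1024 / (r.parseMs / 1000),  // parser throughput
    parseShare: r.parseMs / r.compileMs,                  // fraction of compile spent parsing
    postParseMs: r.compileMs - r.parseMs,                 // everything after the parser
  };
}

for (const row of headlineRows) console.log(deriveRatios(row));
```

On the table's figures, medium comes out at about 0.39 ms per LOC, and parse accounts for about 88% of large compile, leaving about 140 ms for everything after the parser.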

Binary size

| Platform | sgl binary size (release) |
| --- | --- |
| linux-x86_64 | TBD |
| linux-aarch64 | TBD |
| macos-aarch64 | TBD |
| macos-x86_64 | TBD |

These numbers will be populated by the release workflow (.github/workflows/release.yml) the first time v1.0.0 is tagged. Until then, the local build (bun run build:sigilc) is the reference:

bun run build:sigilc
ls -la dist/sigilc

On Linux x86_64 the local-built sgl lands around 60 MB (it embeds the Bun runtime). Smaller builds are a v1.x story (the obvious lever is shipping a Bun bytecode bundle instead of a full single-file executable; ADR pending if there's demand).

Test-suite wall clock

The test suite is the canonical regression test; its own wall-clock is a perf signal.

time bun test                  # the full suite

At 0.1 the full bun test run takes roughly a few hundred times the large compile median; the property and fuzz suites dominate. The headline fast suite (bun test src, excluding tests/) runs in seconds and is what CI gates on for every PR.

CI signal

The bench job runs on every PR and posts a comment. It is warn-only by default: the build fails only on a >2× regression on a headline metric (small.compile.median, medium.compile.median, large.compile.median, or any WAT B column). Smaller regressions are surfaced in the comment for human judgement, because bench noise on shared CI runners is real and tight thresholds give false positives.

Current implementation: scripts/bench.ts --json writes a snapshot; the workflow at .github/workflows/bench.yml compares against the last-recorded baseline on main and emits the warning/failure accordingly.
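
The gate described above can be sketched as a small comparison function. This is an illustration under assumptions, not the workflow's actual logic: the flat `BenchSnapshot` shape (metric name to value) is hypothetical and is not the real `--json` schema, and treating any slowdown below the hard threshold as a warning is a choice of this sketch.

```typescript
// Hypothetical snapshot shape: metric name -> value (ms or bytes).
// NOT the real scripts/bench.ts JSON schema.
type BenchSnapshot = Record<string, number>;
type BenchVerdict = "ok" | "warn" | "fail";

const FAIL_RATIO = 2.0; // the ">2x" hard threshold described in the text

function judgeMetric(baseline: number, current: number): BenchVerdict {
  if (baseline <= 0) return "warn"; // no usable baseline; leave it to a human
  const ratio = current / baseline;
  if (ratio > FAIL_RATIO) return "fail";
  // Assumption: any slowdown at all is worth a comment. A real gate would
  // probably add a noise floor before warning.
  return ratio > 1 ? "warn" : "ok";
}

function compareSnapshots(
  baseline: BenchSnapshot,
  current: BenchSnapshot,
): Record<string, BenchVerdict> {
  const out: Record<string, BenchVerdict> = {};
  for (const metric of Object.keys(baseline)) {
    const cur = current[metric];
    // A metric missing from the new snapshot is flagged rather than ignored.
    out[metric] = cur === undefined ? "warn" : judgeMetric(baseline[metric], cur);
  }
  return out;
}
```

The workflow would then fail the build if any verdict is "fail" and otherwise post the "warn" entries as a comment. Note that a ratio of exactly 2 does not fail, matching the ">2×" wording.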

Methodology notes

  • No JIT warm-up gaming. The two warmup iterations exist to let Bun's JavaScriptCore JIT reach steady state on hot paths; they are then discarded.
  • Source bytes vs LOC are both reported because LOC is the user's mental model and bytes is what the parser actually pays for.
  • parse stage isolates the hand-written parser from elaboration / typecheck / lowering, so a regression in either layer surfaces in the right column.
  • Deterministic fixtures. bench-gen.ts produces the same output bit-for-bit on every run. A change in the generated fixture content (intentional or otherwise) is itself a bench-result change.
  • No native-compile bench yet. The QBE backend is in place (Phase 9), but the bench harness measures the WAT pipeline only at 0.1. Adding compile --native is a later bench extension; the current numbers are the WAT compile floor.
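
One way to notice fixture drift, as the determinism note implies, is to fingerprint the generated content and compare it against a recorded value. The sketch below uses 32-bit FNV-1a purely as an example hash; nothing in the source says the project does this, and `fnv1a32` and `fixtureDrifted` are hypothetical names.

```typescript
// Sketch: 32-bit FNV-1a over the string's UTF-16 code units.
// An illustrative fingerprint, not a cryptographic hash and not the project's method.
function fnv1a32(text: string): string {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 (2^24 + 2^8 + 2^7 + 2^4 + 2 + 1)
    // using shifts, then reduce to an unsigned 32-bit value.
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8); // fixed-width hex
}

// True when freshly generated content no longer matches the recorded fingerprint.
function fixtureDrifted(recordedFingerprint: string, content: string): boolean {
  return fnv1a32(content) !== recordedFingerprint;
}
```

A check like this would flag an unintended generator change before it shows up as an unexplained shift in the bench numbers.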

Change log

| Date | Change |
| --- | --- |
| 2026-05-28 | Initial document: 0.1 perf baseline (10b-5). |