Why this exists
#112 (FastLED examples CI benchmark) landed iteration 3 and measured a warm-cache run that produced zero cross-run compile speedup despite `actions/cache` correctly restoring 304 MB of fbuild + project state. Every one of the 83 FastLED examples re-compiled at the same rate as cold (~1.5s per example × 82 ≈ 124s of avoidable work).

Root cause is not one bug but a cluster of cache-key and cache-scope bugs, each of which independently breaks cross-run reuse; among them, the zccache store is not on the `actions/cache` path list yet (bench: add zccache store to bench/fastled-examples actions/cache path list #151). Fix all of them and the cross-run warm regime moves from "caches 304 MB and saves 9 seconds" to "caches ~500 MB and saves ~2 minutes."
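To make the failure mode concrete, here is a minimal, hypothetical sketch (not fbuild's or zccache's actual code; `DefaultHasher` stands in for whatever real hash is used) of why a per-TU key that bakes in absolute paths can never hit across runners, while a key built from the watch-root-relative path plus file contents survives a workspace-root move:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

/// Broken variant: key includes the absolute source path. Two runners
/// checking out the same tree under different roots (/home/runner/work
/// vs /tmp/ws) get different keys, so the restored cache never hits.
fn path_keyed(abs_source: &Path, contents: &str) -> u64 {
    let mut h = DefaultHasher::new();
    abs_source.hash(&mut h);
    contents.hash(&mut h);
    h.finish()
}

/// Stable variant: key includes the path *relative to the watch root*
/// plus the file contents, so the key survives a workspace-root move.
fn content_keyed(root: &Path, abs_source: &Path, contents: &str) -> u64 {
    let rel = abs_source.strip_prefix(root).unwrap_or(abs_source);
    let mut h = DefaultHasher::new();
    rel.hash(&mut h);
    contents.hash(&mut h);
    h.finish()
}

fn main() {
    let src = "void setup() {}";
    // Same file, two different workspace roots: stable key matches...
    let a = content_keyed(Path::new("/home/runner/work"),
                          Path::new("/home/runner/work/examples/Blink.ino"), src);
    let b = content_keyed(Path::new("/tmp/ws"),
                          Path::new("/tmp/ws/examples/Blink.ino"), src);
    assert_eq!(a, b);
    // ...while the absolute-path key diverges and forces a recompile.
    let c = path_keyed(Path::new("/home/runner/work/examples/Blink.ino"), src);
    let d = path_keyed(Path::new("/tmp/ws/examples/Blink.ino"), src);
    assert_ne!(c, d);
    println!("ok");
}
```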
Evidence
Benchmark on `bench/fastled-examples`, board=uno, fbuild 2.1.21, Ubuntu 24.04:

The 9s saved on example #1 is the `~/.fbuild` toolchain-materialization cache paying off. Per-TU reuse across runs: zero.

Within a single run, the fbuild daemon is already warm after example #1 (examples 2–83 run at ~1.5s each). That "warm-within-run" rate is what we want to reproduce cross-run. Nothing in this meta is about speeding up that rate further — we just want to stop throwing it away between runs.
Sub-issues
fbuild code changes
- … `FileStamp`. (filed, TDD plan included)
- `compiler.rs:303` (toolchain path in rebuild signature) and `build_fingerprint/mod.rs:106,127,236` (absolute watch-set paths in fingerprint). Fix by hashing compiler identity and paths relative to the watch root. (filed, TDD plan included)

Setup / distribution

- daemon/setup: expose zccache store location so consumer workflows can actions/cache it #149 — the `setup` composite action must expose zccache's store location (`zccache-store-path` step output + `ZCCACHE_DIR` env var) so consumer workflows can feed it to `actions/cache`. (filed)
- build: audit zccache per-TU cache keys for cross-runner stability #150 — audit zccache's own per-TU cache keys for cross-runner stability; if it bakes absolute source / include / output paths from the wrapped command, file upstream and/or normalize at the `wrap_args` call site. (filed, research-first)

Benchmark + validation

- bench: add zccache store to bench/fastled-examples actions/cache path list #151 — add `zccache-store-path` to the `bench/fastled-examples` workflow's `actions/cache` path list. Bump cache version, re-run iter4 cold + warm. (filed, blocked by daemon/setup: expose zccache store location so consumer workflows can actions/cache it #149)

Documentation

- `docs/CI_CACHE.md`: directories to cache, key composition, invalidation pattern, expected numbers from [META] Fastest possible FastLED examples CI rebuild — profile + benchmark #112 final iter. (filed, sequenced last)
Ordering / critical path

The biggest single lever is #146. On its own it should recover most of the ~124s currently spent recompiling unchanged TUs (assuming zccache's own keying is content-based, which #150 will confirm). #148 protects against regressions when the runner image or workspace path shifts. #149 + #151 turn the warm number from "cheap per-TU reuse because daemon is warm within-run" into "zero compile work because the prior objects are on disk already."
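As a sketch of how #149 + #151 compose, the wiring might look like the YAML below. All names here (the output name, store path, and cache key) are illustrative assumptions, not the real setup action's interface: the composite action surfaces the store location as a step output plus env var, and the benchmark workflow feeds that output into its `actions/cache` path list.

```yaml
# Hypothetical sketch only: output name, store path, and cache key are
# illustrative assumptions, not the actual setup action's interface.

# (a) setup composite action (#149): surface the zccache store location.
outputs:
  zccache-store-path:
    description: Where zccache keeps its per-TU object store
    value: ${{ steps.locate.outputs.dir }}
runs:
  using: composite
  steps:
    - id: locate
      shell: bash
      run: |
        echo "dir=$HOME/.cache/zccache" >> "$GITHUB_OUTPUT"
        echo "ZCCACHE_DIR=$HOME/.cache/zccache" >> "$GITHUB_ENV"

# (b) consumer workflow (#151), shown commented to keep this one document:
# steps:
#   - id: setup
#     uses: ./.github/actions/setup        # the composite action above
#   - uses: actions/cache@v4
#     with:
#       path: |
#         ~/.fbuild
#         ${{ steps.setup.outputs.zccache-store-path }}
#       key: bench-fastled-${{ runner.os }}-${{ hashFiles('**/*.lock') }}
```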
Definition of done
On the `bench/fastled-examples` benchmark (board=uno, examples=all, FastLED master, fbuild ≥ release-containing-#146-and-#148):

1. Same-runner warm regime: a warm run after a cold run on the same runner image drops the compile phase from ~142s to < 10s (only the first example pays a measurable cost; examples 2–83 finish sub-second each).
2. `docs/CI_CACHE.md` in tree describing the full recipe.

Until (1)+(2) hold, #112 cannot measurably improve beyond its current iter3 numbers, because the compile phase dominates total wall-clock and nothing else we could optimize (parallel, venv cache, uv sync) adds up to the 124s of avoidable recompile.
Non-goals
- Parallel builds (`--parallel`, concurrent per-example dispatch). Orthogonal to cache resilience; tracked on [META] Fastest possible FastLED examples CI rebuild — profile + benchmark #112 directly as a separate iteration lever. Parallelism hides per-TU inefficiency but does not fix it.
- The `~/.fbuild` cache, which works correctly.
- `.venv` caching (23s `uv sync` phase). Orthogonal; tracked on [META] Fastest possible FastLED examples CI rebuild — profile + benchmark #112.

Related

The `content_hash` field should wire through those paths cleanly.
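If the `content_hash` wiring lands, the stamp comparison could look roughly like this hypothetical sketch (the `Stamp` type and `unchanged` helper are stand-ins, not fbuild's actual `FileStamp` API, and `DefaultHasher` stands in for the real hash): metadata serves as the cheap fast path, and the content hash catches the CI case where `actions/cache` restores identical bytes with fresh mtimes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a content-aware file stamp: size + mtime for
/// the cheap fast path, plus a content hash for the CI path where mtimes
/// are not preserved across a cache restore.
#[derive(Debug, Clone)]
struct Stamp {
    len: u64,
    mtime_ns: u128,
    content_hash: u64,
}

fn hash_contents(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// "Unchanged" if metadata matches (fast path), or, failing that, if the
/// content hash matches (slow path that survives mtime churn on restore).
fn unchanged(old: &Stamp, new: &Stamp) -> bool {
    (old.len == new.len && old.mtime_ns == new.mtime_ns)
        || old.content_hash == new.content_hash
}

fn main() {
    let src = b"#include <FastLED.h>";
    let before = Stamp { len: src.len() as u64, mtime_ns: 1_000, content_hash: hash_contents(src) };
    // Same bytes restored from cache: new mtime, identical contents.
    let after = Stamp { len: src.len() as u64, mtime_ns: 2_000, content_hash: hash_contents(src) };
    assert!(unchanged(&before, &after)); // no rebuild despite mtime churn
    // A real edit changes the contents, so a rebuild is still triggered.
    let edited = Stamp { len: 21, mtime_ns: 3_000, content_hash: hash_contents(b"#include <FastLED.h>\n") };
    assert!(!unchanged(&before, &edited));
    println!("ok");
}
```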