[META] Fastest possible FastLED examples CI rebuild — profile + benchmark #112

@zackees

Goal

Drive the wall-clock time to compile the full FastLED examples/ set on CI as low as possible, for two distinct regimes:

  1. Cold — first time a runner sees this cache key. No fbuild cache, no toolchain on disk, no pip cache. This is what every new board / new OS / first PR after a cache bust pays.
  2. Warm (group rebuild) — same graph re-built with a populated fbuild cache. This is the common path on master after #2319 landed in the FastLED repo.

We want both numbers low, and we want to understand where the time goes — not just trust that it feels fast.

Non-goals

  • Not trying to speed up a single example compile. The unit under test is the whole example list for one board.
  • Not building hardware-specific fast paths. Improvements must come from fbuild / cache / CI orchestration, not from dropping examples.

Approach

A dedicated orphan branch in this repo — bench/fastled-examples — containing only:

  • .github/workflows/benchmark.yml — a single self-contained workflow (workflow_dispatch) that:
    • installs fbuild via the local setup action,
    • clones FastLED at a pinned SHA,
    • compiles all examples for a chosen board (uno by default, overridable),
    • prints per-phase wall-clock timings to the job summary,
    • uploads raw timing data as an artifact.
  • README.md documenting the branch purpose.
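The "compile all examples" step could be driven by a loop like the one below. This is a minimal sketch, not the actual workflow: it assumes each example is a directory containing at least one `.ino` file, and `list_examples` and `compile_all` are hypothetical helper names. The compile command is passed in by the caller, so no fbuild flags are assumed.

```shell
#!/usr/bin/env bash
# Sketch only: assumes each example is a directory holding at least one .ino file.

# Print every example directory under the given root, one per line, sorted.
list_examples() {
  find "$1" -type f -name '*.ino' -exec dirname {} \; | sort -u
}

# Run the given command once per example directory, continuing past failures.
# The compile command is supplied by the caller; nothing about fbuild's CLI is assumed.
compile_all() {
  local root="$1"; shift
  local list dir failed=0 total=0
  list=$(list_examples "$root")
  # Feed the list on fd 3 so the compile command can still use stdin.
  while IFS= read -r dir <&3; do
    [ -n "$dir" ] || continue
    total=$((total + 1))
    "$@" "$dir" || failed=$((failed + 1))
  done 3<<EOF
$list
EOF
  echo "compiled ${total} examples, ${failed} failed" >&2
  # Succeed only if every example compiled.
  [ "$failed" -eq 0 ]
}
```

Continuing past a failed example is a design choice made here so that the measured time always covers the whole list; a fail-fast variant would make runs with different failure points incomparable.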

Each change gets two invocations, one per regime:

  • Run 1 with cache-version bumped / fresh key → cold.
  • Run 2 immediately after → warm.
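To illustrate how bumping a cache version forces the cold path: `actions/cache` restores on an exact `key` match and otherwise falls back to prefix matches from `restore-keys`. The key layout below is a hypothetical sketch (the `fbuild-v…` prefix, the field order, and the `cache_key` helper are assumptions, not the workflow's actual naming).

```shell
#!/usr/bin/env bash
# Hypothetical key layout: the version comes first so that bumping it changes
# the key prefix, and no restore-keys prefix from an older version can match.
cache_key() {
  local version="$1" os="$2" board="$3"
  printf 'fbuild-v%s-%s-%s\n' "$version" "$os" "$board"
}

# Example: the same OS and board under two versions yield unrelated keys.
cache_key 1 Linux uno
cache_key 2 Linux uno
```

For a run to be truly cold, the new key must not have been saved by any earlier run, and any `restore-keys` entry must include the version so it cannot match a cache saved under an older one.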

Phases we want to measure

  • actions/setup-python
  • pip install fbuild
  • actions/cache restore (fbuild cache)
  • git clone FastLED (+ LFS if relevant)
  • fbuild daemon startup
  • toolchain download / materialization (first compile)
  • per-example compile time (or at least aggregate compile phase)
  • actions/cache save
  • job teardown

Each phase is wrapped in its own ::group:: block and timed as the difference between two date +%s.%N readings, so we can eyeball it in the logs and parse it from the artifacts.
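A minimal sketch of such a wrapper, under stated assumptions: `time_phase` and the `TIMINGS_FILE` name are hypothetical, the record format (name, seconds, exit status, tab-separated) is invented for illustration, and `date +%s.%N` relies on GNU date as found on Linux runners.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap one phase in a log group and record its wall-clock time.
# TIMINGS_FILE is an assumed name for the raw timing artifact (tab-separated).
TIMINGS_FILE="${TIMINGS_FILE:-timings.tsv}"

time_phase() {
  local name="$1"; shift
  local start end elapsed status=0
  echo "::group::${name}"
  start=$(date +%s.%N)
  "$@" || status=$?
  end=$(date +%s.%N)
  echo "::endgroup::"
  # The shell has no floating-point arithmetic, so compute the delta with awk.
  elapsed=$(awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f", e - s }')
  # One record per phase: name, elapsed seconds, exit status.
  printf '%s\t%s\t%s\n' "$name" "$elapsed" "$status" >> "$TIMINGS_FILE"
  # GITHUB_STEP_SUMMARY is set by GitHub Actions; skip the summary row elsewhere.
  if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
    printf '| %s | %s |\n' "$name" "$elapsed" >> "$GITHUB_STEP_SUMMARY"
  fi
  return "$status"
}
```

A step would then call it as, for example, `time_phase "pip install fbuild" pip install fbuild`. The helper returns the wrapped command's exit status, so a failing phase still fails the step after its timing has been recorded.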

Tracking

  • Land orphan branch bench/fastled-examples with initial workflow + README
  • First cold run — record baseline timings here
  • First warm run — record baseline timings here
  • Identify largest phase; open sub-issue + attempt optimization
  • Iterate until cold and warm both stop improving
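Finding the largest phase can be automated from the raw timing artifact. The sketch below assumes a tab-separated file whose first two columns are phase name and elapsed seconds; that format and the `rank_phases` name are assumptions for illustration, not something the workflow is known to produce.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rank phases by elapsed time, largest first.
# Input: tab-separated lines, column 1 = phase name, column 2 = seconds.
# Output: name, seconds, and share of the total in percent (tab-separated).
rank_phases() {
  local file="$1"
  awk -F '\t' '
    { name[NR] = $1; secs[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++) {
        pct = (total > 0) ? 100 * secs[i] / total : 0
        printf "%s\t%.3f\t%.1f\n", name[i], secs[i], pct
      }
    }' "$file" | sort -t "$(printf '\t')" -k2,2nr
}
```

Running `rank_phases timings.tsv | head -n 1` would print the phase to target in the next sub-issue, along with how much of the total it accounts for.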

Baselines and per-iteration results will be posted as comments on this issue.

Dimensions to vary later

  • Board: uno, esp32dev, esp32s3, teensy41 (different toolchain sizes)
  • OS: ubuntu-latest vs ubuntu-24.04 vs self-hosted
  • Parallelism: --parallel vs --no-parallel
  • fbuild daemon: cold-spawn vs pre-spawned
  • pip: uv pip vs pip vs pre-built wheel cache
