1 change: 1 addition & 0 deletions docs/INDEX.md
@@ -31,6 +31,7 @@ A grep-friendly FAQ that maps common questions to the file that answers them. Bo
| What's safe to cache in GitHub Actions? | [CI_CACHING.md](CI_CACHING.md#what-the-cache-holds) |
| Why does warm-pass build take ~30 s per sketch? (#91) | [PERF_WARM_BUILD.md](PERF_WARM_BUILD.md) |
| What does `FBUILD_PERF_LOG=1` do? | [PERF_WARM_BUILD.md](PERF_WARM_BUILD.md#instrumentation) |
| Does warm rebuild + deploy + monitor land under 4 s? (#114) | [PERF_WARM_DEPLOY.md](PERF_WARM_DEPLOY.md) |
| How fast is `soldr` when building fbuild itself? | [SOLDR_BUILD_PERF.md](SOLDR_BUILD_PERF.md) |
| Where do end-to-end perf benchmarks live (FastLED matrix, P-01)? | [../bench/fastled-examples/README.md](../bench/fastled-examples/README.md) |
| What architecture docs should I read for a given crate? | [CLAUDE.md](CLAUDE.md) |
86 changes: 86 additions & 0 deletions docs/PERF_WARM_DEPLOY.md
@@ -0,0 +1,86 @@
# Warm-Deploy Loop Results (FastLED/fbuild#114)

First-pass measurement of the warm **rebuild + deploy + monitor reconnect**
path on real ESP32-S3 hardware, compared with the 4 000 ms end-to-end budget
defined in #114.

## TL;DR

- **5/5 consecutive iterations** of the full `deploy + monitor reattach +
first-byte-from-device` loop land at **3.585–3.697 s**, all inside the
4 000 ms budget with 303–415 ms of slack.
- Deploy-only warm path (`fbuild deploy` without `--monitor`) is **~2.6 s**
steady-state (~3.3 s on the first call after daemon spawn).
- Acceptance criterion from #114 (*three consecutive in-budget iterations*)
is met; loop spec (`LOOP.md`) is retired.

## Methodology

- Host: Windows 10 Pro, x86_64-pc-windows-msvc.
- Binary: `target/x86_64-pc-windows-msvc/release/fbuild.exe` built from
`feat/114-warm-deploy-loop-first-pass`.
- Project: `tests/platform/esp32s3/` (Arduino, `ARDUINO_USB_CDC_ON_BOOT=1`,
blinking LED + `Serial.println("Hello from ESP32-S3!")` once per 1 s loop).
- Device: ESP32-S3-DevKitC-1, native USB-CDC on COM13 (`303a:1001`).
- Daemon: in-process via CLI auto-spawn, persistent across iterations.
- Pre-condition: cold deploy (full flash, 3m 16s) brings the device to the
exact image the warm path then verify-skips against.

## Results

### Deploy-only warm path (no monitor)

`fbuild deploy -e esp32s3 -p COM13`

| iter | wall-clock | server-side outcome |
|---:|---:|---|
| 1 | 3.257 s | `verify skipped, device already matched` (incl. daemon warm-up) |
| 2 | 2.642 s | `verify skipped, device already matched` |
| 3 | 2.640 s | `verify skipped, device already matched` |
| 4 | 2.615 s | `verify skipped, device already matched` |
| 5 | 2.568 s | `verify skipped, device already matched` |
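
The wall-clock column above can be reproduced with a small timing harness. The following is a minimal sketch, not the harness used for this measurement (the source does not say how the timings were taken); `time_command` is a hypothetical helper name, and it assumes `fbuild` is on `PATH` when given the deploy command.

```python
import subprocess
import time


def time_command(cmd, iterations=5):
    """Run `cmd` `iterations` times and return wall-clock seconds per run.

    Hypothetical helper for illustration; not part of fbuild.
    """
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit, so a
        # failed deploy is never recorded as a fast one.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_command(["fbuild", "deploy", "-e", "esp32s3", "-p", "COM13"])` would time the command from the table; the port and environment name are host-specific. Because the daemon persists across iterations, the first sample includes daemon warm-up and is best reported separately, as the table does.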

### Full loop (T1 + T2 + T3 + TTFB)

`fbuild deploy -e esp32s3 -p COM13 --monitor --halt-on-success "Hello from ESP32-S3" --timeout 5`

| iter | wall-clock | budget | margin |
|---:|---:|---:|---:|
| 1 | 3.587 s | 4.000 s | +413 ms |
| 2 | 3.590 s | 4.000 s | +410 ms |
| 3 | 3.697 s | 4.000 s | +303 ms |
| 4 | 3.585 s | 4.000 s | +415 ms |
| 5 | 3.605 s | 4.000 s | +395 ms |

Mean 3.613 s, max 3.697 s. The ~1 s gap between the deploy-only path and
the full loop is dominated by the test sketch's `delay(500)` × 2 loop
period: time to first byte (TTFB) is set by when the sketch next calls
`Serial.println`, not by the deploy/reconnect path. A sketch emitting at
boot would shave that ~1 s.
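
The summary figures can be re-derived from the two tables. This is a small sketch for checking the arithmetic; the names `summarize`, `full_loop_s`, and `deploy_only_s` are illustrative, and treating deploy-only iterations 2–5 as "steady-state" is an assumption based on the daemon warm-up note on iteration 1.

```python
from statistics import mean

BUDGET_S = 4.000

# Wall-clock samples copied from the tables above (seconds).
deploy_only_s = [3.257, 2.642, 2.640, 2.615, 2.568]
full_loop_s = [3.587, 3.590, 3.697, 3.585, 3.605]


def summarize(samples, budget):
    """Return mean, max, and margin range for a list of wall-clock samples."""
    margins_ms = [round((budget - s) * 1000) for s in samples]
    return {
        "mean_s": round(mean(samples), 3),
        "max_s": max(samples),
        "min_margin_ms": min(margins_ms),
        "max_margin_ms": max(margins_ms),
    }


summary = summarize(full_loop_s, BUDGET_S)

# Gap between the full loop and the steady-state deploy-only path,
# skipping deploy-only iteration 1 (assumed to include daemon warm-up).
gap_s = round(mean(full_loop_s) - mean(deploy_only_s[1:]), 2)

print(summary)
print(f"full loop minus deploy-only: {gap_s} s")
```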

## What got us here

Landed before this measurement:

- **#116** — `FBUILD_TRUST_DEVICE_HASH=1` opt-in trust-skip on verify-flash
(server cost: ~1.5 s → ~50 ms).
- **#118** — `ImageHashMemo` (skip SHA-256 of unchanged firmware) +
`DeviceManager::refresh_devices_if_stale(2s)` (~50 ms → ~1–2 ms server-side).
- **#120** — `DaemonWatchSetCache` removes the `fp-watches-collect` walk on
back-to-back warm builds.
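
To illustrate the idea behind skipping the SHA-256 of unchanged firmware, here is a minimal memoization sketch. It is not the `ImageHashMemo` implementation from #118: the class name `HashMemo` is hypothetical, and keying the cache on file size and modification time is an assumption, since the source does not say how the real memo detects changes.

```python
import hashlib
import os


class HashMemo:
    """Hypothetical stand-in for the idea behind ImageHashMemo."""

    def __init__(self):
        # path -> ((size, mtime_ns), hex digest)
        self._cache = {}
        self.misses = 0  # number of times the file was actually hashed

    def sha256(self, path):
        st = os.stat(path)
        key = (st.st_size, st.st_mtime_ns)  # assumed change detector
        cached = self._cache.get(path)
        if cached is not None and cached[0] == key:
            return cached[1]  # unchanged file: skip re-hashing
        self.misses += 1
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        self._cache[path] = (key, digest)
        return digest
```

On a warm deploy the firmware image is unchanged between calls, so a lookup of this kind replaces a full read and hash with a single `stat`. The staleness-gated device refresh in the same item follows a similar pattern, with elapsed time as the key instead of file metadata.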

The full loop runs cleanly inside the 4 s envelope without any phase
breaching its individual budget; no follow-up perf issues are filed against
#114.
Comment on lines +72 to +73
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Line 73: avoid accidental ATX heading syntax

#114. at start-of-line is parsed as a heading marker and triggers MD018. Keep it inline text (for example: issue #114.) so markdownlint stays clean.

Suggested fix
-The full loop runs cleanly inside the 4 s envelope without any phase
-breaching its individual budget; no follow-up perf issues are filed against
-#114.
+The full loop runs cleanly inside the 4 s envelope without any phase
+breaching its individual budget; no follow-up perf issues are filed against
+issue `#114`.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 73-73: No space after hash on atx style heading

(MD018, no-missing-space-atx)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/PERF_WARM_DEPLOY.md` around lines 72 - 73, The line begins with "#114."
which Markdown parses as an ATX heading; change that occurrence to inline text
such as "issue `#114`." (or wrap it in backticks `#114`) so it no longer starts
the line and markdownlint MD018 is not triggered—update the exact token "#114."
in the affected paragraph to "issue `#114`." to fix the linting.


## Closing the loop

Acceptance criteria from #114:

- [x] LOOP.md committed (#113), and later untracked (#115); local-only
scratch, removed.
- [x] First pass of the loop against real ESP32-S3 hardware to confirm or
tune budgets — this document.
- [x] Follow-up issues for any phase that consistently breaches its
budget — none; all five iterations green.
- [x] T1–T3 land inside the total budget for three consecutive iterations —
five consecutive, lowest margin 303 ms.
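
The "three consecutive in-budget iterations" criterion can be checked mechanically against the full-loop samples. This is a sketch only; `longest_in_budget_streak` is a hypothetical helper, and treating a sample exactly equal to the budget as in-budget is an assumption not stated in #114.

```python
def longest_in_budget_streak(samples_s, budget_s):
    """Return the length of the longest run of consecutive samples <= budget."""
    best = current = 0
    for sample in samples_s:
        # Any over-budget sample resets the running streak.
        current = current + 1 if sample <= budget_s else 0
        best = max(best, current)
    return best


# Full-loop wall-clock samples from the results table (seconds).
full_loop_s = [3.587, 3.590, 3.697, 3.585, 3.605]
streak = longest_in_budget_streak(full_loop_s, 4.000)
print(f"longest in-budget streak: {streak}, criterion met: {streak >= 3}")
```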