v1.1 — OLED UI overhaul + HG_PIN_DAP XIP-cache-contention fix #4

Merged
prat96 merged 6 commits into main from fix/dap-xip-contention on Jun 22, 2026

@prat96 prat96 commented Jun 22, 2026

Cuts v1.1 (ver 1.1.0, tag v1.1): the OLED UI overhaul (cat + Spectre) plus the HG_PIN_DAP fix it turned out to need, plus DAP-health telemetry.

The story

The v1.1 render loop (blit engine + ghost compositing, ~8 blits/frame) churns the 16 KB XIP instruction cache enough to evict the flash-resident CMSIS-DAP framing path → a 0-stall retryable-desync regression (~1.4–3.0% vs v1.0's ~0.2%, proven by interleaved A/B against the shipped v1.0 image). Priority can't help — it schedules CPU, not the shared cache. Fix: HG_PIN_DAP pins the 7 DAP/USB transaction objects to SRAM (residency only — no priority change, no upstream edit), ON by default. Pinned image soaks 0/500.

What's in this PR

  • build(dap): flip HG_PIN_DAP default OFF→ON (CMakeLists + build_fork.sh).
  • The pin linker script (memmap_hackagotchi_pin.ld), dap_health --wrap witness, and dap_xfers/dap_idle_ms telemetry.
  • docs(release): v1.1 release notes / CHANGELOG / release-readiness §0; corrected the 3 falsified cache-thrash claims in the design doc.
  • Playbook/skill updates codifying the "XIP cache is priority-blind" finding + the A/B-vs-shipped method.

Verification (artifact-decoded, not source-trusted)

  • analyze.sh PASS; ver = 1.1.0; nm confirms all 7 DAP/USB objects in SRAM (0x2000xxxx).
  • .uf2 sha256 b0826090…d683 · .elf 56c5c975…5bec.
  • Pure-pin 0/500 (Gate 1 PASS, 0 stalls) = the clean reference.
  • Honest caveat: the combined 1000-cycle run's own strict Gate-1 verdict was FAIL (fails=2, 0.2%, 0 stalls) on a non-idle host; we ship on the A/B + the 0/500, with an idle-host 1000-cycle re-run as the one open item. See docs/release-readiness.md §0.
  • Release claims adversarially re-verified before tagging.
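
The pass/fail arithmetic behind those soak numbers can be sketched as below. This is a hypothetical helper for illustration, not the logic of gate1_soak.sh; the function name and the `max_fail_rate` parameter are assumptions. The strict bar is taken to be 0 fails and 0 stalls, as the commit messages in this PR describe it.

```python
def soak_verdict(cycles, fails, stalls, max_fail_rate=0.0):
    """Return (fail_rate, passed) for one soak run.

    Hypothetical helper: a run passes only with zero stalls AND a
    retryable-failure rate at or below max_fail_rate (0.0 = strict bar).
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    fail_rate = fails / cycles
    passed = stalls == 0 and fail_rate <= max_fail_rate
    return fail_rate, passed


# Numbers quoted in this PR, both judged against the strict bar.
pure_pin = soak_verdict(cycles=500, fails=0, stalls=0)
combined = soak_verdict(cycles=1000, fails=2, stalls=0)
print(f"pure-pin: rate={pure_pin[0]:.1%} pass={pure_pin[1]}")
print(f"combined: rate={combined[0]:.1%} pass={combined[1]}")
```

Keeping stalls and the retryable rate as separate inputs mirrors the point of the caveat: a run can have 0 stalls and still fail the strict bar on rate alone.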

After merge: tag v1.1 + GitHub release (.uf2 + .elf + NOTICE + LICENSE).

🤖 Generated with Claude Code

prat96 added 4 commits June 22, 2026 13:06
…[GATE-PENDING]

Candidate fix for the v1.1 retryable CMSIS-DAP framing-desync regression that the
interleaved A/B soak proved on the bench (v1.0 ~0.2% vs v1.1 ~1.4-3.0% retryable,
0 stalls both, same bench/host/cable — only the firmware image differing).

Root cause (root-cause workflow + ELF artifact diff): v1.1's heavier +0 OLED render
loop (new 1242 B ssd1306_blit, +804 B screen_home, ghost + status bar = 8 blits/frame
vs 0) churns a larger flash-instruction working set through the shared 16 KB XIP cache
every frame, evicting the flash-resident CMSIS-DAP response-framing path. The extra
QSPI cache-refill latency lands inside the USB-IN response window -> wrong-command-ID /
short-transfer / IN-timeout desyncs. 0-stall because the USB device ISR is RAM-resident
and keeps acking the bus; it is added LATENCY, not a lock. Task PRIORITY cannot help —
the XIP cache + QSPI bus are hardware shared regardless of which task runs (this
falsifies the design-doc claim in docs/hackagotchiUI_upgrade_v1.1.md that the UI
"cache footprint can't threaten DAP timing").

Fix: take the DAP path out of the contended cache entirely. HG_PIN_DAP=ON selects a
custom linker script (memmap_hackagotchi_pin.ld = verbatim SDK rp2040 memmap_default.ld
with the 7 DAP/USB-vendor transaction objects added to the flash .text EXCLUDE_FILE),
so their .text runs from SRAM (via the .data section, copied at boot). Pinned objects:
DAP.c sw_dp_pio.c probe.c tusb_edpt_handler.c usbd.c vendor_device.c dcd_rp2040.c —
the complete USB-vendor -> DAP -> SWD-PIO call tree incl. dap_thread. NOT a switch to
copy_to_ram: XIP-default stays for everything else.
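
A toy model may help show why the eviction is priority-blind and why pinning sidesteps it. Only the 16 KB cache size comes from the text above. The 2-way associativity, 8-byte lines, LRU replacement, addresses, and region sizes are all assumptions chosen for illustration; check the RP2040 datasheet for the real cache geometry. The model has no notion of task priority at all, which is the point: the cache state depends only on which addresses were fetched.

```python
from collections import OrderedDict

CACHE_BYTES = 16 * 1024   # from the PR text
LINE_BYTES = 8            # assumption
WAYS = 2                  # assumption
SETS = CACHE_BYTES // (LINE_BYTES * WAYS)

# Hypothetical flash addresses and sizes, for illustration only.
DAP_BASE, DAP_BYTES = 0x10000000, 4 * 1024
RENDER_BASE = 0x10010000


class XipCache:
    """Set-associative cache with LRU replacement (toy model)."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(SETS)]

    def fetch(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = addr // LINE_BYTES
        ways = self.sets[line % SETS]
        if line in ways:
            ways.move_to_end(line)      # mark as most recently used
            return True
        if len(ways) >= WAYS:
            ways.popitem(last=False)    # evict the least recently used line
        ways[line] = True
        return False


def run_region(cache, base, size):
    """Fetch every line of a code region once; return the miss count."""
    return sum(not cache.fetch(a) for a in range(base, base + size, LINE_BYTES))


def dap_misses_per_frame(render_bytes, dap_in_flash, frames=5):
    """Steady-state DAP-path misses per frame (the cold first frame is skipped)."""
    cache = XipCache()
    total = 0
    for frame in range(frames + 1):
        run_region(cache, RENDER_BASE, render_bytes)   # render loop runs
        if dap_in_flash:                               # pinned code never touches the cache
            misses = run_region(cache, DAP_BASE, DAP_BYTES)
            if frame > 0:
                total += misses
    return total // frames


print(dap_misses_per_frame(4 * 1024, dap_in_flash=True))    # small render working set
print(dap_misses_per_frame(32 * 1024, dap_in_flash=True))   # large render working set
print(dap_misses_per_frame(32 * 1024, dap_in_flash=False))  # DAP path pinned to SRAM
```

In this model the small render working set coexists with the DAP path (0 steady-state misses), the large one evicts every DAP line each frame (512 misses for the assumed 4 KB path), and the pinned case has nothing in the cache to evict.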

RESIDENCY CHANGE ONLY — no FreeRTOS priority change; nothing new runs at/above the DAP
path. Cost ~18.7 KB of the +139 KB free-SRAM headroom (RAM use ~101 KB / 256 KB).

Desk-verified: builds clean; analyze.sh PASS (0 source-file changes); nm confirms all
9 hot functions moved 0x10xx -> 0x20xx; HG_PIN_DAP=OFF is byte-identical to stock v1.1
(uf2 sha 4ada43a7). HARDWARE SOAK PENDING — bar before claiming the fix:
  ./tests/gates/gate1_soak.sh 500    # pinned <=0.4% / 0 stalls (vs v1.1 1.4-3.0%)

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>
…PENDING]

A firmware-side WITNESS to the R1 "0 DAP transfer stalls" invariant: a monotonic
dap_xfers count + dap_idle_ms (time since the last DAP command), so every soak
can cross-check that transfers advanced by the expected amount and the probe was
live throughout. A soak whose counter never moves is a silent pass.

Wiring avoids an upstream source shadow: -Wl,--wrap=DAP_ExecuteCommand routes the
(stable CMSIS-DAP) execute call through __wrap_DAP_ExecuteCommand in the new owned
src/dap_health.c, which does the real work then records a non-blocking witness
(counter ++ and one time_us_32 read). Re-diff surface on a debugprobe bump (cf.
backlog #8 v2.3.1 spike) stays zero — no tusb_edpt_handler.c copy to maintain.

- src/dap_health.{c,h}: counters + getters; single-writer (the DAP task) / lock-
  free readers (32-bit aligned word reads are atomic on Cortex-M0+; no lock on the
  DAP path).
- cdc1_control.c: surface dap_xfers + dap_idle_ms in write_status (buffer 256->320
  so the longer line can never silently truncate).
- host/hackagotchi_ctl.py: print a "probe dap_xfers=… dap_idle_ms=…" line.
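
The witness pattern described above can be modeled on the host as follows. This is a Python sketch of the idea only: the firmware does this in C through the linker's --wrap mechanism, and the class, method names, and injectable clock here are hypothetical. `time.monotonic_ns` stands in for the SDK's `time_us_32`, and the 32-bit masking imitates that counter's wraparound.

```python
import time

MASK32 = 0xFFFFFFFF


class DapHealth:
    """Host-side model of the dap_xfers / dap_idle_ms witness (hypothetical)."""

    def __init__(self, clock_us=lambda: time.monotonic_ns() // 1000):
        self._clock_us = clock_us
        self.dap_xfers = 0
        self._last_us = clock_us() & MASK32

    def wrap(self, real_execute):
        """Return a wrapper that does the real work, then records the witness."""
        def wrapped(request):
            response = real_execute(request)               # real work first
            self.dap_xfers = (self.dap_xfers + 1) & MASK32 # monotonic transfer count
            self._last_us = self._clock_us() & MASK32      # one clock read, no blocking
            return response
        return wrapped

    def dap_idle_ms(self):
        """Milliseconds since the last command, safe across 32-bit wraparound."""
        now = self._clock_us() & MASK32
        return ((now - self._last_us) & MASK32) // 1000


health = DapHealth()
execute = health.wrap(lambda request: request[::-1])  # stand-in for the real call
execute(b"\x00\x01")
print(health.dap_xfers, health.dap_idle_ms())
```

Doing the real work before touching the counters keeps the witness off the response-critical part of the call, which matches the "does the real work then records" ordering in the commit message.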

RUNS AT DAP PRIORITY: the wrapper IS the DAP execute call. It is non-blocking, but
this is a DAP-PATH CHANGE and MUST be re-gated before merge:
  ADVERSARIAL: build, flash, then on the bench:
    ./tests/gates/gate1_soak.sh 1000           # bar: 0 fails + 0 stalls
    .venv/bin/python tests/m2/coexist_soak.py 300   # 0 stalls AND unchanged retryable rate
    # cross-check: {"q":"status"} dap_xfers advances by ~the soak's transfer count
Desk-verified only so far: builds clean, analyze.sh PASS (dap_health.c +
cdc1_control.c 0/0), and the disassembly confirms dap_thread -> __wrap -> real.
On a BRANCH deliberately — do not merge to main until the soak is green.
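
The counter cross-check from the list above could be automated roughly like this. The regular expression assumes the host prints the two fields as decimal integers in the "dap_xfers=… dap_idle_ms=…" order mentioned in this commit; the function names, the sample lines, and the 5% tolerance are hypothetical.

```python
import re

PROBE_RE = re.compile(r"dap_xfers=(\d+)\s+dap_idle_ms=(\d+)")


def parse_probe_line(line):
    """Extract (dap_xfers, dap_idle_ms) from a status line (assumed format)."""
    match = PROBE_RE.search(line)
    if match is None:
        raise ValueError(f"no DAP-health fields in: {line!r}")
    return int(match.group(1)), int(match.group(2))


def witness_ok(before_line, after_line, expected_xfers, tolerance=0.05):
    """True if the counter advanced by roughly the soak's transfer count."""
    before, _ = parse_probe_line(before_line)
    after, _ = parse_probe_line(after_line)
    advanced = (after - before) & 0xFFFFFFFF   # tolerate 32-bit wraparound
    if advanced == 0:
        return False                           # counter never moved: silent pass
    return abs(advanced - expected_xfers) <= tolerance * expected_xfers


# Hypothetical status lines captured before and after a soak.
before = "probe dap_xfers=1200 dap_idle_ms=4031"
after = "probe dap_xfers=51190 dap_idle_ms=12"
print(witness_ok(before, after, expected_xfers=50000))
```

Treating a zero advance as an outright failure, rather than as a large deviation, keeps the silent-pass case explicit in the result.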

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>
(cherry picked from commit 5183048bb260d59b0ab1b23763856b0c2827db87)
… image)

When HG_PIN_DAP and the dap-health telemetry are both built, the --wrap routes every
DAP_ExecuteCommand through __wrap_DAP_ExecuteCommand (src/dap_health.c). Add dap_health.c.o
to the pin EXCLUDE_FILE list so the wrapper runs from SRAM too — otherwise it reintroduces
one small flash-resident hop on the otherwise-pinned DAP path. No-op when dap_health.c.o is
absent (EXCLUDE_FILE of a missing object matches nothing). Verified: combined image builds,
analyze.sh PASS, nm shows __wrap_DAP_ExecuteCommand + DAP_ProcessCommand + tud_task_ext in SRAM.
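
The nm residency check can be scripted along these lines. This is a sketch, not the project's analyze.sh: it assumes the default three-column nm output (address, type, name) and uses the RP2040 address map, where on-chip SRAM occupies 0x20000000 up to 0x20042000 and XIP flash starts at 0x10000000. The sample addresses are made up; only the symbol names come from this PR.

```python
SRAM_START = 0x20000000
SRAM_END = 0x20042000   # 264 KB of on-chip SRAM on RP2040


def parse_nm(text):
    """Map symbol name -> address from default nm output."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:            # undefined symbols carry no address; skip them
            addr, _kind, name = fields
            symbols[name] = int(addr, 16)
    return symbols


def not_in_sram(symbols, required):
    """Return the required symbols that are missing or resident outside SRAM."""
    bad = []
    for name in required:
        addr = symbols.get(name)
        if addr is None or not (SRAM_START <= addr < SRAM_END):
            bad.append(name)
    return bad


# Hypothetical nm excerpt; the addresses are illustrative only.
SAMPLE_NM = """\
20000a10 T __wrap_DAP_ExecuteCommand
20000b58 T DAP_ProcessCommand
20001f04 T tud_task_ext
10003c20 T ssd1306_blit
"""

pinned = ["__wrap_DAP_ExecuteCommand", "DAP_ProcessCommand", "tud_task_ext"]
print(not_in_sram(parse_nm(SAMPLE_NM), pinned))
```

Reporting a missing symbol as a failure, instead of ignoring it, guards against a renamed or inlined function quietly dropping out of the check.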

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>

The v1.1 DAP regression taught two transferable lessons; write them into the
day-to-day contract and the bring-up playbook so the next render-heavy change
doesn't relearn them:

- firmware-conventions.md §2: the XIP cache is PRIORITY-BLIND. Priority schedules
  CPU, not the shared 16 KB XIP cache / QSPI bus. A non-blocking +0 task (the OLED
  render loop) can evict the flash-resident DAP path -> retryable USB framing
  desyncs that stay 0-stall (pass the R1 hard bar) yet regress the strict Gate-1
  retryable rate. Mitigation: HG_PIN_DAP (pin DAP/USB objects to SRAM, residency
  only). Watch the retryable rate, not just stalls.
- mcu-bringup-playbook.md §10: "0 stalls != no regression" + the interleaved
  A/B-against-the-shipped-image method (candidate last-in-time, power-cycle each,
  sha-matched gold .uf2; a power-cycle making it WORSE falsifies dirty-bench) +
  the dap_xfers transfer-counter cross-check against the silent pass.

(Companion user skills run-hil-gate/firmware-gate updated in parallel.)

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>
@prat96 prat96 requested a review from Dav1108 June 22, 2026 11:44
@prat96 prat96 self-assigned this Jun 22, 2026
@prat96 prat96 added the enhancement label Jun 22, 2026
prat96 added 2 commits June 22, 2026 14:24
The v1.1 OLED overhaul churns the 16 KB XIP instruction cache enough to evict
the flash-resident CMSIS-DAP framing path -- a 0-stall retryable-desync
regression (~1.4-3.0% vs v1.0's ~0.2%, proven by interleaved A/B). HG_PIN_DAP
routes the 7 DAP/USB transaction objects to SRAM via memmap_hackagotchi_pin.ld
so they fall out of the contended cache. Flip the default OFF->ON in both
CMakeLists.txt and build_fork.sh so a plain ./build_fork.sh ships the fixed
image; HG_PIN_DAP=OFF still reproduces the pre-fix image for an A/B soak.

DAP-path impact: residency change ONLY -- no FreeRTOS priority change, nothing
new runs at or above the DAP path, no upstream edit. ~18.8 KB of the +139 KB
XIP SRAM win. Pinned image soaks 0/500 (Gate 1 PASS). Verified on the artifact:
nm shows all 7 objects at 0x2000xxxx; analyze.sh PASS.

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>
…ct falsified cache claims

- RELEASE_NOTES_v1.1.md: rewrite with a feature table, the honest
  "regression we caught and fixed" story, an upgrade-from-v1.0 path, and a
  verified-evidence block.
- CHANGELOG.md [1.1]: add the HG_PIN_DAP XIP-contention fix + DAP-health
  telemetry section; drop the "candidate/pending" line.
- release-readiness.md: retitle to v1.1, new release identity (ver 1.1.0,
  tag v1.1, artifact shas, HG_PIN_DAP=ON), and a new Section 0 documenting the
  finding -> A/B -> fix -> soak chain.
- hackagotchiUI_upgrade_v1.1.md: correct the three falsified cache-thrash
  claims (annotated [Corrected after v1.1 HIL], not silently rewritten) -- the
  light v1.0 UI never threatened the XIP cache, the heavy v1.1 UI does.

Honest reporting: the combined 1000-cycle soak's strict Gate-1 verdict was FAIL
(fails=2, 0.2%, 0 stalls) on a non-idle host; the docs state that verbatim and
ship on the strength of the A/B (~7-15x regression eliminated) + the pure-pin
0/500 PASS, not on that run. An idle-host 1000-cycle re-run is the one open item.
Claims adversarially re-verified against the artifact + recorded soak outputs.

Signed-off-by: Pratheek Balakrishna <pratheekb96@gmail.com>
@prat96 prat96 changed the title from "Fix/dap xip contention" to "v1.1 — OLED UI overhaul + HG_PIN_DAP XIP-cache-contention fix" Jun 22, 2026
@prat96 prat96 merged commit 29fddde into main Jun 22, 2026
2 checks passed
@prat96 prat96 deleted the fix/dap-xip-contention branch June 22, 2026 12:36