From 96707a26b33c3ccfff0e0caa78272ced2bc1200b Mon Sep 17 00:00:00 2001
From: Abmcar
Date: Fri, 15 May 2026 15:27:57 +0800
Subject: [PATCH 1/3] build(deps): swap boost mirror + honor FETCHCONTENT_BASE_DIR env
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two narrow, locally-verifiable changes to reduce FetchContent flakiness:

1. boost URL: sourceforge → archives.boost.io (URL_HASH unchanged).
   Addresses the most flaky cold-start dep: sourceforge has been a
   recurring 504 source (e.g. PR #499 run 25897803413 fell over on a
   FetchContent download). Boost.org's official archive returns HTTP 200
   with the canonical 1.67.0 tarball (content-length matches; URL_HASH
   validates byte-identity). Drop DOWNLOAD_NAME since the new URL
   already ends in the canonical filename.

2. Top-level CMakeLists.txt honors FETCHCONTENT_BASE_DIR from env when
   no `-D` is passed on the cmake command line. Mirrors the pattern
   already at .worktrees/feat-gas-check-placement/CMakeLists.txt:8-15.
   Lets local developers share a populated cache across worktrees and
   clean builds via
   `export FETCHCONTENT_BASE_DIR=~/.cache/cmake-fetchcontent`.

docs/start.md adds a "Build dependency cache" section documenting the
local convention and the SGX local-cache caveat (asmjit gets patched
under SGX; mixing patched/unpatched in one cache breaks).

A complementary follow-up to pre-bake the deps into
`dtvmdev1/dtvm-dev-x64:main` was scoped out of this PR because Docker is
not available in the implementation environment for end-to-end
verification. Design preserved in the change-doc's "Deferred" section
for a future PR.

Validation (local):
- curl HTTP 200 + correct content-length on archives.boost.io URL.
- env-hook test: FETCHCONTENT_BASE_DIR=/tmp/fc cmake -S . -B build
  populates /tmp/fc/-src/ (note: BASE_DIR layout has no _deps/ segment;
  that's only the default).
- Populated boost-src/boost/version.hpp contains BOOST_VERSION 106700.
- tools/format.sh check pass.
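The precedence that change 2 describes can be modelled in a few lines of
shell. This is only an illustrative sketch, not part of the patch:
`resolve_fetchcontent_base_dir` is a hypothetical helper name, and it
simplifies the real CMake hook in two ways that are noted in the
comments.

```sh
# Hypothetical helper (not in the repo): models which base directory
# FetchContent ends up using under the hook described above.
#   $1 = value passed as -DFETCHCONTENT_BASE_DIR=... (empty if none)
#   $2 = build directory passed to `cmake -B`
# Simplifications: an empty env var is treated as unset here, whereas
# CMake's `DEFINED ENV{...}` is true for an empty value; and a value
# already stored in CMakeCache.txt from an earlier configure is not
# modelled (in CMake it also takes priority over the environment).
resolve_fetchcontent_base_dir() {
  cli_value="$1"
  build_dir="$2"
  if [ -n "$cli_value" ]; then
    # An explicit -D on the command line always wins.
    printf '%s\n' "$cli_value"
  elif [ -n "${FETCHCONTENT_BASE_DIR:-}" ]; then
    # Otherwise the exported environment variable is honored.
    printf '%s\n' "$FETCHCONTENT_BASE_DIR"
  else
    # Otherwise FetchContent's documented default applies.
    printf '%s\n' "$build_dir/_deps"
  fi
}
```

For example, with `FETCHCONTENT_BASE_DIR=/tmp/fc` exported,
`resolve_fetchcontent_base_dir "" build` prints `/tmp/fc`, while passing
a non-empty first argument prints that argument instead.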
Refs: docs/changes/2026-05-15-fetchcontent-cache/README.md Co-Authored-By: Claude Opus 4.7 (1M context) --- CMakeLists.txt | 12 ++ .../2026-05-15-fetchcontent-cache/README.md | 169 ++++++++++++++++++ docs/start.md | 31 ++++ third_party/AddDeps.cmake | 7 +- 4 files changed, 217 insertions(+), 2 deletions(-) create mode 100644 docs/changes/2026-05-15-fetchcontent-cache/README.md diff --git a/CMakeLists.txt b/CMakeLists.txt index b1dafa146..7859e66b2 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -5,6 +5,18 @@ cmake_minimum_required(VERSION 3.16) project(ZetaEngine LANGUAGES C CXX ASM) +# Honor FETCHCONTENT_BASE_DIR from environment when not set on cmd line. Enables +# a shared FetchContent cache across worktrees, CI jobs, and the dev container +# image (which sets FETCHCONTENT_BASE_DIR=/opt/cmake-fetchcontent). +if(DEFINED ENV{FETCHCONTENT_BASE_DIR} AND NOT DEFINED + CACHE{FETCHCONTENT_BASE_DIR} +) + set(FETCHCONTENT_BASE_DIR + "$ENV{FETCHCONTENT_BASE_DIR}" + CACHE PATH "Shared FetchContent cache (from env)" + ) +endif() + set(CMAKE_CXX_STANDARD 17) set(CMAKE_CXX_STANDARD_REQUIRED ON) diff --git a/docs/changes/2026-05-15-fetchcontent-cache/README.md b/docs/changes/2026-05-15-fetchcontent-cache/README.md new file mode 100644 index 000000000..de30b709b --- /dev/null +++ b/docs/changes/2026-05-15-fetchcontent-cache/README.md @@ -0,0 +1,169 @@ +# Change: Local FetchContent cache convention + boost mirror swap + +- **Status**: Proposed +- **Date**: 2026-05-15 +- **Tier**: Light +- **Branch**: ci/fetchcontent-cache + +## Overview + +Two narrow, locally-verifiable changes: + +1. Top-level `CMakeLists.txt` honors `FETCHCONTENT_BASE_DIR` from the + environment so developers can share a single populated cache + (`~/.cache/cmake-fetchcontent`) across worktrees and clean builds. +2. Swap boost's sourceforge URL for the official `archives.boost.io` + mirror. URL_HASH unchanged. 
+ +A complementary follow-up to pre-bake the deps into +`dtvmdev1/dtvm-dev-x64:main` (image-side cache for CI) was scoped out of +this PR because Docker is not available in the implementation +environment for end-to-end verification. The image-bake design is +preserved as a follow-up note (see "Deferred"). + +## Motivation + +`third_party/AddDeps.cmake` declares 8 FetchContent entries +(spdlog/asmjit/CLI11/intx/boost/rapidjson + conditional googletest/yaml-cpp). +Clean builds and CI jobs re-download all of them every time. + +- CI flakiness example: PR #499 run `25897803413` died on a rapidjson + download. +- boost is hosted on SourceForge — historically the most flaky of the + 8 hosts. +- Local clean builds (new worktrees, new contributors, fresh container + starts) pay the full download cost on every reset. + +Boost URL swap fixes the single biggest cold-start failure mode in CI +even without an image-side cache. The env-hook lets local developers +opt into a shared cache trivially (one-line export in shell rc). + +## Impact + +### Affected modules + +- `CMakeLists.txt` — env-var hook (between line 6 and line 8) +- `third_party/AddDeps.cmake` — boost URL swap; drop `DOWNLOAD_NAME`; + light reference comment to `docs/start.md` +- `docs/start.md` — "Build dependency cache" section + +### Affected contracts + +None. Build-system download semantics only. + +### Compatibility + +Fully backwards-compatible: +- No env var set → identical to today. +- New boost URL has same `URL_HASH`, so byte-identical content. +- CI workflows unchanged. + +## Implementation + +### 1. CMakeLists.txt env hook + +Insert between line 6 (`project(...)`) and line 8 +(`set(CMAKE_CXX_STANDARD 17)`): + +```cmake +# Honor FETCHCONTENT_BASE_DIR from environment when not set on cmd line. 
+if(DEFINED ENV{FETCHCONTENT_BASE_DIR} AND NOT DEFINED CACHE{FETCHCONTENT_BASE_DIR}) + set(FETCHCONTENT_BASE_DIR "$ENV{FETCHCONTENT_BASE_DIR}" CACHE PATH + "Shared FetchContent cache (from env)") +endif() +``` + +Mirrors the pattern already present at +`.worktrees/feat-gas-check-placement/CMakeLists.txt:8-15`. Active only +when env is set AND `-D` not passed; otherwise no-op. + +### 2. Boost URL swap + +In `third_party/AddDeps.cmake`, replace: +```cmake +URL https://sourceforge.net/projects/boost/files/boost/1.67.0/boost_1_67_0.tar.bz2/download +DOWNLOAD_NAME boost_1_67_0.tar.bz2 +``` +with: +```cmake +URL https://archives.boost.io/release/1.67.0/source/boost_1_67_0.tar.bz2 +``` + +- `URL_HASH SHA256=2684c97...adba` unchanged. +- `DOWNLOAD_NAME` dropped because the new URL ends in the canonical + filename (per `cmake --help-module ExternalProject` `DOWNLOAD_NAME` + default). +- `archives.boost.io` is the official Boost archive (cross-referenced + on `boost.org/users/history/version_1_67_0.html`). + +### 3. Documentation + +`docs/start.md` adds a "Build dependency cache" section explaining the +env-var convention and the SGX local-cache caveat (asmjit gets patched +under SGX; mixing patched/unpatched in one cache breaks). + +## Validation (local) + +Verified in implementation worktree: + +- **boost URL reachable + correct bytes:** + `curl -sIL https://archives.boost.io/release/1.67.0/source/boost_1_67_0.tar.bz2` + → HTTP 200, content-length 87336566 (matches canonical 1.67.0 tarball), + last-modified 2018-04-11. +- **env-hook fires when expected:** With `FETCHCONTENT_BASE_DIR=/tmp/fc` + exported and `command cmake -S . -B build-test`, FetchContent + populates at `/tmp/fc/-src/` (note: directly under BASE_DIR, + not under `_deps/` — that's only the default segment). +- **Boost content after URL swap:** populated `boost-src/boost/version.hpp` + contains `#define BOOST_VERSION 106700` (== 1.67.0). URL_HASH + validates so byte-identity is enforced. 
+- **Format check pass.** + +## Deferred (image-side cache; future PR) + +The original design also included a `docker/bake/CMakeLists.txt` driver +plus a Dockerfile bake stage that would pre-populate +`/opt/cmake-fetchcontent` inside `dtvmdev1/dtvm-dev-x64:main`. This +would eliminate per-CI-run downloads entirely for the EVM and WASM +container jobs. + +Reason for deferral: Docker is not available in this implementation +environment, so the bake stage cannot be verified end-to-end (image +build, COPY semantics, layer cache behavior). Shipping unverified +Docker code carries an asymmetric cost — a broken image republish would +affect every CI run. + +The design is preserved for a follow-up PR: +- Standalone `docker/bake/CMakeLists.txt` driver with `LANGUAGES NONE` + and inlined `FetchContent_Declare` blocks (NO `include(AddDeps.cmake)` + to avoid re-loading FetchContent which would clobber the override). +- Dockerfile bake stage placed after `foundryup` (line 47), sets + `ENV FETCHCONTENT_BASE_DIR=/opt/cmake-fetchcontent`. +- Sync burden: bake CMakeLists must be updated when `AddDeps.cmake` + changes. + +## Risks + +- **Boost URL transition single-point-of-failure.** The PR's first CI + run is the first hit on `archives.boost.io` from DTVM CI. If 504, + PR fails. Mitigation: manually re-run; if persistent, revert URL + swap in a single-line hotfix. +- **`archives.boost.io` outage.** Mitigation: official Boost archive + (verified). If down long-term, switch to a self-hosted GH release. + +## Acceptance Criteria + +1. PR CI passes every job that passes on `main` today. +2. PR CI's first hit on `archives.boost.io` succeeds (no 504); boost + downloads correctly with unchanged `URL_HASH`. +3. Local: `FETCHCONTENT_BASE_DIR=/tmp/fc cmake -S . -B build` produces + `/tmp/fc/-src/` (env hook honored). +4. `docs/start.md` has the cache section. + +## Out of scope + +- Pre-bake into dev image (deferred — needs Docker verification). 
+- `actions/cache` for CI (not needed without image-bake). +- GIT_TAG → commit SHA pinning (separable). +- Hunter / submodules / CPM migration (heavyweight). +- CMake version bump (separable). diff --git a/docs/start.md b/docs/start.md index decf1702c..46afa42fa 100644 --- a/docs/start.md +++ b/docs/start.md @@ -17,6 +17,37 @@ The fastest way to set up the compilation environment is to use a Docker image o docker pull dtvmdev1/dtvm-dev-x64:main ``` +## Build dependency cache + +DTVM uses CMake `FetchContent` to pull 8 external dependencies (declared +in `third_party/AddDeps.cmake`). On a clean build these are downloaded +fresh, which is the main source of CI / cold-build flakiness when an +upstream host is slow or returns 504. + +To share the populated sources across builds (worktrees, repeated clean +builds, multiple machines mounting the same home dir), export +`FETCHCONTENT_BASE_DIR` before invoking cmake: + +```sh +# Add to ~/.zshrc or ~/.bashrc: +export FETCHCONTENT_BASE_DIR="$HOME/.cache/cmake-fetchcontent" +mkdir -p "$FETCHCONTENT_BASE_DIR" +``` + +The top-level `CMakeLists.txt` honors this env var when no +`-DFETCHCONTENT_BASE_DIR=…` is passed on the cmake command line. After +the first successful configure, subsequent clean builds re-use the +populated sources without re-downloading. + +To opt out (use the default `build/_deps/` per-build dir): `unset +FETCHCONTENT_BASE_DIR`. + +**Note for SGX local builds**: if you build with `ZEN_ENABLE_SGX=ON`, +use a separate cache directory (e.g., +`~/.cache/cmake-fetchcontent-sgx`) — asmjit gets a `PATCH_COMMAND` +applied to its sources under SGX, and mixing patched and unpatched +sources in one cache causes silent breakage. + ## Interpreter Interpreter mode is the current default execution mode. No specific CMake parameters are needed during compilation. 
Reference compilation commands are as follows: diff --git a/third_party/AddDeps.cmake b/third_party/AddDeps.cmake index d2eb886af..b2f3dbcb7 100644 --- a/third_party/AddDeps.cmake +++ b/third_party/AddDeps.cmake @@ -1,6 +1,10 @@ # Copyright (C) 2021-2025 the DTVM authors. All Rights Reserved. # SPDX-License-Identifier: Apache-2.0 +# NOTE: Set FETCHCONTENT_BASE_DIR (env var or cmake -D) to share populated +# sources across clean builds — the top-level CMakeLists.txt honors the env +# form. See docs/start.md "Build dependency cache" for details. + set(CMAKE_POLICY_DEFAULT_CMP0077 NEW) include(FetchContent) @@ -73,8 +77,7 @@ include_directories(${intx_SOURCE_DIR}/include) FetchContent_Declare( boost - URL https://sourceforge.net/projects/boost/files/boost/1.67.0/boost_1_67_0.tar.bz2/download - DOWNLOAD_NAME boost_1_67_0.tar.bz2 + URL https://archives.boost.io/release/1.67.0/source/boost_1_67_0.tar.bz2 URL_HASH SHA256=2684c972994ee57fc5632e03bf044746f6eb45d4920c343937a465fd67a5adba ) From b0b2195d2214f70374ab8f5175ea86aa0cb8f8c6 Mon Sep 17 00:00:00 2001 From: Abmcar Date: Fri, 15 May 2026 16:05:55 +0800 Subject: [PATCH 2/3] ci: cache FetchContent deps via actions/cache across CI runs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add `actions/cache@v4` step to each container-image job in EVM (10) and WASM (4) workflows, plus workflow-level `FETCHCONTENT_BASE_DIR` env, so populated FetchContent sources persist across CI runs. First run is a miss (full downloads, ~820MB save); subsequent runs with unchanged `third_party/AddDeps.cmake` hit the cache and skip all 8 downloads. Builds on commit 96707a2 (CMakeLists env hook + boost URL swap): - The env hook (CMakeLists.txt:8-18) reads `FETCHCONTENT_BASE_DIR` from the workflow-level env when no `-D` is passed on cmake cmd line, so `.ci/run_test_suite.sh` and inline-cmake jobs both pick it up without modification. 
- boost URL swap already addresses the most flaky cold-start dep
  (sourceforge → archives.boost.io).

Cache key composition:
`${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }}`.
The `github.workflow` segment is necessary: EVM runs
`SINGLEPASS_JIT=OFF` (no asmjit), WASM runs `SINGLEPASS_JIT=ON` (needs
asmjit); sharing one key across workflows would cause WASM to
re-download asmjit every run because actions/cache@v4 skips same-key
saves (the "first writer wins" behavior). Workflow-prefixed keys avoid
this. No `restore-keys`: partial cache hits across different dep
versions can silently yield stale source.

Coverage: 14 container jobs total. The replace_all on the standard
`submodules: "true"` → `Code Format Check` boundary handled 12 of them;
two manual inserts handled the Hunter-cache job (after Hunter cache
step) and the perf-regression-check job (after its `fetch-depth: 0`
checkout `with:` clause).

Validation:
- Format check pass.
- YAML lint pass (python yaml.safe_load) on both workflows.
- Cache step count: 10 (EVM) + 4 (WASM) = 14.
- Cannot test cache behavior locally: actions/cache is GH-runner-side.
  PR CI will exercise it (AC-A first-run miss → save; AC-B re-run hit).

Spec went through Phase 0 → Phase 0.5 (both REFINE absorbed) → Phase 1
→ Phase 2 R1 (Opus PASS + Codex REVISE wording fixes absorbed). Hard cap
not hit; iter=2 of Phase 0.5 skipped because refinements were spec-text
only.
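To make the key composition above concrete, here is a rough local
approximation in shell. It is a sketch under stated assumptions:
`fc_cache_key` is a hypothetical helper, `sha256sum` is assumed to be
available (GNU coreutils), and the digest it prints is not expected to
equal what `hashFiles()` produces on a runner, since `hashFiles()`
hashes the set of per-file hashes rather than the file bytes directly.
It only illustrates how each segment changes the key.

```sh
# Hypothetical helper (not in the repo): approximates the shape of the
# cache key. The manifest digest is a stand-in for hashFiles().
#   $1 = OS label (stand-in for runner.os)
#   $2 = workflow name (stand-in for github.workflow)
#   $3 = path to the dependency manifest
fc_cache_key() {
  # sha256sum prints "<hex digest>  <path>"; keep only the digest.
  manifest_hash=$(sha256sum "$3" | cut -d' ' -f1)
  printf '%s-fc-%s-v1-%s\n' "$1" "$2" "$manifest_hash"
}
```

Called as `fc_cache_key Linux evm-workflow third_party/AddDeps.cmake`
and again with `wasm-workflow` (both workflow labels are placeholders),
the two keys differ only in the workflow segment, which is what keeps
the EVM and WASM caches separate. Editing the manifest changes the last
segment for both.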
Refs: docs/changes/2026-05-15-fetchcontent-cache/README.md Co-Authored-By: Claude Opus 4.7 (1M context) --- .github/workflows/dtvm_evm_test_x86.yml | 59 ++++ .github/workflows/dtvm_wasm_test_x86.yml | 28 ++ .../2026-05-15-fetchcontent-cache/README.md | 295 ++++++++++-------- 3 files changed, 253 insertions(+), 129 deletions(-) diff --git a/.github/workflows/dtvm_evm_test_x86.yml b/.github/workflows/dtvm_evm_test_x86.yml index b34e26a32..18118b6cf 100644 --- a/.github/workflows/dtvm_evm_test_x86.yml +++ b/.github/workflows/dtvm_evm_test_x86.yml @@ -16,6 +16,14 @@ on: permissions: contents: read +# Shared FetchContent cache root for all container jobs. The hook in +# CMakeLists.txt (commit 96707a2 lines 8-18) picks this up as the base +# dir for FetchContent populations. Each container job adds an +# `actions/cache` step keyed on `hashFiles('third_party/AddDeps.cmake')` +# to persist this dir across CI runs. +env: + FETCHCONTENT_BASE_DIR: /github/home/.fetchcontent + jobs: build_test_evm_interpreter_x86_ctest: name: Test DTVM-EVM interpreter with ctest on x86-64 @@ -27,6 +35,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -61,6 +74,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -92,6 +110,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ 
hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -134,6 +157,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -165,6 +193,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -197,6 +230,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -229,6 +267,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -259,6 +302,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Cache Hunter uses: actions/cache@v4 with: @@ -300,6 +348,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow 
}}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -342,6 +395,12 @@ jobs: submodules: "true" fetch-depth: 0 + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} + - name: Setup git safe directory run: | echo "Configuring git safe directory: ${{ github.workspace }}" diff --git a/.github/workflows/dtvm_wasm_test_x86.yml b/.github/workflows/dtvm_wasm_test_x86.yml index 4fd9c4896..778b78949 100644 --- a/.github/workflows/dtvm_wasm_test_x86.yml +++ b/.github/workflows/dtvm_wasm_test_x86.yml @@ -16,6 +16,14 @@ on: permissions: contents: read +# Shared FetchContent cache root for all container jobs. The hook in +# CMakeLists.txt (commit 96707a2 lines 8-18) picks this up as the base +# dir for FetchContent populations. Each container job adds an +# `actions/cache` step keyed on `hashFiles('third_party/AddDeps.cmake')` +# to persist this dir across CI runs. 
+env: + FETCHCONTENT_BASE_DIR: /github/home/.fetchcontent + jobs: build_test_interp_on_x86: name: Build and test DTVM interpreter on x86-64 @@ -27,6 +35,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -69,6 +82,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -111,6 +129,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check @@ -153,6 +176,11 @@ jobs: uses: actions/checkout@v3 with: submodules: "true" + - name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} - name: Code Format Check run: | ./tools/format.sh check diff --git a/docs/changes/2026-05-15-fetchcontent-cache/README.md b/docs/changes/2026-05-15-fetchcontent-cache/README.md index de30b709b..7b9a8ab29 100644 --- a/docs/changes/2026-05-15-fetchcontent-cache/README.md +++ b/docs/changes/2026-05-15-fetchcontent-cache/README.md @@ -1,169 +1,206 @@ -# Change: Local FetchContent cache convention + boost mirror swap +# Change: actions/cache for FetchContent on DTVM CI - **Status**: Proposed - **Date**: 2026-05-15 - **Tier**: Light -- **Branch**: ci/fetchcontent-cache +- 
**Branch**: ci/fetchcontent-cache (continues commit `96707a2`) ## Overview -Two narrow, locally-verifiable changes: - -1. Top-level `CMakeLists.txt` honors `FETCHCONTENT_BASE_DIR` from the - environment so developers can share a single populated cache - (`~/.cache/cmake-fetchcontent`) across worktrees and clean builds. -2. Swap boost's sourceforge URL for the official `archives.boost.io` - mirror. URL_HASH unchanged. - -A complementary follow-up to pre-bake the deps into -`dtvmdev1/dtvm-dev-x64:main` (image-side cache for CI) was scoped out of -this PR because Docker is not available in the implementation -environment for end-to-end verification. The image-bake design is -preserved as a follow-up note (see "Deferred"). +Add `actions/cache@v4` step to the two DTVM CMake-building CI workflows +(EVM + WASM) to cache the populated FetchContent sources across runs. +First run pays full download; every subsequent run with unchanged +`third_party/AddDeps.cmake` hits the cache and skips downloads +entirely. This completes the work begun in commit `96707a2` (boost URL +swap + CMakeLists env hook). ## Motivation -`third_party/AddDeps.cmake` declares 8 FetchContent entries -(spdlog/asmjit/CLI11/intx/boost/rapidjson + conditional googletest/yaml-cpp). -Clean builds and CI jobs re-download all of them every time. +Each DTVM CI run currently downloads 8 FetchContent deps from scratch +(`spdlog`, `asmjit` (WASM only), `CLI11`, `intx`, `boost`, `rapidjson`, ++ conditional `googletest`/`yaml-cpp`). Any single 504 kills the +pipeline (e.g., PR #499 run `25897803413` died on rapidjson). -- CI flakiness example: PR #499 run `25897803413` died on a rapidjson - download. -- boost is hosted on SourceForge — historically the most flaky of the - 8 hosts. -- Local clean builds (new worktrees, new contributors, fresh container - starts) pay the full download cost on every reset. 
+`actions/cache@v4` is the canonical 2025 pattern for FetchContent +caching (verified against `vowpal_wabbit` and `colmap` workflows — +both cache around `FetchContent` paths keyed on dep manifest hashes). -Boost URL swap fixes the single biggest cold-start failure mode in CI -even without an image-side cache. The env-hook lets local developers -opt into a shared cache trivially (one-line export in shell rc). +The earlier image-bake approach was investigated but Docker is +unavailable in the implementation environment for verification. +`actions/cache` does not require Docker and is verifiable directly via +PR CI runs. ## Impact ### Affected modules -- `CMakeLists.txt` — env-var hook (between line 6 and line 8) -- `third_party/AddDeps.cmake` — boost URL swap; drop `DOWNLOAD_NAME`; - light reference comment to `docs/start.md` -- `docs/start.md` — "Build dependency cache" section +- `.github/workflows/dtvm_evm_test_x86.yml` — add cache step + env to + ~10 container-image build jobs +- `.github/workflows/dtvm_wasm_test_x86.yml` — same for ~4 container + jobs +- `docs/changes/2026-05-15-fetchcontent-cache/README.md` — this doc; + drop image-bake content from prior iteration ### Affected contracts -None. Build-system download semantics only. +None. CI infrastructure only. ### Compatibility -Fully backwards-compatible: -- No env var set → identical to today. -- New boost URL has same `URL_HASH`, so byte-identical content. -- CI workflows unchanged. +Fully backwards-compatible. Cache miss falls through to current +behavior (live FetchContent download). No workflow logic changes +beyond the new cache step and job-level env. ## Implementation -### 1. CMakeLists.txt env hook - -Insert between line 6 (`project(...)`) and line 8 -(`set(CMAKE_CXX_STANDARD 17)`): +### 1. 
Cache step per workflow + +**Coverage** (14 distinct jobs total): +- **EVM (10 jobs)**: 8 `bash .ci/run_test_suite.sh` callers + 2 matrix + instances of `performance_regression_check` (which runs both a base + build via direct cmake AND a PR-HEAD build via `run_test_suite.sh`). +- **WASM (4 jobs)**: 3 `bash .ci/run_test_suite.sh` callers + + `build_test_evmabi_mock_cli_on_x86` (uses inline `cmake -S . -B build` + directly, not via `run_test_suite.sh`). + +All 14 need the cache step. The inline-cmake job in WASM and the +direct-cmake baseline build in `performance_regression_check` also +inherit `FETCHCONTENT_BASE_DIR` from the env block — no special +handling required. + +Add to each container job, between `actions/checkout` and the +build/test step: + +```yaml +- name: Cache FetchContent deps + uses: actions/cache@v4 + with: + path: /github/home/.fetchcontent + key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }} +``` -```cmake -# Honor FETCHCONTENT_BASE_DIR from environment when not set on cmd line. -if(DEFINED ENV{FETCHCONTENT_BASE_DIR} AND NOT DEFINED CACHE{FETCHCONTENT_BASE_DIR}) - set(FETCHCONTENT_BASE_DIR "$ENV{FETCHCONTENT_BASE_DIR}" CACHE PATH - "Shared FetchContent cache (from env)") -endif() +**Key composition rationale:** +- `runner.os` — standard practice (Ubuntu vs other). +- `github.workflow` — **necessary**: EVM workflow runs with + `SINGLEPASS_JIT=OFF` (no asmjit), WASM workflow includes + `SINGLEPASS_JIT=ON` (needs asmjit). Sharing one key causes a + partial-hit churn: EVM saves 7 deps → WASM restores 7, populates 1, + but `actions/cache@v4` skips same-key save (logs a warning, does + not fail the job) → WASM re-downloads asmjit every run. + Workflow-prefixed key avoids this. +- `v1` namespace — **manual escape hatch**. Bump to `v2` when + `dtvmdev1/dtvm-dev-x64:main` is rebuilt with materially different + CMake/compiler/Ninja/tar/zstd versions. 
The `:main` tag is mutable + and our key does not auto-invalidate on image bumps. +- `hashFiles('third_party/AddDeps.cmake')` — auto-invalidates on any + dep change (URL/hash/tag/new dep). +- **No `restore-keys`**. Partial cache hits across different dep + versions can yield silently-stale source (FetchContent stamps say + "populated", URL_HASH may not match the cached tarball if user + changed URL but kept hash). Cold start is the lesser evil. + +### 2. Job env + +Add to each container job's `env:` block (where the build runs): + +```yaml +env: + FETCHCONTENT_BASE_DIR: /github/home/.fetchcontent ``` -Mirrors the pattern already present at -`.worktrees/feat-gas-check-placement/CMakeLists.txt:8-15`. Active only -when env is set AND `-D` not passed; otherwise no-op. +The CMakeLists env-hook from commit `96707a2` (lines 8-18; executable +`if/set/endif` block at lines 11-18) picks up the env var when no `-D` +is passed on the cmake command line. No changes needed to +`.ci/run_test_suite.sh`. -### 2. Boost URL swap +### 3. Drop image-bake content from this change doc -In `third_party/AddDeps.cmake`, replace: -```cmake -URL https://sourceforge.net/projects/boost/files/boost/1.67.0/boost_1_67_0.tar.bz2/download -DOWNLOAD_NAME boost_1_67_0.tar.bz2 -``` -with: -```cmake -URL https://archives.boost.io/release/1.67.0/source/boost_1_67_0.tar.bz2 -``` +Previous iteration's `docs/changes/.../README.md` had a "Deferred" +section describing the image-bake design. Replace with this file +focused on actions/cache. -- `URL_HASH SHA256=2684c97...adba` unchanged. -- `DOWNLOAD_NAME` dropped because the new URL ends in the canonical - filename (per `cmake --help-module ExternalProject` `DOWNLOAD_NAME` - default). -- `archives.boost.io` is the official Boost archive (cross-referenced - on `boost.org/users/history/version_1_67_0.html`). - -### 3. 
Documentation - -`docs/start.md` adds a "Build dependency cache" section explaining the -env-var convention and the SGX local-cache caveat (asmjit gets patched -under SGX; mixing patched/unpatched in one cache breaks). - -## Validation (local) - -Verified in implementation worktree: - -- **boost URL reachable + correct bytes:** - `curl -sIL https://archives.boost.io/release/1.67.0/source/boost_1_67_0.tar.bz2` - → HTTP 200, content-length 87336566 (matches canonical 1.67.0 tarball), - last-modified 2018-04-11. -- **env-hook fires when expected:** With `FETCHCONTENT_BASE_DIR=/tmp/fc` - exported and `command cmake -S . -B build-test`, FetchContent - populates at `/tmp/fc/-src/` (note: directly under BASE_DIR, - not under `_deps/` — that's only the default segment). -- **Boost content after URL swap:** populated `boost-src/boost/version.hpp` - contains `#define BOOST_VERSION 106700` (== 1.67.0). URL_HASH - validates so byte-identity is enforced. -- **Format check pass.** - -## Deferred (image-side cache; future PR) - -The original design also included a `docker/bake/CMakeLists.txt` driver -plus a Dockerfile bake stage that would pre-populate -`/opt/cmake-fetchcontent` inside `dtvmdev1/dtvm-dev-x64:main`. This -would eliminate per-CI-run downloads entirely for the EVM and WASM -container jobs. - -Reason for deferral: Docker is not available in this implementation -environment, so the bake stage cannot be verified end-to-end (image -build, COPY semantics, layer cache behavior). Shipping unverified -Docker code carries an asymmetric cost — a broken image republish would -affect every CI run. - -The design is preserved for a follow-up PR: -- Standalone `docker/bake/CMakeLists.txt` driver with `LANGUAGES NONE` - and inlined `FetchContent_Declare` blocks (NO `include(AddDeps.cmake)` - to avoid re-loading FetchContent which would clobber the override). -- Dockerfile bake stage placed after `foundryup` (line 47), sets - `ENV FETCHCONTENT_BASE_DIR=/opt/cmake-fetchcontent`. 
-- Sync burden: bake CMakeLists must be updated when `AddDeps.cmake` - changes. +## Validation -## Risks +### Local + +- `tools/format.sh check` — pass. +- YAML lint via `python -c "import yaml; yaml.safe_load(open('.github/workflows/dtvm_evm_test_x86.yml'))"` — pass. +- Diff inspection: each modified job has cache step + env block. + +### CI (post-push) -- **Boost URL transition single-point-of-failure.** The PR's first CI - run is the first hit on `archives.boost.io` from DTVM CI. If 504, - PR fails. Mitigation: manually re-run; if persistent, revert URL - swap in a single-line hotfix. -- **`archives.boost.io` outage.** Mitigation: official Boost archive - (verified). If down long-term, switch to a self-hosted GH release. +The cache behavior is GH-runner-side; can only be observed in a real +CI run. ## Acceptance Criteria -1. PR CI passes every job that passes on `main` today. -2. PR CI's first hit on `archives.boost.io` succeeds (no 504); boost - downloads correctly with unchanged `URL_HASH`. -3. Local: `FETCHCONTENT_BASE_DIR=/tmp/fc cmake -S . -B build` produces - `/tmp/fc/-src/` (env hook honored). -4. `docs/start.md` has the cache section. +1. **AC-A: PR CI first run is cache-miss.** Workflow log shows + "Cache not found for input keys: ..." followed at end-of-job by + "Cache saved with key: ...". +2. **AC-B: PR CI re-run is cache-hit.** Re-running the same workflow + (no commit change) shows "Cache restored from key: ..." in cache + step output. Build log shows zero `^-- Downloading` lines from + FetchContent. (Note: explicit "restored from key" line is the + primary AC; absence of `-- Downloading` is corroborating.) +3. **AC-C: No regression.** All jobs that pass on `main` today pass + with cache step active. +4. **AC-D: Cache key invalidates on AddDeps change.** A no-op edit to + `third_party/AddDeps.cmake` (trailing newline) in a follow-up + commit produces a new key (visible in workflow log as different + key hash). +5. 
**AC-E: EVM and WASM caches don't interfere.** EVM cache key
+   contains `DTVM-EVM` (workflow `name:` field value, with hyphen),
+   WASM contains `DTVM-WASM`. Verify via workflow log key string.
+
+## Risks
+
+- **R1: actions/cache@v4 itself unavailable / quota exhausted.**
+  Mitigation: cache miss is non-fatal; CI falls back to live
+  FetchContent download (current behavior). No regression.
+
+- **R2: 10GB repo cache cap proximity.**
+  Per-key size ~820MB. With 10 active feature branches × 2 workflows
+  = ~16GB of potential cache load — over the cap. GitHub LRU-evicts
+  caches not accessed in 7 days, so steady-state should hover around
+  3-5 active keys (~3-4GB). Not "well under" the cap; close but
+  acceptable. Monitor via `gh cache list` if eviction thrashing
+  becomes visible.
+
+- **R3: Image churn invalidation.**
+  `dtvmdev1/dtvm-dev-x64:main` is a mutable tag. If the image is
+  rebuilt with a materially different CMake / compiler / Ninja
+  version, the cached `-build/` artifacts could mismatch.
+  Mitigation: manually bump the `v1` namespace in the cache key
+  (becomes `v2`, etc.) when the image is rebuilt with material
+  changes. Documented in this section.
+
+- **R4: Cache-key hash misses other dep-affecting files.**
+  Today only `third_party/AddDeps.cmake` controls FetchContent
+  declarations. If a future PR moves declares elsewhere or adds new
+  conditional logic in `CMakeLists.txt` flags, update the cache key.
+
+- **R5: Boost URL transition single-point-of-failure.**
+  (Carried from commit `96707a2`.) First CI run hits new boost URL
+  live; if 504, re-run. Cache then captures it for subsequent runs.

 ## Out of scope

-- Pre-bake into dev image (deferred — needs Docker verification).
-- `actions/cache` for CI (not needed without image-bake).
-- GIT_TAG → commit SHA pinning (separable).
-- Hunter / submodules / CPM migration (heavyweight).
-- CMake version bump (separable).
+- Image-baking deps into `dtvmdev1/dtvm-dev-x64:main` — deferred
+  (Docker unavailable for verification in current environment).
+- Pinning `:main` image by digest in cache key — would tighten R3
+  but adds maintenance cost; bump-`v1`-on-image-rebuild is simpler.
+- Migration to Hunter / submodules / CPM.
+- Pinning `GIT_TAG` to commit SHAs.
+
+## Provenance
+
+- Commit `96707a2` ("build(deps): swap boost mirror + honor
+  FETCHCONTENT_BASE_DIR env") already adds the env-hook + boost URL
+  prerequisites this change builds on.
+- Prior Phase 0.5 v2 round 1 reviews:
+  - `reviews/motivation-v2-1-opus.md` (cite: cache-key churn,
+    AC-B log line check)
+  - `reviews/motivation-v2-1-codex.md` (cite: image churn, 10GB cap
+    wording, canonical pattern confirmation)
+- All cited refinements absorbed into this spec; iter=2 skipped
+  because the refinements are spec-level fixes, not direction changes.

From f04e6cd70b91beed48770bd330ce7cb7f3081a73 Mon Sep 17 00:00:00 2001
From: Abmcar
Date: Fri, 15 May 2026 16:38:53 +0800
Subject: [PATCH 3/3] docs: address Copilot review on PR #508 (path/wording fixes)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- CMakeLists.txt env-hook comment: fix path ref. The comment previously
  cited `/opt/cmake-fetchcontent` (a vestige from the abandoned
  image-bake design); the workflows in this PR use
  `/github/home/.fetchcontent`, and local convention uses
  `~/.cache/cmake-fetchcontent`. Reword to describe the actual intended
  consumers.
- docs/start.md: "8 deps" → "up to 8" with explicit list of which deps
  are conditional on which `ZEN_ENABLE_*` flag, so readers can reconcile
  with `AddDeps.cmake` content.
- docs/start.md SGX caveat: clarify that no current CI job builds with
  `ZEN_ENABLE_SGX=ON`, so the workflow-level cache key (which does not
  distinguish SGX) is sound today; flag it as a contract to revisit if
  SGX is ever added to CI.
Deferred (with rationale on the PR review thread):
- Composite action for the 14 duplicated cache steps. Acknowledged
  duplication; sed-replace of a single block on a key bump (e.g.
  `v1` → `v2`) is still 1 line per workflow, while a composite action
  adds a new file + indirection. Will reconsider if the cache step
  grows beyond 4 lines.
- Adding `CMakeLists.txt` to `hashFiles(...)`. Bumping the cache key on
  every CMakeLists.txt edit (frequent and dep-unrelated) outweighs the
  rare case where a new `FetchContent_Declare` is added outside
  `AddDeps.cmake`. R4 in the change-doc already calls this out.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 CMakeLists.txt |  6 ++++--
 docs/start.md  | 17 ++++++++++++-----
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 7859e66b2..a7c24f3d6 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -6,8 +6,10 @@ cmake_minimum_required(VERSION 3.16)
 project(ZetaEngine LANGUAGES C CXX ASM)

 # Honor FETCHCONTENT_BASE_DIR from environment when not set on cmd line. Enables
-# a shared FetchContent cache across worktrees, CI jobs, and the dev container
-# image (which sets FETCHCONTENT_BASE_DIR=/opt/cmake-fetchcontent).
+# a shared FetchContent cache across worktrees, CI jobs, and local builds. CI
+# workflows export this env to `/github/home/.fetchcontent` (paired with
+# actions/cache); local developers can export `~/.cache/cmake-fetchcontent` per
+# `docs/start.md` "Build dependency cache".
 if(DEFINED ENV{FETCHCONTENT_BASE_DIR} AND NOT DEFINED
     CACHE{FETCHCONTENT_BASE_DIR}
 )
diff --git a/docs/start.md b/docs/start.md
index 46afa42fa..b7f3d5376 100644
--- a/docs/start.md
+++ b/docs/start.md
@@ -19,10 +19,13 @@ docker pull dtvmdev1/dtvm-dev-x64:main

 ## Build dependency cache

-DTVM uses CMake `FetchContent` to pull 8 external dependencies (declared
-in `third_party/AddDeps.cmake`).
On a clean build these are downloaded
-fresh, which is the main source of CI / cold-build flakiness when an
-upstream host is slow or returns 504.
+DTVM uses CMake `FetchContent` to pull up to 8 external dependencies
+declared in `third_party/AddDeps.cmake` (`CLI11`, `intx`, `boost`,
+`rapidjson` are unconditional; `spdlog` is on unless `ZEN_ENABLE_SGX=ON`;
+`asmjit` is on with `ZEN_ENABLE_SINGLEPASS_JIT=ON`; `googletest` and
+`yaml-cpp` are on with `ZEN_ENABLE_SPEC_TEST=ON`). On a clean build
+these are downloaded fresh, which is the main source of CI / cold-build
+flakiness when an upstream host is slow or returns 504.

 To share the populated sources across builds (worktrees, repeated clean
 builds, multiple machines mounting the same home dir), export
@@ -46,7 +49,11 @@ FETCHCONTENT_BASE_DIR`.
 use a separate cache directory (e.g.,
 `~/.cache/cmake-fetchcontent-sgx`) — asmjit gets a `PATCH_COMMAND`
 applied to its sources under SGX, and mixing patched and unpatched
-sources in one cache causes silent breakage.
+sources in one cache causes silent breakage. No current CI job builds
+with SGX, so the workflow-level cache (`/github/home/.fetchcontent`,
+keyed on `hashFiles('third_party/AddDeps.cmake')`) does not need to
+distinguish SGX state. Revisit the cache key composition when SGX is
+added to CI.

 ## Interpreter