diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 000000000000..69a147ffad36 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,5 @@ +build with: + +``` +nix develop -c meson compile -C build +``` diff --git a/plans/tectonix/lazy-trees.md b/plans/tectonix/lazy-trees.md new file mode 100644 index 000000000000..a380342965da --- /dev/null +++ b/plans/tectonix/lazy-trees.md @@ -0,0 +1,582 @@ +# Tectonix Lazy Trees Integration Plan + +> **Implementation Status:** This plan has been implemented with modifications. Key differences from the original plan: +> - `worldRoot` builtin was **not implemented** (the zone-based approach was deemed sufficient) +> - Zone path validation was added to ensure only exact zone roots can be accessed +> - Thread-safe caching was implemented using `std::call_once` for all lazy-init fields +> - Lazy mounting for dirty zones from checkout was added (not in original plan) +> - `prim_worldZone` became `__unsafeTectonixInternalZone` (internal API) +> - Zone names are sanitized for store path requirements (replacing invalid chars) +> - `allowPath()` is called after `mount()` to prevent resource leaks on exception +> - Git status uses `-z` flag for NUL-separated output to handle special characters +> - Debug logging was added to key functions (`getWorldTreeSha`, `getManifestContent`, etc.) +> - See actual implementation in `src/libexpr/primops/tectonix.cc` and `src/libexpr/eval.cc` + +This document outlines the plan to integrate tectonix zone access with Nix's lazy-trees infrastructure, enabling on-demand copying of zone sources to the store. + +## Background + +### Current Behavior + +When `builtins.unsafeTectonixInternalZoneSrc "//areas/tools/tec"` is called, the entire zone is immediately copied to the Nix store via `fetchToStore()`, regardless of whether the zone content is actually needed for a derivation. + +### Lazy Trees in Flakes + +With `lazy-trees = true`, flakes avoid this eager copying: + +1. 
`mountInput()` creates a random store path and mounts a `GitSourceAccessor` at that path +2. Files are read on-demand from the git ODB during evaluation +3. Only when the path is used as a derivation input does `devirtualize()` copy it to the store + +### Goal + +Apply the same lazy behavior to tectonix zones, while respecting zone boundaries and dirty zone detection. + +--- + +## Architectural Comparison: Flakes vs Tectonix + +### Flakes + +``` +FlakeRef (github:nixos/nixpkgs/abc123) + │ + ▼ +InputCache.getAccessor() + │ + ▼ +Input.getAccessor() → GitSourceAccessor (lazy) + │ + ▼ +mountInput() + │ + ├─► lazyTrees=false: fetchToStore() immediately + │ + └─► lazyTrees=true: + StorePath::random("nixpkgs") + storeFS->mount(storePath, accessor) + return virtual path + +Later, when used in derivation: + │ + ▼ +devirtualize() → fetchToStore() → real store path +``` + +**Key point:** Each flake is its own unit. The accessor is rooted at the flake, and the whole flake gets mounted at one store path. + +### Tectonix Challenge + +``` +world @ sha:abc123 +├── areas/ +│ ├── tools/ +│ │ ├── tec/ ← Zone (tree: deadbeef) +│ │ ├── dev/ ← Zone (tree: cafebabe) +│ │ └── ... +│ └── platform/ +│ └── ... +└── .meta/ + └── manifest.json + +Problem: Can't mount whole world at one path! + +Using /nix/store/xxx-world/areas/tools/tec as derivation src +would pull in the ENTIRE world when devirtualized. + +Solution: Mount each zone separately at its own store path. +``` + +### What Makes Tectonix Harder + +1. **Granularity mismatch**: Flakes = one input = one mount. World = one repo = thousands of zones. +2. **No `Input` abstraction**: Flakes have `fetchers::Input` with `getAccessor()`, caching, locking. Tectonix builtins are ad-hoc. +3. **Dirty zone complexity**: Flakes mark dirty inputs as "unlocked". Tectonix needs zone-granular dirty detection with checkout fallback. +4. **Two-mode operation**: Git ODB vs checkout. Flakes only have one source per input. 
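Point 3 above, zone-granular dirty detection, can be sketched in isolation. The implementation notes at the top mention that git status is consumed with the `-z` flag (NUL-separated records); the following self-contained sketch shows how such output could be mapped onto zone roots. The function name, the prefix-matching rule, and the rename handling are illustrative assumptions, not the actual code in `eval.cc`:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <string_view>

// Sketch: map NUL-separated `git status --porcelain -z` output onto dirty
// zones. Ordinary records look like "XY path\0"; a rename record ("R" in
// either status column) is followed by a second, bare "origpath\0" record,
// so both the source and destination zones get marked dirty.
std::set<std::string> detectDirtyZones(
    std::string_view statusOutput,
    const std::set<std::string> & zoneRoots)
{
    std::set<std::string> dirty;

    auto markZone = [&](std::string_view path) {
        // A file dirties a zone when the zone root is a path-component
        // prefix of the file's path.
        for (auto & zone : zoneRoots)
            if (path.size() > zone.size()
                && path.compare(0, zone.size(), zone) == 0
                && path[zone.size()] == '/')
                dirty.insert(zone);
    };

    std::size_t pos = 0;
    bool expectRenameSource = false;
    while (pos < statusOutput.size()) {
        std::size_t end = statusOutput.find('\0', pos);
        if (end == std::string_view::npos) end = statusOutput.size();
        std::string_view record = statusOutput.substr(pos, end - pos);
        pos = end + 1;

        if (expectRenameSource) {
            markZone(record);            // bare original path, no "XY " prefix
            expectRenameSource = false;
        } else if (record.size() > 3) {
            markZone(record.substr(3));  // strip the "XY " status prefix
            expectRenameSource = (record[0] == 'R' || record[1] == 'R');
        }
    }
    return dirty;
}
```

Because record boundaries are NUL bytes rather than newlines, file names containing spaces, quotes, or even newlines parse unambiguously, which is exactly why the implementation switched to `-z`.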
+ +### What Makes Tectonix Easier + +1. **Content-addressed by nature**: Tree SHA is the *perfect* cache key. Same tree SHA across different world commits = identical content. +2. **No resolution complexity**: No registries, no indirect references, no lock file management. +3. **Already have the accessor**: `getWorldGitAccessor()` returns a lazy `GitSourceAccessor`. +4. **Single source of truth**: One repo, one commit SHA. + +--- + +## Design + +### Core Concept: Zone Mounts by Tree SHA + +``` +builtins.worldZone "//areas/tools/tec" + │ + ▼ +getZoneStorePath(zonePath) + │ + ├─► isDirty? ─────────────────────────────┐ + │ │ │ + │ ▼ │ + │ getZoneFromCheckout() │ + │ (EXTENSION POINT: eager for now) │ + │ │ │ + │ ▼ │ + │ return store path ◄────────────────────┘ + │ + └─► !isDirty + │ + ▼ + treeSha = getWorldTreeSha(zonePath) + │ + ▼ + mountZoneByTreeSha(treeSha) + │ + ├─► cached? return cached store path + │ + └─► not cached: + accessor = repo->getAccessor(treeSha) + storePath = StorePath::random(name) + storeFS->mount(storePath, accessor) + cache[treeSha] = storePath + return storePath +``` + +### Why Tree SHA as Cache Key + +``` +World @ v1 (sha: aaa) World @ v2 (sha: bbb) +├── areas/tools/tec ├── areas/tools/tec +│ (tree: deadbeef) ─────────────│ (tree: deadbeef) ← SAME! +│ │ +├── areas/tools/dev ├── areas/tools/dev +│ (tree: cafebabe) │ (tree: 12345678) ← Changed +``` + +If `//areas/tools/tec` didn't change between commits, its tree SHA is identical. The zone cache returns the same virtual store path, and when devirtualized, the same real store path. **Natural deduplication across world revisions.** + +--- + +## Implementation + +### Phase 1: Core Infrastructure + +#### 1.1 EvalState Additions (`src/libexpr/include/nix/expr/eval.hh`) + +```cpp +// In EvalState class: + +private: + /** + * Cache tree SHA → virtual store path for lazy zone mounts. + * Thread-safe for eval-cores > 1. 
+ */ + Sync> tectonixZoneCache_; + +public: + /** + * Get a zone's store path, handling dirty detection and lazy mounting. + * + * For clean zones with lazy-trees enabled: mounts accessor lazily + * For dirty zones: currently eager-copies from checkout (extension point) + * For lazy-trees disabled: eager-copies from git + */ + StorePath getZoneStorePath(std::string_view zonePath); + +private: + /** + * Mount a zone by tree SHA, returning a (potentially virtual) store path. + * Caches by tree SHA for deduplication across world revisions. + */ + StorePath mountZoneByTreeSha(const Hash & treeSha, std::string_view zonePath); + + /** + * Get zone store path from checkout (for dirty zones). + * EXTENSION POINT: Currently always eager. Could be made lazy later. + */ + StorePath getZoneFromCheckout(std::string_view zonePath); +``` + +#### 1.2 Implementation (`src/libexpr/eval.cc`) + +```cpp +StorePath EvalState::getZoneStorePath(std::string_view zonePath) +{ + // Normalize path + std::string zone(zonePath); + if (hasPrefix(zone, "//")) + zone = zone.substr(2); + + // Check dirty status + bool isDirty = false; + if (isTectonixSourceAvailable()) { + auto & dirtyZones = getTectonixDirtyZones(); + auto it = dirtyZones.find(std::string(zonePath)); + isDirty = it != dirtyZones.end() && it->second; + } + + if (isDirty) { + // EXTENSION POINT: For now, always eager from checkout + return getZoneFromCheckout(zonePath); + } + + // Clean zone: get tree SHA + auto treeSha = getWorldTreeSha(zonePath); + + if (!settings.lazyTrees) { + // Eager mode: immediate copy from git ODB + auto repo = getWorldRepo(); + GitAccessorOptions opts{.exportIgnore = true, .smudgeLfs = false}; + auto accessor = repo->getAccessor(treeSha, opts, "zone"); + + std::string name = "zone-" + replaceStrings(zone, "/", "-"); + auto storePath = fetchToStore( + fetchSettings, *store, + SourcePath(accessor, CanonPath::root), + FetchMode::Copy, name); + + allowPath(storePath); + return storePath; + } + + // Lazy mode: 
mount by tree SHA + return mountZoneByTreeSha(treeSha, zonePath); +} + +StorePath EvalState::mountZoneByTreeSha(const Hash & treeSha, std::string_view zonePath) +{ + // Check cache first (thread-safe) + { + auto cache = tectonixZoneCache_.readLock(); + auto it = cache->find(treeSha); + if (it != cache->end()) { + debug("zone cache hit for tree %s", treeSha.gitRev()); + return it->second; + } + } + + // Not cached: create accessor and mount + auto repo = getWorldRepo(); + GitAccessorOptions opts{.exportIgnore = true, .smudgeLfs = false}; + auto accessor = repo->getAccessor(treeSha, opts, "zone"); + + // Generate name from zone path + std::string zone(zonePath); + if (hasPrefix(zone, "//")) + zone = zone.substr(2); + std::string name = "zone-" + replaceStrings(zone, "/", "-"); + + // Create virtual store path + auto storePath = StorePath::random(name); + allowPath(storePath); + + // Mount accessor at this path + storeFS->mount(CanonPath(store->printStorePath(storePath)), accessor); + + // Cache it (thread-safe) + { + auto cache = tectonixZoneCache_.lock(); + auto [it, inserted] = cache->try_emplace(treeSha, storePath); + if (!inserted) { + // Another thread beat us, use their path + return it->second; + } + } + + debug("mounted zone %s (tree %s) at %s", + zonePath, treeSha.gitRev(), store->printStorePath(storePath)); + + return storePath; +} + +StorePath EvalState::getZoneFromCheckout(std::string_view zonePath) +{ + // EXTENSION POINT: Currently always eager. + // + // To make this lazy later, we'd need to: + // 1. Create a filtered accessor over the checkout path + // 2. Compute a content key (hash of modified files? mtime-based?) + // 3. Cache and mount like mountZoneByTreeSha + // + // For now: just copy from checkout. 
+ + std::string zone(zonePath); + if (hasPrefix(zone, "//")) + zone = zone.substr(2); + + auto checkoutAccessor = getWorldCheckoutAccessor(); + if (!checkoutAccessor) + throw Error("checkout accessor not available for dirty zone '%s'", zonePath); + + auto checkoutPath = settings.tectonixCheckoutPath.get(); + auto fullPath = CanonPath(checkoutPath + "/" + zone); + + std::string name = "zone-" + replaceStrings(zone, "/", "-"); + + auto storePath = fetchToStore( + fetchSettings, *store, + SourcePath(*checkoutAccessor, fullPath), + FetchMode::Copy, name); + + allowPath(storePath); + return storePath; +} +``` + +### Phase 2: Updated Builtins (`src/libexpr/primops/tectonix.cc`) + +#### 2.1 Simplify `prim_unsafeTectonixInternalZoneSrc` + +```cpp +static void prim_unsafeTectonixInternalZoneSrc(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto zonePath = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'zonePath' argument to builtins.unsafeTectonixInternalZoneSrc"); + + auto storePath = state.getZoneStorePath(zonePath); + state.allowAndSetStorePathString(storePath, v); +} +``` + +#### 2.2 New `prim_worldZone` (flake-like interface) + +```cpp +static void prim_worldZone(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto zonePath = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'zonePath' argument to builtins.worldZone"); + + // Get tree SHA before we potentially fetch + auto treeSha = state.getWorldTreeSha(zonePath); + + // Check dirty status + bool isDirty = false; + if (state.isTectonixSourceAvailable()) { + auto & dirtyZones = state.getTectonixDirtyZones(); + auto it = dirtyZones.find(std::string(zonePath)); + isDirty = it != dirtyZones.end() && it->second; + } + + auto storePath = state.getZoneStorePath(zonePath); + auto storePathStr = state.store->printStorePath(storePath); + + // Build result attrset (like fetchTree) + auto attrs = state.buildBindings(4); + + 
attrs.alloc("outPath").mkString(storePathStr, { + NixStringContextElem::Opaque{storePath} + }); + attrs.alloc("treeSha").mkString(treeSha.gitRev(), state.mem); + attrs.alloc("zonePath").mkString(zonePath, state.mem); + attrs.alloc("dirty").mkBool(isDirty); + + v.mkAttrs(attrs); +} + +static RegisterPrimOp primop_worldZone({ + .name = "worldZone", + .args = {"zonePath"}, + .doc = R"( + Get a zone from the world repository. + + Returns an attrset with: + - outPath: Store path containing zone source (lazy with lazy-trees) + - treeSha: Git tree SHA for this zone + - zonePath: The zone path argument + - dirty: Whether the zone has uncommitted changes + + Example: `builtins.worldZone "//areas/tools/tec"` + + Requires `--tectonix-git-dir` and `--tectonix-sha` to be set. + )", + .fun = prim_worldZone, +}); +``` + +#### 2.3 New `prim_worldRoot` (read-only world access) — NOT IMPLEMENTED + +> **Note:** This builtin was not implemented. The zone-based approach (`__unsafeTectonixInternalZone`) +> was deemed sufficient for all use cases, and `worldRoot` was removed to avoid accidentally copying +> the entire world repository to the store. + +```cpp +static void prim_worldRoot(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + // Lazily mount the whole world accessor once per evaluation + auto storePath = state.getOrMountWorldRoot(); + + v.mkPath(state.rootPath( + CanonPath(state.store->printStorePath(storePath)))); +} + +static RegisterPrimOp primop_worldRoot({ + .name = "worldRoot", + .args = {}, + .doc = R"( + Get a path to the world repository root. + + This path can be used for reading files during evaluation: + + let world = builtins.worldRoot; + in import (world + "/areas/tools/tec/zone.nix") + + WARNING: Do not use this path directly as a derivation src! + That would copy the entire world to the store. Use + builtins.worldZone for derivation sources. + + Requires `--tectonix-git-dir` and `--tectonix-sha` to be set. 
+ )", + .fun = prim_worldRoot, +}); +``` + +With supporting method in `EvalState`: + +```cpp +StorePath EvalState::getOrMountWorldRoot() +{ + // Thread-safe lazy initialization + static std::once_flag mounted; + static StorePath worldStorePath; + + std::call_once(mounted, [this]() { + auto accessor = getWorldGitAccessor(); + worldStorePath = StorePath::random("world"); + allowPath(worldStorePath); + storeFS->mount( + CanonPath(store->printStorePath(worldStorePath)), + accessor); + }); + + return worldStorePath; +} +``` + +--- + +## Usage Examples + +### Before (eager) + +```nix +let + zoneSrc = builtins.unsafeTectonixInternalZoneSrc "//areas/tools/tec"; + # ^ Entire zone copied to store immediately +in +mkDerivation { + src = zoneSrc; + ... +} +``` + +### After (lazy) + +```nix +let + world = builtins.worldRoot; + + # Read-only access (no store copy during evaluation) + zoneNix = import (world + "/areas/tools/tec/zone.nix"); + manifest = builtins.fromJSON (builtins.readFile (world + "/.meta/manifest.json")); + + # For derivation src, use worldZone (zone-granular lazy copy) + tecZone = builtins.worldZone "//areas/tools/tec"; +in +mkDerivation { + src = tecZone.outPath; # Only copied when derivation is built + ... +} +``` + +--- + +## Builtin Migration Guide + +| Old Pattern | New Pattern | +|-------------|-------------| +| `__unsafeTectonixInternalZoneSrc path` | `(worldZone path).outPath` | +| `__unsafeTectonixInternalTreeSha path` then `__unsafeTectonixInternalTree sha` | `(worldZone path).outPath` | +| `__unsafeTectonixInternalFile path` | `builtins.readFile (worldRoot + path)` | +| `__unsafeTectonixInternalDir zone subpath` | `builtins.readDir (worldRoot + zone + "/" + subpath)` | + +The `__unsafeTectonixInternalTree` builtin can be retained for edge cases (fetching arbitrary tree SHAs not corresponding to zones), but becomes less central. 
+ +--- + +## Extension Point: Lazy Dirty Zones + +The `getZoneFromCheckout()` function is the clear extension point for future optimization. + +### Current Behavior + +Dirty zones are always eagerly copied from checkout: + +```cpp +StorePath EvalState::getZoneFromCheckout(std::string_view zonePath) +{ + // Always eager for now + return fetchToStore(...); +} +``` + +### Future Options + +1. **Content-hash dirty files** + - Walk checkout, hash modified files + - Use combined hash as cache key + - Complex but accurate + +2. **Overlay accessor** + - Base: git ODB accessor for zone + - Overlay: checkout accessor filtered to dirty files + - Mount the composite accessor + - Cache key: `(treeSha, set of dirty file paths)` + +3. **Mtime-based caching** + - Use checkout accessor with mtime as cache key + - Simpler but may re-copy on unrelated file touches + +The interface is clean: `getZoneStorePath()` decides dirty vs clean and delegates appropriately. The dirty path can be made lazy without changing callers. + +--- + +## Testing Plan + +1. **Lazy-trees enabled, clean zone** + - Verify virtual store path is created + - Verify no immediate copy to store + - Verify devirtualization on derivation build + +2. **Lazy-trees disabled** + - Verify immediate copy (current behavior preserved) + +3. **Dirty zones** + - Verify fallback to checkout + - Verify eager copy (for now) + +4. **Cache behavior** + - Same tree SHA returns same virtual path + - Different tree SHA returns different path + - Thread-safe with `eval-cores > 1` + +5. 
**Cross-world-revision deduplication** + - Zone unchanged between commits → same devirtualized store path + +--- + +## Summary + +| Component | Purpose | +|-----------|---------| +| `tectonixZoneCache_` | Tree SHA → virtual store path mapping | +| `tectonixCheckoutZoneCache_` | Zone path → virtual store path for checkout zones | +| `getZoneStorePath()` | Orchestrator: dirty detection → dispatch | +| `mountZoneByTreeSha()` | Lazy mount for clean zones | +| `getZoneFromCheckout()` | Lazy mount for dirty zones from checkout | +| `__unsafeTectonixInternalZone` | High-level builtin returning attrset | +| `__unsafeTectonixInternalZoneSrc` | Simple builtin returning store path string | +| ~~`worldRoot`~~ | **Not implemented** - zone-based approach deemed sufficient | + +This design: +- Integrates cleanly with existing lazy-trees infrastructure +- Uses tree SHA for natural content-addressed caching +- Supports lazy mounting for both clean zones (from git) and dirty zones (from checkout) +- Provides flake-like API consistency via `__unsafeTectonixInternalZone` diff --git a/plans/tectonix/testing.md b/plans/tectonix/testing.md new file mode 100644 index 000000000000..608b9d6ddcbe --- /dev/null +++ b/plans/tectonix/testing.md @@ -0,0 +1,763 @@ +# Tectonix Lazy-Trees Testing Plan + +This document outlines the testing strategy for the tectonix lazy-trees integration implemented in commit `36c8f88ae`. + +## Test Infrastructure Overview + +Nix uses two main testing approaches: + +1. **C++ Unit Tests** (`src/lib*-tests/*.cc`) - gtest/gmock based, for isolated component testing +2. **Functional Tests** (`tests/functional/*.sh`) - Shell scripts for end-to-end integration + +Our testing will use both approaches. + +--- + +## Phase 1: Unit Tests for Helper Functions + +**Location:** `src/libexpr-tests/tectonix.cc` (new file) + +### 1.1 Path Normalization (`normalizeZonePath`) + +Currently a static function in `eval.cc`. 
Consider exposing via a test-only header or testing indirectly through builtins. + +| Test Case | Input | Expected Output | +|-----------|-------|-----------------| +| Strip double-slash prefix | `"//areas/tools/dev"` | `"areas/tools/dev"` | +| No prefix passthrough | `"areas/tools/dev"` | `"areas/tools/dev"` | +| Root path | `"//"` | `""` | +| Single slash (edge case) | `"/areas"` | `"/areas"` | + +### 1.2 Store Path Sanitization (`sanitizeZoneNameForStore`) + +| Test Case | Input | Expected Output | +|-----------|-------|-----------------| +| Slashes to dashes | `"//areas/tools/dev"` | `"areas-tools-dev"` | +| Valid chars preserved | `"foo-bar.baz_123"` | `"foo-bar.baz_123"` | +| Invalid chars replaced | `"foo@bar#baz"` | `"foo_bar_baz"` | +| Unicode chars replaced | `"föö/bär"` | `"f__-b_r"` | +| Empty after normalization | `"//"` | `""` | + +### 1.3 Zone Path Validation (`validateZonePath`) + +| Test Case | Behavior | +|-----------|----------| +| Valid zone path in manifest | Returns without error | +| Path not in manifest | Throws `EvalError` with message about "not a zone root" | +| Subpath of zone (not exact match) | Throws error | +| Parent of zone | Throws error | + +--- + +## Phase 2: Git-Utils Unit Tests + +**Location:** `src/libfetchers-tests/git-utils.cc` (extend existing) + +### 2.1 `odbOnly` Mode + +```cpp +TEST_F(GitUtilsTest, odbOnly_opens_repository) +{ + // Create bare repo with objects + // Open with odbOnly=true + // Verify can read objects by SHA +} + +TEST_F(GitUtilsTest, odbOnly_fails_without_objects_dir) +{ + // Try to open non-existent path with odbOnly=true + // Should throw appropriate error +} + +TEST_F(GitUtilsTest, odbOnly_with_reftables_extension) +{ + // Create repo with extensions.refstorage=reftables in config + // Verify odbOnly=true can still open it + // Verify odbOnly=false would fail +} +``` + +### 2.2 `getSubtreeSha` + +```cpp +TEST_F(GitUtilsTest, getSubtreeSha_finds_entry) +{ + // Create tree with subdirectory + // Verify 
getSubtreeSha returns correct SHA for subdir +} + +TEST_F(GitUtilsTest, getSubtreeSha_missing_entry) +{ + // Request non-existent entry + // Should throw error mentioning "not found in tree" +} + +TEST_F(GitUtilsTest, getSubtreeSha_non_directory) +{ + // Request entry that is a file, not directory + // Should throw error mentioning "not a directory" +} +``` + +### 2.3 `getCommitTree` + +```cpp +TEST_F(GitUtilsTest, getCommitTree_returns_root_tree) +{ + // Create commit with known tree + // Verify getCommitTree returns that tree SHA +} + +TEST_F(GitUtilsTest, getCommitTree_invalid_sha) +{ + // Pass tree SHA instead of commit SHA + // Should throw appropriate error +} +``` + +### 2.4 `readDirectory` with Type Information + +```cpp +TEST_F(GitUtilsTest, readDirectory_returns_types) +{ + // Create tree with: file, directory, symlink, submodule + // Verify readDirectory returns correct type for each +} +``` + +--- + +## Phase 3: EvalState Tectonix Methods + +**Location:** `src/libexpr-tests/tectonix.cc` (new file) + +These tests require a test fixture that sets up: +- A temporary git repository with known structure +- Manifest file at `.meta/manifest.json` +- EvalState configured with `tectonix-git-dir` and `tectonix-git-sha` + +### 3.1 Test Fixture + +```cpp +class TectonixTest : public LibExprTest +{ +protected: + std::filesystem::path repoPath; + std::string commitSha; + + void SetUp() override; // Create repo with zones + void TearDown() override; // Cleanup + + // Helper to create EvalState with tectonix settings + EvalState createTectonixState(bool withCheckout = false); +}; +``` + +### 3.2 `getWorldRepo` Tests + +| Test Case | Behavior | +|-----------|----------| +| Valid git dir | Returns repo reference | +| Missing git dir setting | Throws error about `--tectonix-git-dir` | +| Non-existent path | Throws git open error | +| Home expansion (`~`) | Expands correctly | +| Repeated calls | Returns same cached instance | + +### 3.3 `getWorldGitAccessor` Tests + +| 
Test Case | Behavior | +|-----------|----------| +| Valid SHA | Returns accessor | +| Missing SHA setting | Throws error about `--tectonix-git-sha` | +| Invalid SHA (not found) | Throws error about SHA not found | +| Tree SHA instead of commit | Throws error about "not a valid commit" | +| Blob SHA instead of commit | Throws error about "not a valid commit" | + +### 3.4 `getWorldTreeSha` Tests + +| Test Case | Input | Behavior | +|-----------|-------|----------| +| Valid zone path | `"//areas/tools/dev"` | Returns correct tree SHA | +| Nested path | `"//areas/tools/dev/src"` | Returns correct tree SHA | +| Non-existent path | `"//does/not/exist"` | Throws error | +| Path traversal attempt | `"//areas/../secret"` | Throws error about invalid component | +| Root path | `"//"` | Returns root tree SHA | +| Caching | Same path twice | Second call returns cached value | +| Intermediate caching | `"//a/b/c"` then `"//a/b"` | Second uses cached intermediate | + +### 3.5 `getManifestContent` / `getManifestJson` Tests + +| Test Case | Behavior | +|-----------|----------| +| Manifest in git | Returns content from git | +| Manifest in checkout (source-available) | Prefers checkout over git | +| Missing manifest | Throws error | +| Invalid JSON | `getManifestJson` throws parse error | +| Caching | Multiple calls return same instance | + +### 3.6 `getTectonixSparseCheckoutRoots` Tests + +| Test Case | Behavior | +|-----------|----------| +| File exists with zone IDs | Returns set of IDs | +| File missing | Returns empty set | +| Worktree `.git` file | Correctly follows gitdir reference | +| Empty file | Returns empty set | +| File with blank lines | Ignores blank lines | + +### 3.7 `getTectonixDirtyZones` Tests + +| Test Case | Behavior | +|-----------|----------| +| No dirty files | All zones marked clean | +| Modified file in zone | Zone marked dirty | +| New untracked file | Zone marked dirty | +| Deleted file | Zone marked dirty | +| Renamed file | Both source and dest zones 
marked dirty | +| File outside sparse checkout | Ignored | +| Git status fails | Warning logged, zones treated as clean | + +### 3.8 `getZoneStorePath` Tests + +| Test Case | Mode | Behavior | +|-----------|------|----------| +| Clean zone, lazy-trees=true | Lazy | Returns virtual store path | +| Clean zone, lazy-trees=false | Eager | Returns real store path (copied) | +| Dirty zone, lazy-trees=true | Lazy checkout | Returns virtual path from checkout | +| Dirty zone, lazy-trees=false | Eager | Returns real store path from checkout | +| Same zone twice | Any | Returns same cached path | +| Different zones, same tree SHA | Lazy | Returns same path (deduplication) | + +### 3.9 `mountZoneByTreeSha` Tests + +| Test Case | Behavior | +|-----------|----------| +| First mount | Creates accessor, mounts, caches | +| Cache hit | Returns cached path without git access | +| Concurrent mounts (same SHA) | Only one mount created | +| Different SHAs | Different store paths | + +### 3.10 `getZoneFromCheckout` Tests + +| Test Case | Behavior | +|-----------|----------| +| Valid zone in checkout | Returns store path | +| Zone not in checkout | Throws error | +| Lazy mode | Mounts live filesystem | +| Eager mode | Copies to store | + +--- + +## Phase 4: Builtin (Primop) Tests + +**Location:** `src/libexpr-tests/tectonix.cc` + +Extend the `TectonixTest` fixture to test all builtins. 
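Both the fixture above and the manifest builtins below read `.meta/manifest.json`. A minimal example of the shape these tests assume, inferred from the `.id` lookups and the duplicate-ID test case (the real schema may carry additional fields):

```json
{
  "//areas/tools/dev": { "id": "tools-dev" },
  "//areas/tools/tec": { "id": "tools-tec" },
  "//areas/platform/core": { "id": "platform-core" }
}
```

With this shape, `__unsafeTectonixInternalManifest` exposes the path-to-metadata mapping directly, and the inverted variant flips it to an ID-to-path mapping, which is why duplicate `id` values must be rejected.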
+ +### 4.1 `__unsafeTectonixInternalManifest` + +```cpp +TEST_F(TectonixTest, manifest_returns_path_to_metadata_mapping) +{ + auto v = eval("builtins.__unsafeTectonixInternalManifest"); + ASSERT_THAT(v, IsAttrs()); + // Verify known zone paths map to correct IDs via .id +} + +TEST_F(TectonixTest, manifest_missing_id_field) +{ + // Set up manifest with missing "id" field + // Should throw error +} +``` + +### 4.2 `__unsafeTectonixInternalManifestInverted` + +```cpp +TEST_F(TectonixTest, manifest_inverted_returns_id_to_path_mapping) +{ + auto v = eval("builtins.__unsafeTectonixInternalManifestInverted"); + ASSERT_THAT(v, IsAttrs()); + // Verify known IDs map to correct paths +} + +TEST_F(TectonixTest, manifest_inverted_duplicate_id) +{ + // Set up manifest with duplicate IDs + // Should throw error about duplicate +} +``` + +### 4.3 `__unsafeTectonixInternalTreeSha` + +```cpp +TEST_F(TectonixTest, treeSha_returns_correct_sha) +{ + auto v = eval("builtins.__unsafeTectonixInternalTreeSha \"//areas/tools/dev\""); + ASSERT_THAT(v, IsString()); + // Verify SHA matches expected value +} + +TEST_F(TectonixTest, treeSha_invalid_path) +{ + ASSERT_THROW( + eval("builtins.__unsafeTectonixInternalTreeSha \"//invalid\""), + Error); +} +``` + +### 4.4 `__unsafeTectonixInternalTree` + +```cpp +TEST_F(TectonixTest, tree_fetches_by_sha) +{ + // Get a known tree SHA + // Verify __unsafeTectonixInternalTree returns store path + // Verify contents match expected +} + +TEST_F(TectonixTest, tree_invalid_sha) +{ + ASSERT_THROW( + eval("builtins.__unsafeTectonixInternalTree \"0000000000000000000000000000000000000000\""), + EvalError); +} +``` + +### 4.5 `__unsafeTectonixInternalZoneSrc` + +```cpp +TEST_F(TectonixTest, zoneSrc_returns_store_path) +{ + auto v = eval("builtins.__unsafeTectonixInternalZoneSrc \"//areas/tools/dev\""); + ASSERT_THAT(v, IsString()); + // Verify path starts with store prefix +} + +TEST_F(TectonixTest, zoneSrc_validates_zone_path) +{ + // Try to access subpath of 
zone (not exact zone root) + ASSERT_THROW( + eval("builtins.__unsafeTectonixInternalZoneSrc \"//areas/tools/dev/subdir\""), + EvalError); +} + +TEST_F(TectonixTest, zoneSrc_has_context) +{ + auto v = eval("builtins.__unsafeTectonixInternalZoneSrc \"//areas/tools/dev\""); + // Verify string has store path context +} +``` + +### 4.6 `__unsafeTectonixInternalZone` + +```cpp +TEST_F(TectonixTest, zone_returns_attrset) +{ + auto v = eval("builtins.__unsafeTectonixInternalZone \"//areas/tools/dev\""); + ASSERT_THAT(v, IsAttrsOfSize(5)); + + // Check outPath + auto outPath = v.attrs()->get(createSymbol("outPath")); + ASSERT_NE(outPath, nullptr); + ASSERT_THAT(*outPath->value, IsString()); + + // Check root + auto root = v.attrs()->get(createSymbol("root")); + ASSERT_NE(root, nullptr); + // root should be a path type + + // Check treeSha + auto treeSha = v.attrs()->get(createSymbol("treeSha")); + ASSERT_NE(treeSha, nullptr); + ASSERT_THAT(*treeSha->value, IsString()); + + // Check zonePath + auto zonePath = v.attrs()->get(createSymbol("zonePath")); + ASSERT_NE(zonePath, nullptr); + ASSERT_THAT(*zonePath->value, IsStringEq("//areas/tools/dev")); + + // Check dirty + auto dirty = v.attrs()->get(createSymbol("dirty")); + ASSERT_NE(dirty, nullptr); + ASSERT_THAT(*dirty->value, IsFalse()); // clean zone +} + +TEST_F(TectonixTest, zone_dirty_flag_true_for_modified) +{ + // Modify a file in checkout + // Verify dirty=true +} + +TEST_F(TectonixTest, zone_root_can_read_files) +{ + // Use root to import a nix file without triggering store copy + auto v = eval(R"( + let zone = builtins.__unsafeTectonixInternalZone "//areas/tools/dev"; + in builtins.readFile (zone.root + "/zone.nix") + )"); + ASSERT_THAT(v, IsString()); +} + +TEST_F(TectonixTest, zone_outPath_has_context) +{ + auto v = eval(R"( + (builtins.__unsafeTectonixInternalZone "//areas/tools/dev").outPath + )"); + // Verify string context includes store path +} +``` + +### 4.7 `__unsafeTectonixInternalSparseCheckoutRoots` + 
+```cpp +TEST_F(TectonixTest, sparseCheckoutRoots_returns_list) +{ + auto v = eval("builtins.__unsafeTectonixInternalSparseCheckoutRoots"); + ASSERT_THAT(v, IsList()); +} + +TEST_F(TectonixTest, sparseCheckoutRoots_empty_without_checkout) +{ + // Configure without checkout path + auto v = eval("builtins.__unsafeTectonixInternalSparseCheckoutRoots"); + ASSERT_THAT(v, IsListOfSize(0)); +} +``` + +### 4.8 `__unsafeTectonixInternalDirtyZones` + +```cpp +TEST_F(TectonixTest, dirtyZones_returns_attrset) +{ + auto v = eval("builtins.__unsafeTectonixInternalDirtyZones"); + ASSERT_THAT(v, IsAttrs()); +} + +TEST_F(TectonixTest, dirtyZones_values_are_booleans) +{ + auto v = eval("builtins.__unsafeTectonixInternalDirtyZones"); + // Iterate attrs, verify all values are bools +} +``` + +--- + +## Phase 5: Thread-Safety Tests + +**Location:** `src/libexpr-tests/tectonix.cc` + +### 5.1 Concurrent Zone Access + +```cpp +TEST_F(TectonixTest, concurrent_zone_mounts) +{ + // Launch multiple threads calling getZoneStorePath for same zone + // Verify all get same path + // Verify only one actual mount occurred +} + +TEST_F(TectonixTest, concurrent_different_zones) +{ + // Launch multiple threads accessing different zones + // Verify no deadlocks + // Verify each gets correct path +} + +TEST_F(TectonixTest, concurrent_tree_sha_computation) +{ + // Launch multiple threads computing tree SHAs for overlapping paths + // Verify cache is populated correctly + // Verify no races in intermediate caching +} +``` + +### 5.2 Lazy Init Thread Safety + +```cpp +TEST_F(TectonixTest, concurrent_manifest_access) +{ + // Multiple threads calling getManifestJson + // Verify all get same instance +} + +TEST_F(TectonixTest, concurrent_dirty_zone_detection) +{ + // Multiple threads calling getTectonixDirtyZones + // Verify consistent results +} +``` + +--- + +## Phase 6: Functional Tests + +**Location:** `tests/functional/tectonix/` (new directory) + +### 6.1 Setup Scripts + +Create 
`tests/functional/tectonix/common.sh`: + +```bash +# Create test world repository with: +# - .meta/manifest.json +# - //areas/tools/dev/ (zone with zone.nix) +# - //areas/tools/tec/ (another zone) +# - //areas/platform/core/ (zone with dependencies) + +create_test_world() { + local dir="$1" + # ... setup git repo with zones +} +``` + +### 6.2 Test Cases + +**`tests/functional/tectonix/basic.sh`:** +```bash +#!/usr/bin/env bash +source common.sh + +# Test: Basic zone access +create_test_world "$TEST_ROOT/world" + +nix eval --raw \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$(git -C "$TEST_ROOT/world" rev-parse HEAD)" \ + --expr 'builtins.__unsafeTectonixInternalZoneSrc "//areas/tools/dev"' + +# Verify output is a store path +``` + +**`tests/functional/tectonix/lazy-trees.sh`:** +```bash +#!/usr/bin/env bash +source common.sh + +# Test: Lazy trees don't copy until needed +create_test_world "$TEST_ROOT/world" + +# Access zone with lazy-trees +result=$(nix eval --json \ + --option lazy-trees true \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$(git -C "$TEST_ROOT/world" rev-parse HEAD)" \ + --expr '(builtins.__unsafeTectonixInternalZone "//areas/tools/dev").treeSha') + +# Verify the store path doesn't exist yet (virtual) +# Then trigger derivation build and verify it exists +``` + +**`tests/functional/tectonix/dirty-zones.sh`:** +```bash +#!/usr/bin/env bash +source common.sh + +# Test: Dirty zone detection +create_test_world "$TEST_ROOT/world" + +# Modify a file +echo "modified" >> "$TEST_ROOT/world/areas/tools/dev/zone.nix" + +result=$(nix eval --json \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$(git -C "$TEST_ROOT/world" rev-parse HEAD)" \ + --tectonix-checkout-path "$TEST_ROOT/world" \ + --expr '(builtins.__unsafeTectonixInternalZone "//areas/tools/dev").dirty') + +[[ "$result" == "true" ]] || fail "Expected dirty=true" +``` + +**`tests/functional/tectonix/deduplication.sh`:** +```bash 
+#!/usr/bin/env bash +source common.sh + +# Test: Same tree SHA across commits returns same path +create_test_world "$TEST_ROOT/world" + +sha1=$(git -C "$TEST_ROOT/world" rev-parse HEAD) + +# Make commit that doesn't touch //areas/tools/dev +echo "other" >> "$TEST_ROOT/world/README.md" +git -C "$TEST_ROOT/world" add -A && git -C "$TEST_ROOT/world" commit -m "other" +sha2=$(git -C "$TEST_ROOT/world" rev-parse HEAD) + +# Get zone paths for both commits +path1=$(nix eval --raw \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$sha1" \ + --expr 'builtins.__unsafeTectonixInternalZoneSrc "//areas/tools/dev"') + +path2=$(nix eval --raw \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$sha2" \ + --expr 'builtins.__unsafeTectonixInternalZoneSrc "//areas/tools/dev"') + +# Should be same path due to tree SHA deduplication +[[ "$path1" == "$path2" ]] || fail "Expected same path for unchanged zone" +``` + +**`tests/functional/tectonix/errors.sh`:** +```bash +#!/usr/bin/env bash +source common.sh + +# Test: Error handling +create_test_world "$TEST_ROOT/world" + +# Missing git-dir setting +expect_failure nix eval \ + --tectonix-git-sha "abc123" \ + --expr 'builtins.__unsafeTectonixInternalManifest' + +# Invalid SHA +expect_failure nix eval \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "0000000000000000000000000000000000000000" \ + --expr 'builtins.__unsafeTectonixInternalManifest' + +# Non-zone path +expect_failure nix eval \ + --tectonix-git-dir "$TEST_ROOT/world/.git" \ + --tectonix-git-sha "$(git -C "$TEST_ROOT/world" rev-parse HEAD)" \ + --expr 'builtins.__unsafeTectonixInternalZoneSrc "//areas/tools/dev/subdir"' +``` + +--- + +## Phase 7: Error Handling Edge Cases + +**Location:** `src/libexpr-tests/tectonix.cc` + +### 7.1 Settings Validation + +| Test Case | Expected Error | +|-----------|----------------| +| Empty `tectonix-git-dir` | "must be specified" | +| Non-existent git dir | Git open error | +| Empty 
`tectonix-git-sha` | "must be specified" | +| Malformed SHA | Parse error | +| SHA not in repo | "not found in repository" | + +### 7.2 Manifest Errors + +| Test Case | Expected Error | +|-----------|----------------| +| Missing manifest file | "does not exist" | +| Invalid JSON | JSON parse error | +| Missing "id" field | "missing or non-string 'id' field" | +| Non-string "id" value | "non-string 'id' field" | + +### 7.3 Zone Path Errors + +| Test Case | Expected Error | +|-----------|----------------| +| Path not in manifest | "not a zone root" | +| Subpath of zone | "not a zone root" | +| Path traversal (`..`) | "invalid path component" | + +### 7.4 Git Operations Errors + +| Test Case | Expected Error | +|-----------|----------------| +| Tree SHA not found | "not found in world repository" | +| Non-tree SHA for tree access | Type error | +| Non-commit SHA for commit access | "not a valid commit" | + +--- + +## Implementation Priority + +1. **High Priority** (blocking issues identified in review): + - Phase 2: Git-utils `odbOnly` tests + - Phase 3.7: Dirty zone detection tests + - Phase 4.5-4.6: Zone builtin tests with validation + +2. **Medium Priority** (core functionality): + - Phase 3: All EvalState method tests + - Phase 4: All builtin tests + - Phase 6: Basic functional tests + +3. 
**Lower Priority** (polish): + - Phase 1: Helper function unit tests + - Phase 5: Thread-safety tests + - Phase 7: Comprehensive error handling tests + +--- + +## Test Data Requirements + +### Minimal Test Repository Structure + +``` +test-world/ +├── .git/ +├── .meta/ +│ └── manifest.json +├── areas/ +│ ├── tools/ +│ │ ├── dev/ +│ │ │ ├── zone.nix +│ │ │ └── src/ +│ │ │ └── main.cc +│ │ └── tec/ +│ │ └── zone.nix +│ └── platform/ +│ └── core/ +│ └── zone.nix +└── README.md +``` + +### Manifest Content + +```json +{ + "//areas/tools/dev": { "id": "W-000001" }, + "//areas/tools/tec": { "id": "W-000002" }, + "//areas/platform/core": { "id": "W-000003" } +} +``` + +--- + +## Running Tests + +### Unit Tests + +```bash +# Build and run all unit tests +meson test -C build libexpr-tests libfetchers-tests + +# Run specific test file +meson test -C build libexpr-tests --test-args='--gtest_filter=Tectonix*' +``` + +### Functional Tests + +```bash +# Run tectonix functional tests +make -C tests/functional tectonix + +# Run specific test +./tests/functional/tectonix/basic.sh +``` + +--- + +## Coverage Goals + +| Component | Line Coverage Target | +|-----------|---------------------| +| `src/libexpr/eval.cc` (tectonix methods) | 90% | +| `src/libexpr/primops/tectonix.cc` | 95% | +| `src/libfetchers/git-utils.cc` (new methods) | 90% | + +--- + +## Notes + +- Thread-safety tests should be run with `--eval-cores=4` to stress concurrent access +- Functional tests need a real git repository, not just objects +- Consider property-based testing for path normalization edge cases +- The `odbOnly` mode is critical for reftable-enabled repositories (common in enterprise) diff --git a/src/libexpr-tests/meson.build b/src/libexpr-tests/meson.build index c5dafe0de84b..70d355ce4ac1 100644 --- a/src/libexpr-tests/meson.build +++ b/src/libexpr-tests/meson.build @@ -20,6 +20,7 @@ deps_private_maybe_subproject = [ dependency('nix-expr'), dependency('nix-expr-c'), 
dependency('nix-expr-test-support'), + dependency('nix-fetchers'), ] deps_public_maybe_subproject = [] subdir('nix-meson-build-support/subprojects') @@ -36,6 +37,12 @@ deps_private += gtest gmock = dependency('gmock') deps_private += gmock +libgit2 = dependency('libgit2') +deps_private += libgit2 + +nlohmann_json = dependency('nlohmann_json') +deps_private += nlohmann_json + configdata = configuration_data() configdata.set_quoted('PACKAGE_VERSION', meson.project_version()) @@ -58,6 +65,7 @@ sources = files( 'nix_api_value_internal.cc', 'primops.cc', 'search-path.cc', + 'tectonix.cc', 'trivial.cc', 'value/context.cc', 'value/print.cc', diff --git a/src/libexpr-tests/package.nix b/src/libexpr-tests/package.nix index 51d52e935bf5..bbf51586064e 100644 --- a/src/libexpr-tests/package.nix +++ b/src/libexpr-tests/package.nix @@ -8,6 +8,7 @@ nix-expr-c, nix-expr-test-support, + libgit2, rapidcheck, gtest, runCommand, @@ -44,6 +45,7 @@ mkMesonExecutable (finalAttrs: { nix-expr-test-support rapidcheck gtest + libgit2 ]; mesonFlags = [ diff --git a/src/libexpr-tests/tectonix.cc b/src/libexpr-tests/tectonix.cc new file mode 100644 index 000000000000..69c59315131c --- /dev/null +++ b/src/libexpr-tests/tectonix.cc @@ -0,0 +1,928 @@ +#include +#include + +#include "nix/expr/tests/libexpr.hh" +#include "nix/expr/eval.hh" +#include "nix/expr/eval-settings.hh" +#include "nix/fetchers/git-utils.hh" +#include "nix/store/globals.hh" +#include "nix/util/file-system.hh" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +namespace nix { + +// ============================================================================ +// Test Fixture for Tectonix Tests +// ============================================================================ + +class TectonixTest : public ::testing::Test +{ +protected: + std::unique_ptr delTmpDir; + std::filesystem::path repoPath; + std::string commitSha; + + // Manifest content for test 
repo
+    static constexpr std::string_view TEST_MANIFEST = R"({
+        "//areas/tools/dev": { "id": "W-000001" },
+        "//areas/tools/tec": { "id": "W-000002" },
+        "//areas/platform/core": { "id": "W-000003" }
+    })";
+
+public:
+    static void SetUpTestSuite()
+    {
+        initLibStore(false);
+        initGC();
+    }
+
+    void SetUp() override
+    {
+        // Create temp directory for git repo
+        repoPath = createTempDir();
+        delTmpDir = std::make_unique<AutoDelete>(repoPath, true);
+
+        // Initialize git repo and create test structure
+        git_libgit2_init();
+        createTestRepository();
+    }
+
+    void TearDown() override
+    {
+        delTmpDir.reset();
+    }
+
+    void createTestRepository()
+    {
+        git_repository * repo = nullptr;
+        ASSERT_EQ(git_repository_init(&repo, repoPath.string().c_str(), 0), 0);
+
+        // Create directory structure with files
+        createDir(".meta");
+        writeFile(".meta/manifest.json", std::string(TEST_MANIFEST));
+
+        createDir("areas");
+        createDir("areas/tools");
+        createDir("areas/tools/dev");
+        createDir("areas/tools/tec");
+        createDir("areas/platform");
+        createDir("areas/platform/core");
+
+        writeFile("areas/tools/dev/zone.nix", "{ }");
+        writeFile("areas/tools/dev/README.md", "Dev zone");
+        writeFile("areas/tools/tec/zone.nix", "{ }");
+        writeFile("areas/platform/core/zone.nix", "{ }");
+        writeFile("README.md", "Test World");
+
+        // Create git tree from files
+        git_index * index = nullptr;
+        ASSERT_EQ(git_repository_index(&index, repo), 0);
+
+        // Add all files to index
+        ASSERT_EQ(git_index_add_bypath(index, ".meta/manifest.json"), 0);
+        ASSERT_EQ(git_index_add_bypath(index, "areas/tools/dev/zone.nix"), 0);
+        ASSERT_EQ(git_index_add_bypath(index, "areas/tools/dev/README.md"), 0);
+        ASSERT_EQ(git_index_add_bypath(index, "areas/tools/tec/zone.nix"), 0);
+        ASSERT_EQ(git_index_add_bypath(index, "areas/platform/core/zone.nix"), 0);
+        ASSERT_EQ(git_index_add_bypath(index, "README.md"), 0);
+        ASSERT_EQ(git_index_write(index), 0);
+
+        git_oid treeOid;
+        ASSERT_EQ(git_index_write_tree(&treeOid, index), 0);
+
+        
git_tree * tree = nullptr;
+        ASSERT_EQ(git_tree_lookup(&tree, repo, &treeOid), 0);
+
+        // Create commit
+        git_signature * sig = nullptr;
+        ASSERT_EQ(git_signature_now(&sig, "test", "test@example.com"), 0);
+
+        git_oid commitOid;
+        ASSERT_EQ(git_commit_create_v(
+            &commitOid, repo, "HEAD", sig, sig, nullptr,
+            "Initial commit", tree, 0), 0);
+
+        // Store commit SHA
+        char sha[GIT_OID_SHA1_HEXSIZE + 1];
+        git_oid_tostr(sha, sizeof(sha), &commitOid);
+        commitSha = sha;
+
+        git_signature_free(sig);
+        git_tree_free(tree);
+        git_index_free(index);
+        git_repository_free(repo);
+    }
+
+    void createDir(const std::string & path)
+    {
+        std::filesystem::create_directories(repoPath / path);
+    }
+
+    void writeFile(const std::string & path, const std::string & content)
+    {
+        auto fullPath = repoPath / path;
+        std::ofstream f(fullPath);
+        f << content;
+    }
+
+    // Create EvalState with tectonix settings configured
+    struct TectonixEvalContext {
+        bool readOnlyMode = true;
+        fetchers::Settings fetchSettings{};
+        EvalSettings evalSettings{readOnlyMode};
+        ref<Store> store;
+        std::unique_ptr<EvalState> state;
+
+        TectonixEvalContext(const std::filesystem::path & repoPath, const std::string & commitSha, bool withCheckout = false)
+            : store(openStore("dummy://"))
+        {
+            evalSettings.nixPath = {};
+            evalSettings.tectonixGitDir = (repoPath / ".git").string();
+            evalSettings.tectonixGitSha = commitSha;
+            if (withCheckout) {
+                evalSettings.tectonixCheckoutPath = repoPath.string();
+            }
+
+            state = std::make_unique<EvalState>(
+                LookupPath{},
+                store,
+                fetchSettings,
+                evalSettings,
+                nullptr
+            );
+        }
+
+        Value eval(const std::string & input)
+        {
+            Value v;
+            Expr * e = state->parseExprFromString(input, state->rootPath(CanonPath::root));
+            state->eval(e, v);
+            state->forceValue(v, noPos);
+            return v;
+        }
+    };
+
+    std::unique_ptr<TectonixEvalContext> createTectonixContext(bool withCheckout = false)
+    {
+        return std::make_unique<TectonixEvalContext>(repoPath, commitSha, withCheckout);
+    }
+};
+
+// 
============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalManifest +// ============================================================================ + +TEST_F(TectonixTest, manifest_returns_path_to_metadata_mapping) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval("builtins.unsafeTectonixInternalManifest"); + + ASSERT_THAT(v, IsAttrs()); + ASSERT_EQ(v.attrs()->size(), 3u); + + // Check //areas/tools/dev maps to W-000001 + auto dev = v.attrs()->get(ctx->state->symbols.create("//areas/tools/dev")); + ASSERT_NE(dev, nullptr); + ASSERT_THAT(*dev->value, IsAttrs()); + auto devId = dev->value->attrs()->get(ctx->state->symbols.create("id")); + ASSERT_NE(devId, nullptr); + ASSERT_THAT(*devId->value, IsStringEq("W-000001")); + + // Check //areas/tools/tec maps to W-000002 + auto tec = v.attrs()->get(ctx->state->symbols.create("//areas/tools/tec")); + ASSERT_NE(tec, nullptr); + ASSERT_THAT(*tec->value, IsAttrs()); + auto tecId = tec->value->attrs()->get(ctx->state->symbols.create("id")); + ASSERT_NE(tecId, nullptr); + ASSERT_THAT(*tecId->value, IsStringEq("W-000002")); + + // Check //areas/platform/core maps to W-000003 + auto core = v.attrs()->get(ctx->state->symbols.create("//areas/platform/core")); + ASSERT_NE(core, nullptr); + ASSERT_THAT(*core->value, IsAttrs()); + auto coreId = core->value->attrs()->get(ctx->state->symbols.create("id")); + ASSERT_NE(coreId, nullptr); + ASSERT_THAT(*coreId->value, IsStringEq("W-000003")); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalManifestInverted +// ============================================================================ + +TEST_F(TectonixTest, manifest_inverted_returns_id_to_path_mapping) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval("builtins.unsafeTectonixInternalManifestInverted"); + + ASSERT_THAT(v, IsAttrs()); + 
ASSERT_EQ(v.attrs()->size(), 3u); + + // Check W-000001 maps to //areas/tools/dev + auto w1 = v.attrs()->get(ctx->state->symbols.create("W-000001")); + ASSERT_NE(w1, nullptr); + ASSERT_THAT(*w1->value, IsStringEq("//areas/tools/dev")); + + // Check W-000002 maps to //areas/tools/tec + auto w2 = v.attrs()->get(ctx->state->symbols.create("W-000002")); + ASSERT_NE(w2, nullptr); + ASSERT_THAT(*w2->value, IsStringEq("//areas/tools/tec")); + + // Check W-000003 maps to //areas/platform/core + auto w3 = v.attrs()->get(ctx->state->symbols.create("W-000003")); + ASSERT_NE(w3, nullptr); + ASSERT_THAT(*w3->value, IsStringEq("//areas/platform/core")); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalTreeSha +// ============================================================================ + +TEST_F(TectonixTest, treeSha_returns_sha_string) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval(R"(builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev")"); + + ASSERT_THAT(v, IsString()); + // SHA should be 40 hex characters + ASSERT_EQ(v.string_view().size(), 40u); + for (char c : v.string_view()) { + ASSERT_TRUE((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')); + } +} + +TEST_F(TectonixTest, treeSha_different_zones_have_different_shas) +{ + auto ctx = createTectonixContext(); + + auto devSha = ctx->eval(R"(builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev")"); + auto tecSha = ctx->eval(R"(builtins.unsafeTectonixInternalTreeSha "//areas/tools/tec")"); + + ASSERT_THAT(devSha, IsString()); + ASSERT_THAT(tecSha, IsString()); + // Different zones should have different tree SHAs (different content) + ASSERT_NE(devSha.string_view(), tecSha.string_view()); +} + +TEST_F(TectonixTest, treeSha_invalid_path_throws) +{ + auto ctx = createTectonixContext(); + + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalTreeSha "//does/not/exist")"), + Error); +} + +// 
============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalTree +// ============================================================================ + +// Disabled: requires real store (dummy:// doesn't support fetchToStore) +TEST_F(TectonixTest, DISABLED_tree_returns_store_path) +{ + auto ctx = createTectonixContext(); + // Get a tree SHA first, then fetch it as a store path + auto sha = ctx->eval(R"(builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev")"); + ASSERT_THAT(sha, IsString()); + + auto expr = "builtins.unsafeTectonixInternalTree \"" + std::string(sha.string_view()) + "\""; + auto v = ctx->eval(expr); + + ASSERT_THAT(v, IsString()); + // Should be a store path + auto pathStr = v.string_view(); + ASSERT_TRUE(pathStr.find("/nix/store/") == 0 || pathStr.find(settings.nixStore) == 0); +} + +TEST_F(TectonixTest, tree_invalid_sha_throws) +{ + auto ctx = createTectonixContext(); + + // Invalid SHA (not 40 hex chars) + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalTree "invalid")"), + Error); +} + +TEST_F(TectonixTest, tree_nonexistent_sha_throws) +{ + auto ctx = createTectonixContext(); + + // Valid format but non-existent SHA + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalTree "0000000000000000000000000000000000000000")"), + EvalError); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalZoneSrc +// ============================================================================ + +// Disabled: requires real store (dummy:// doesn't support addToStoreFromDump) +TEST_F(TectonixTest, DISABLED_zoneSrc_returns_store_path) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval(R"(builtins.unsafeTectonixInternalZoneSrc "//areas/tools/dev")"); + + ASSERT_THAT(v, IsString()); + // Should be a store path + auto pathStr = v.string_view(); + ASSERT_TRUE(pathStr.find("/nix/store/") == 
0 || pathStr.find(settings.nixStore) == 0); +} + +TEST_F(TectonixTest, zoneSrc_validates_zone_path) +{ + auto ctx = createTectonixContext(); + + // Parent of zone should fail validation + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalZoneSrc "//areas/tools")"), + EvalError); +} + +TEST_F(TectonixTest, zoneSrc_non_zone_path_throws) +{ + auto ctx = createTectonixContext(); + + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalZoneSrc "//not/a/zone")"), + EvalError); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalZone +// ============================================================================ + +// Disabled: requires real store (dummy:// doesn't support addToStoreFromDump) +TEST_F(TectonixTest, DISABLED_zone_returns_attrset_with_expected_attrs) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval(R"(builtins.unsafeTectonixInternalZone "//areas/tools/dev")"); + + ASSERT_THAT(v, IsAttrsOfSize(5)); + + // Check outPath exists and is a string + auto outPath = v.attrs()->get(ctx->state->symbols.create("outPath")); + ASSERT_NE(outPath, nullptr); + ctx->state->forceValue(*outPath->value, noPos); + ASSERT_THAT(*outPath->value, IsString()); + + // Check root exists + auto root = v.attrs()->get(ctx->state->symbols.create("root")); + ASSERT_NE(root, nullptr); + + // Check treeSha exists and is a string + auto treeSha = v.attrs()->get(ctx->state->symbols.create("treeSha")); + ASSERT_NE(treeSha, nullptr); + ctx->state->forceValue(*treeSha->value, noPos); + ASSERT_THAT(*treeSha->value, IsString()); + ASSERT_EQ(treeSha->value->string_view().size(), 40u); + + // Check zonePath matches input + auto zonePath = v.attrs()->get(ctx->state->symbols.create("zonePath")); + ASSERT_NE(zonePath, nullptr); + ctx->state->forceValue(*zonePath->value, noPos); + ASSERT_THAT(*zonePath->value, IsStringEq("//areas/tools/dev")); + + // Check dirty is false (clean repo) + auto 
dirty = v.attrs()->get(ctx->state->symbols.create("dirty")); + ASSERT_NE(dirty, nullptr); + ctx->state->forceValue(*dirty->value, noPos); + ASSERT_THAT(*dirty->value, IsFalse()); +} + +// Disabled: requires real store (dummy:// doesn't support addToStoreFromDump) +TEST_F(TectonixTest, DISABLED_zone_outPath_is_store_path) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval(R"((builtins.unsafeTectonixInternalZone "//areas/tools/dev").outPath)"); + + ASSERT_THAT(v, IsString()); + auto pathStr = v.string_view(); + ASSERT_TRUE(pathStr.find("/nix/store/") == 0 || pathStr.find(settings.nixStore) == 0); +} + +TEST_F(TectonixTest, zone_invalid_path_throws) +{ + auto ctx = createTectonixContext(); + + ASSERT_THROW( + ctx->eval(R"(builtins.unsafeTectonixInternalZone "//not/a/zone")"), + EvalError); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalSparseCheckoutRoots +// ============================================================================ + +TEST_F(TectonixTest, sparseCheckoutRoots_returns_list) +{ + auto ctx = createTectonixContext(); + auto v = ctx->eval("builtins.unsafeTectonixInternalSparseCheckoutRoots"); + + ASSERT_THAT(v, IsList()); +} + +TEST_F(TectonixTest, sparseCheckoutRoots_empty_without_checkout) +{ + auto ctx = createTectonixContext(false); // no checkout path + auto v = ctx->eval("builtins.unsafeTectonixInternalSparseCheckoutRoots"); + + ASSERT_THAT(v, IsListOfSize(0)); +} + +// ============================================================================ +// Phase 4: Builtin Tests - __unsafeTectonixInternalDirtyZones +// ============================================================================ + +TEST_F(TectonixTest, dirtyZones_returns_attrset) +{ + auto ctx = createTectonixContext(true); // with checkout + auto v = ctx->eval("builtins.unsafeTectonixInternalDirtyZones"); + + ASSERT_THAT(v, IsAttrs()); +} + +TEST_F(TectonixTest, 
dirtyZones_empty_without_checkout) +{ + auto ctx = createTectonixContext(false); // no checkout + auto v = ctx->eval("builtins.unsafeTectonixInternalDirtyZones"); + + ASSERT_THAT(v, IsAttrsOfSize(0)); +} + +// ============================================================================ +// Phase 3: EvalState Method Tests - getWorldRepo +// ============================================================================ + +TEST_F(TectonixTest, getWorldRepo_returns_repo) +{ + auto ctx = createTectonixContext(); + auto repo = ctx->state->getWorldRepo(); + ASSERT_NE(&*repo, nullptr); +} + +TEST_F(TectonixTest, getWorldRepo_caches_instance) +{ + auto ctx = createTectonixContext(); + auto repo1 = ctx->state->getWorldRepo(); + auto repo2 = ctx->state->getWorldRepo(); + // Should return same instance + ASSERT_EQ(&*repo1, &*repo2); +} + +// ============================================================================ +// Phase 3: EvalState Method Tests - getWorldTreeSha +// ============================================================================ + +TEST_F(TectonixTest, getWorldTreeSha_returns_hash) +{ + auto ctx = createTectonixContext(); + auto hash = ctx->state->getWorldTreeSha("//areas/tools/dev"); + // SHA1 hash is 40 hex chars + ASSERT_EQ(hash.gitRev().size(), 40u); +} + +TEST_F(TectonixTest, getWorldTreeSha_caches_results) +{ + auto ctx = createTectonixContext(); + auto hash1 = ctx->state->getWorldTreeSha("//areas/tools/dev"); + auto hash2 = ctx->state->getWorldTreeSha("//areas/tools/dev"); + ASSERT_EQ(hash1, hash2); +} + +TEST_F(TectonixTest, getWorldTreeSha_root_returns_commit_tree) +{ + auto ctx = createTectonixContext(); + // Root path should work + auto hash = ctx->state->getWorldTreeSha("//"); + ASSERT_EQ(hash.gitRev().size(), 40u); +} + +// ============================================================================ +// Phase 3: EvalState Method Tests - getManifestJson +// ============================================================================ + 
+TEST_F(TectonixTest, getManifestJson_parses_correctly) +{ + auto ctx = createTectonixContext(); + auto & json = ctx->state->getManifestJson(); + + ASSERT_TRUE(json.contains("//areas/tools/dev")); + ASSERT_EQ(json["//areas/tools/dev"]["id"], "W-000001"); +} + +TEST_F(TectonixTest, getManifestJson_caches_result) +{ + auto ctx = createTectonixContext(); + auto & json1 = ctx->state->getManifestJson(); + auto & json2 = ctx->state->getManifestJson(); + // Should return same instance + ASSERT_EQ(&json1, &json2); +} + +// ============================================================================ +// Phase 3: EvalState Method Tests - isTectonixSourceAvailable +// ============================================================================ + +TEST_F(TectonixTest, isTectonixSourceAvailable_false_without_checkout) +{ + auto ctx = createTectonixContext(false); + ASSERT_FALSE(ctx->state->isTectonixSourceAvailable()); +} + +TEST_F(TectonixTest, isTectonixSourceAvailable_true_with_checkout) +{ + auto ctx = createTectonixContext(true); + ASSERT_TRUE(ctx->state->isTectonixSourceAvailable()); +} + +// ============================================================================ +// Phase 7: Error Handling Tests +// ============================================================================ + +TEST_F(TectonixTest, missing_git_dir_throws) +{ + bool readOnly = true; + fetchers::Settings fetchSettings{}; + EvalSettings evalSettings{readOnly}; + evalSettings.nixPath = {}; + evalSettings.tectonixGitDir = ""; // empty + evalSettings.tectonixGitSha = commitSha; + + EvalState evalState( + LookupPath{}, + openStore("dummy://"), + fetchSettings, + evalSettings, + nullptr + ); + + ASSERT_THROW(evalState.getWorldRepo(), Error); +} + +TEST_F(TectonixTest, missing_git_sha_throws) +{ + bool readOnly = true; + fetchers::Settings fetchSettings{}; + EvalSettings evalSettings{readOnly}; + evalSettings.nixPath = {}; + evalSettings.tectonixGitDir = (repoPath / ".git").string(); + 
evalSettings.tectonixGitSha = ""; // empty
+
+    EvalState evalState(
+        LookupPath{},
+        openStore("dummy://"),
+        fetchSettings,
+        evalSettings,
+        nullptr
+    );
+
+    ASSERT_THROW(evalState.getWorldGitAccessor(), Error);
+}
+
+TEST_F(TectonixTest, missing_git_sha_tree_sha_throws)
+{
+    bool readOnly = true;
+    fetchers::Settings fetchSettings{};
+    EvalSettings evalSettings{readOnly};
+    evalSettings.nixPath = {};
+    evalSettings.tectonixGitDir = (repoPath / ".git").string();
+    evalSettings.tectonixGitSha = ""; // empty
+
+    EvalState evalState(
+        LookupPath{},
+        openStore("dummy://"),
+        fetchSettings,
+        evalSettings,
+        nullptr
+    );
+
+    ASSERT_THROW(evalState.getWorldTreeSha("//areas/tools/dev"), Error);
+}
+
+TEST_F(TectonixTest, invalid_sha_throws)
+{
+    bool readOnly = true;
+    fetchers::Settings fetchSettings{};
+    EvalSettings evalSettings{readOnly};
+    evalSettings.nixPath = {};
+    evalSettings.tectonixGitDir = (repoPath / ".git").string();
+    evalSettings.tectonixGitSha = "0000000000000000000000000000000000000000"; // nonexistent
+
+    EvalState evalState(
+        LookupPath{},
+        openStore("dummy://"),
+        fetchSettings,
+        evalSettings,
+        nullptr
+    );
+
+    ASSERT_THROW(evalState.getWorldGitAccessor(), Error);
+}
+
+// ============================================================================
+// Phase 5: Thread-Safety Tests
+// ============================================================================
+
+TEST_F(TectonixTest, concurrent_manifest_access)
+{
+    auto ctx = createTectonixContext();
+
+    std::vector<std::thread> threads;
+    std::vector<const nlohmann::json *> results(8);
+
+    // Multiple threads calling getManifestJson
+    for (size_t i = 0; i < 8; i++) {
+        threads.emplace_back([&, i]() {
+            results[i] = &ctx->state->getManifestJson();
+        });
+    }
+
+    for (auto & t : threads) {
+        t.join();
+    }
+
+    // Verify all got same instance
+    for (size_t i = 1; i < 8; i++) {
+        ASSERT_EQ(results[0], results[i]);
+    }
+}
+
+TEST_F(TectonixTest, concurrent_tree_sha_computation)
+{
+    auto ctx = createTectonixContext();
+
+    
std::vector<std::thread> threads;
+    std::vector<std::string> results(8); // Store as gitRev strings
+
+    // Multiple threads computing tree SHAs for same path
+    for (size_t i = 0; i < 8; i++) {
+        threads.emplace_back([&, i]() {
+            results[i] = ctx->state->getWorldTreeSha("//areas/tools/dev").gitRev();
+        });
+    }
+
+    for (auto & t : threads) {
+        t.join();
+    }
+
+    // Verify all got same hash
+    for (size_t i = 1; i < 8; i++) {
+        ASSERT_EQ(results[0], results[i]);
+    }
+}
+
+TEST_F(TectonixTest, concurrent_world_repo_access)
+{
+    auto ctx = createTectonixContext();
+
+    std::vector<std::thread> threads;
+    std::vector<GitRepo *> results(8);
+
+    // Multiple threads calling getWorldRepo
+    for (size_t i = 0; i < 8; i++) {
+        threads.emplace_back([&, i]() {
+            results[i] = &*ctx->state->getWorldRepo();
+        });
+    }
+
+    for (auto & t : threads) {
+        t.join();
+    }
+
+    // Verify all got same instance
+    for (size_t i = 1; i < 8; i++) {
+        ASSERT_EQ(results[0], results[i]);
+    }
+}
+
+TEST_F(TectonixTest, concurrent_different_tree_shas)
+{
+    auto ctx = createTectonixContext();
+
+    std::vector<std::thread> threads;
+    std::vector<std::string> zonePaths = {
+        "//areas/tools/dev",
+        "//areas/tools/tec",
+        "//areas/platform/core"
+    };
+    std::map<std::string, std::string> results; // Store as string (gitRev)
+    std::mutex resultsMutex;
+
+    // Multiple threads computing different tree SHAs
+    for (const auto & path : zonePaths) {
+        for (int i = 0; i < 3; i++) {
+            threads.emplace_back([&, path]() {
+                auto hash = ctx->state->getWorldTreeSha(path);
+                auto hashStr = hash.gitRev();
+                std::lock_guard<std::mutex> lock(resultsMutex);
+                auto it = results.find(path);
+                if (it != results.end()) {
+                    // Verify consistency
+                    ASSERT_EQ(it->second, hashStr);
+                } else {
+                    results.emplace(path, hashStr);
+                }
+            });
+        }
+    }
+
+    for (auto & t : threads) {
+        t.join();
+    }
+
+    // Verify we got results for all zones
+    ASSERT_EQ(results.size(), 3u);
+}
+
+// ============================================================================
+// Phase 7: Additional Error Handling Tests
+// 
============================================================================ + +TEST_F(TectonixTest, manifest_non_string_id_throws) +{ + // Create a repo with invalid manifest (id is number not string) + std::filesystem::path badRepoPath = createTempDir(); + AutoDelete delBadRepo(badRepoPath, true); + + git_repository * repo = nullptr; + ASSERT_EQ(git_repository_init(&repo, badRepoPath.string().c_str(), 0), 0); + + std::filesystem::create_directories(badRepoPath / ".meta"); + std::ofstream manifest(badRepoPath / ".meta/manifest.json"); + manifest << R"({"//zone": {"id": 12345}})"; // id is number, not string + manifest.close(); + + git_index * index = nullptr; + ASSERT_EQ(git_repository_index(&index, repo), 0); + ASSERT_EQ(git_index_add_bypath(index, ".meta/manifest.json"), 0); + ASSERT_EQ(git_index_write(index), 0); + + git_oid treeOid; + ASSERT_EQ(git_index_write_tree(&treeOid, index), 0); + + git_tree * tree = nullptr; + ASSERT_EQ(git_tree_lookup(&tree, repo, &treeOid), 0); + + git_signature * sig = nullptr; + ASSERT_EQ(git_signature_now(&sig, "test", "test@test.com"), 0); + + git_oid commitOid; + ASSERT_EQ(git_commit_create_v(&commitOid, repo, "HEAD", sig, sig, nullptr, "test", tree, 0), 0); + + char sha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(sha, sizeof(sha), &commitOid); + + git_signature_free(sig); + git_tree_free(tree); + git_index_free(index); + git_repository_free(repo); + + // Create EvalState with bad manifest + bool readOnly = true; + fetchers::Settings fetchSettings{}; + EvalSettings evalSettings{readOnly}; + evalSettings.nixPath = {}; + evalSettings.tectonixGitDir = (badRepoPath / ".git").string(); + evalSettings.tectonixGitSha = sha; + + EvalState evalState( + LookupPath{}, + openStore("dummy://"), + fetchSettings, + evalSettings, + nullptr + ); + + // Should throw when accessing manifest + Value v; + Expr * e = evalState.parseExprFromString( + "builtins.unsafeTectonixInternalManifest", + evalState.rootPath(CanonPath::root)); + + ASSERT_THROW({ 
+ evalState.eval(e, v); + evalState.forceValue(v, noPos); + }, Error); +} + +TEST_F(TectonixTest, zone_path_traversal_throws) +{ + auto ctx = createTectonixContext(); + + // Path traversal attempt should fail + ASSERT_THROW( + ctx->state->getWorldTreeSha("//areas/../.git"), + Error); +} + +TEST_F(TectonixTest, nonexistent_git_dir_throws) +{ + bool readOnly = true; + fetchers::Settings fetchSettings{}; + EvalSettings evalSettings{readOnly}; + evalSettings.nixPath = {}; + evalSettings.tectonixGitDir = "/nonexistent/path/to/repo/.git"; + evalSettings.tectonixGitSha = "0000000000000000000000000000000000000000"; + + EvalState evalState( + LookupPath{}, + openStore("dummy://"), + fetchSettings, + evalSettings, + nullptr + ); + + ASSERT_THROW(evalState.getWorldRepo(), Error); +} + +TEST_F(TectonixTest, treeSha_for_nonexistent_subpath_throws) +{ + auto ctx = createTectonixContext(); + + // Path that doesn't exist in the tree + ASSERT_THROW( + ctx->state->getWorldTreeSha("//areas/tools/dev/nonexistent/deep/path"), + Error); +} + +TEST_F(TectonixTest, manifest_missing_id_field_throws) +{ + // Create a repo with manifest missing id field + std::filesystem::path badRepoPath = createTempDir(); + AutoDelete delBadRepo(badRepoPath, true); + + git_repository * repo = nullptr; + ASSERT_EQ(git_repository_init(&repo, badRepoPath.string().c_str(), 0), 0); + + std::filesystem::create_directories(badRepoPath / ".meta"); + std::ofstream manifest(badRepoPath / ".meta/manifest.json"); + manifest << R"({"//zone": {"name": "test"}})"; // missing "id" field + manifest.close(); + + git_index * index = nullptr; + ASSERT_EQ(git_repository_index(&index, repo), 0); + ASSERT_EQ(git_index_add_bypath(index, ".meta/manifest.json"), 0); + ASSERT_EQ(git_index_write(index), 0); + + git_oid treeOid; + ASSERT_EQ(git_index_write_tree(&treeOid, index), 0); + + git_tree * tree = nullptr; + ASSERT_EQ(git_tree_lookup(&tree, repo, &treeOid), 0); + + git_signature * sig = nullptr; + 
ASSERT_EQ(git_signature_now(&sig, "test", "test@test.com"), 0); + + git_oid commitOid; + ASSERT_EQ(git_commit_create_v(&commitOid, repo, "HEAD", sig, sig, nullptr, "test", tree, 0), 0); + + char sha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(sha, sizeof(sha), &commitOid); + + git_signature_free(sig); + git_tree_free(tree); + git_index_free(index); + git_repository_free(repo); + + // Create EvalState with bad manifest + bool readOnly = true; + fetchers::Settings fetchSettings{}; + EvalSettings evalSettings{readOnly}; + evalSettings.nixPath = {}; + evalSettings.tectonixGitDir = (badRepoPath / ".git").string(); + evalSettings.tectonixGitSha = sha; + + EvalState evalState( + LookupPath{}, + openStore("dummy://"), + fetchSettings, + evalSettings, + nullptr + ); + + // Should throw when accessing manifest + Value v; + Expr * e = evalState.parseExprFromString( + "builtins.unsafeTectonixInternalManifest", + evalState.rootPath(CanonPath::root)); + + ASSERT_THROW({ + evalState.eval(e, v); + evalState.forceValue(v, noPos); + }, Error); +} + +} // namespace nix diff --git a/src/libexpr/eval.cc b/src/libexpr/eval.cc index 46393b79c5ec..800b458ec2b4 100644 --- a/src/libexpr/eval.cc +++ b/src/libexpr/eval.cc @@ -24,7 +24,9 @@ #include "nix/fetchers/fetch-to-store.hh" #include "nix/fetchers/tarball.hh" #include "nix/fetchers/input-cache.hh" +#include "nix/fetchers/git-utils.hh" #include "nix/util/current-process.hh" +#include "nix/util/processes.hh" #include "nix/store/async-path-writer.hh" #include "nix/expr/parallel-eval.hh" @@ -33,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -330,6 +333,7 @@ EvalState::EvalState( , importResolutionCache(make_ref()) , fileEvalCache(make_ref()) , regexCache(makeRegexCache()) + , worldTreeShaCache(make_ref()) #if NIX_USE_BOEHMGC , baseEnvP(std::allocate_shared(traceable_allocator(), &mem.allocEnv(BASE_ENV_SIZE))) , baseEnv(**baseEnvP) @@ -422,6 +426,537 @@ void EvalState::allowAndSetStorePathString(const StorePath 
& storePath, Value & mkStorePathString(storePath, v); }
+ref<GitRepo> EvalState::getWorldRepo() const
+{
+    std::call_once(worldRepoFlag, [this]() {
+        auto gitDir = settings.tectonixGitDir.get();
+        if (gitDir.empty())
+            throw Error("--tectonix-git-dir must be specified to use tectonix builtins");
+
+        // Expand ~ to home directory
+        if (hasPrefix(gitDir, "~/"))
+            gitDir = getHome() + gitDir.substr(1);
+
+        worldRepo = GitRepo::openRepo(std::filesystem::path(gitDir), {.bare = true, .odbOnly = true});
+        debug("opened world repo at %s", gitDir);
+    });
+    return *worldRepo;
+}
+
+const std::string & EvalState::requireTectonixGitSha() const
+{
+    auto & sha = settings.tectonixGitSha.get();
+    if (sha.empty())
+        throw Error("--tectonix-git-sha must be specified to use tectonix builtins");
+    return sha;
+}
+
+ref<SourceAccessor> EvalState::getWorldGitAccessor() const
+{
+    std::call_once(worldGitAccessorFlag, [this]() {
+        auto & sha = requireTectonixGitSha();
+
+        auto repo = getWorldRepo();
+        auto hash = Hash::parseNonSRIUnprefixed(sha, HashAlgorithm::SHA1);
+
+        if (!repo->hasObject(hash))
+            throw Error("tectonix-git-sha '%s' not found in repository", sha);
+
+        // Validate that the SHA is a commit by trying to get its tree.
+        // This gives a clear error if someone accidentally passes a tree or blob SHA.
+        try {
+            repo->getCommitTree(hash);
+        } catch (Error & e) {
+            throw Error("tectonix-git-sha '%s' does not appear to be a valid commit: %s", sha, e.what());
+        }
+
+        // exportIgnore=false: The world accessor is used for path validation and tree SHA
+        // computation, where we need to see all files. Zone accessors (mountZoneByTreeSha,
+        // getZoneStorePath) use exportIgnore=true to honor .gitattributes for actual content.
+        GitAccessorOptions opts{.exportIgnore = false, .smudgeLfs = false};
+        worldGitAccessor = repo->getAccessor(hash, opts, "world");
+        debug("created world accessor at commit %s", sha);
+    });
+    return *worldGitAccessor;
+}
+
+std::optional<ref<SourceAccessor>> EvalState::getWorldCheckoutAccessor() const
+{
+    if (!isTectonixSourceAvailable())
+        return std::nullopt;
+
+    std::call_once(worldCheckoutAccessorFlag, [this]() {
+        // Use the global filesystem accessor with the checkout path as root
+        worldCheckoutAccessor = getFSSourceAccessor();
+    });
+    return *worldCheckoutAccessor;
+}
+
+bool EvalState::isTectonixSourceAvailable() const
+{
+    return !settings.tectonixCheckoutPath.get().empty();
+}
+
+// Helper to normalize zone paths: strip leading // prefix.
+// Zone paths in manifest have // prefix (e.g. //areas/tools/dev);
+// filesystem operations need paths without // (e.g. areas/tools/dev).
+static std::string normalizeZonePath(std::string_view zonePath)
+{
+    std::string path(zonePath);
+    if (hasPrefix(path, "//"))
+        path = path.substr(2);
+    return path;
+}
+
+// Helper to sanitize zone path for use in store path names.
+// Store paths only allow: a-zA-Z0-9 and +-._?=
+// Replaces / with - and any other invalid chars with _
+static std::string sanitizeZoneNameForStore(std::string_view zonePath)
+{
+    auto zone = normalizeZonePath(zonePath);
+    std::string result;
+    result.reserve(zone.size());
+    for (char c : zone) {
+        if (c == '/') {
+            result += '-';
+        } else if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
+                   (c >= 'A' && c <= 'Z') || c == '+' || c == '-' ||
+                   c == '.' || c == '_' || c == '?'
+                   || c == '=') {
+            result += c;
+        } else {
+            result += '_';
+        }
+    }
+    return result;
+}
+
+Hash EvalState::getWorldTreeSha(std::string_view worldPath) const
+{
+    auto path = normalizeZonePath(worldPath);
+
+    // Check cache first
+    if (auto cached = getConcurrent(*worldTreeShaCache, path)) {
+        debug("getWorldTreeSha cache hit for '%s'", path);
+        return *cached;
+    }
+
+    // Compute by walking from root
+    auto repo = getWorldRepo();
+    auto & sha = requireTectonixGitSha();
+    auto commitSha = Hash::parseNonSRIUnprefixed(sha, HashAlgorithm::SHA1);
+
+    // Get the root tree SHA from the commit
+    auto rootTreeSha = repo->getCommitTree(commitSha);
+
+    // Walk path components, caching intermediate results
+    Hash currentSha = rootTreeSha;
+    std::string currentPath;
+
+    // Reuse cached accessor for path validation
+    auto accessor = getWorldGitAccessor();
+
+    for (auto & component : tokenizeString<std::vector<std::string>>(path, "/")) {
+        if (component.empty()) continue;
+        if (component == ".." || component == ".")
+            throw Error("invalid path component '%s' in world path '%s'", component, worldPath);
+
+        std::string nextPath = currentPath.empty() ? component : currentPath + "/" + component;
+
+        // Check if this level is cached
+        if (auto cached = getConcurrent(*worldTreeShaCache, nextPath)) {
+            currentSha = *cached;
+            currentPath = nextPath;
+            continue;
+        }
+
+        // Need to compute: get tree entry for this component
+        auto fullPath = CanonPath("/" + nextPath);
+        auto stat = accessor->maybeLstat(fullPath);
+
+        if (!stat || stat->type != SourceAccessor::Type::tDirectory)
+            throw Error("path '%s' does not exist or is not a directory in world", nextPath);
+
+        // Get the tree SHA for this subtree
+        currentSha = repo->getSubtreeSha(currentSha, component);
+
+        // Cache this level. Note: concurrent threads may compute and insert the same
+        // path simultaneously. This is benign because they will compute the same SHA
+        // (deterministic from git tree), so either insertion succeeds or finds an
+        // equivalent value. We use try_emplace, which is atomic for concurrent_flat_map.
+        worldTreeShaCache->try_emplace(nextPath, currentSha);
+        currentPath = nextPath;
+    }
+
+    debug("getWorldTreeSha computed '%s' -> %s", path, currentSha.gitRev());
+    return currentSha;
+}
+
+const std::set<std::string> & EvalState::getTectonixSparseCheckoutRoots() const
+{
+    std::call_once(tectonixSparseCheckoutRootsFlag, [this]() {
+        if (isTectonixSourceAvailable()) {
+            auto checkoutPath = settings.tectonixCheckoutPath.get();
+
+            // Read .git to find the actual git directory.
+            // It can be either a directory or a file containing "gitdir: <path>"
+            auto dotGitPath = std::filesystem::path(checkoutPath) / ".git";
+            std::filesystem::path gitDir;
+
+            if (std::filesystem::is_directory(dotGitPath)) {
+                gitDir = dotGitPath;
+            } else if (std::filesystem::is_regular_file(dotGitPath)) {
+                auto gitdirContent = readFile(dotGitPath.string());
+                // Parse "gitdir: <path>\n"
+                if (hasPrefix(gitdirContent, "gitdir: ")) {
+                    auto path = trim(gitdirContent.substr(8));
+                    gitDir = std::filesystem::path(path);
+                    // Handle relative paths
+                    if (gitDir.is_relative())
+                        gitDir = std::filesystem::path(checkoutPath) / gitDir;
+                }
+            }
+
+            if (!gitDir.empty()) {
+                // Read sparse-checkout-roots
+                auto sparseRootsPath = gitDir / "info" / "sparse-checkout-roots";
+                if (std::filesystem::exists(sparseRootsPath)) {
+                    auto content = readFile(sparseRootsPath.string());
+                    for (auto & line : tokenizeString<std::vector<std::string>>(content, "\n")) {
+                        auto trimmed = trim(line);
+                        if (!trimmed.empty())
+                            tectonixSparseCheckoutRoots.insert(std::string(trimmed));
+                    }
+                }
+            }
+        }
+    });
+    return tectonixSparseCheckoutRoots;
+}
+
+const std::map<std::string, bool> & EvalState::getTectonixDirtyZones() const
+{
+    std::call_once(tectonixDirtyZonesFlag, [this]() {
+        if (!isTectonixSourceAvailable())
+            return;
+
+        // Get sparse checkout roots (zone IDs)
+        auto & sparseRoots = getTectonixSparseCheckoutRoots();
+        if (sparseRoots.empty())
+            return;
+
+        // Get manifest (uses cached parsed JSON)
+        const nlohmann::json * manifest;
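+        // Illustrative sketch (hypothetical values): the manifest maps zone paths to
+        // metadata, e.g. {"//areas/tools/dev": {"id": "W-123456"}}. If sparse-checkout-roots
+        // lists "W-123456", the loop below records zoneIdToPath["W-123456"] = "//areas/tools/dev",
+        // and that zone starts out marked clean until git status reports changes under it.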
+        try {
+            manifest = &getManifestJson();
+        } catch (nlohmann::json::parse_error & e) {
+            warn("failed to parse manifest for dirty zone detection: %s", e.what());
+            return;
+        } catch (Error &) {
+            // Manifest file not available (e.g., not in world repo)
+            return;
+        }
+
+        // Build map of zone ID -> zone path for sparse roots only
+        std::map<std::string, std::string> zoneIdToPath;
+        for (auto & [path, value] : manifest->items()) {
+            if (!value.contains("id") || !value.at("id").is_string()) {
+                warn("zone '%s' in manifest has missing or non-string 'id' field", path);
+                continue;
+            }
+            auto & id = value.at("id").get_ref<const std::string &>();
+            if (sparseRoots.count(id))
+                zoneIdToPath[id] = path;
+        }
+
+        // Initialize all sparse-checked-out zones as not dirty
+        for (auto & [zoneId, zonePath] : zoneIdToPath) {
+            tectonixDirtyZones[zonePath] = false;
+        }
+
+        // Get dirty files via git status with -z for NUL-separated output.
+        // This handles filenames with special characters correctly.
+        auto checkoutPath = settings.tectonixCheckoutPath.get();
+        std::string gitStatusOutput;
+        try {
+            gitStatusOutput = runProgram("git", true, {"-C", checkoutPath, "status", "--porcelain", "-z"});
+        } catch (ExecError & e) {
+            // If git status fails, treat all zones as clean (fallback).
+            // This ensures call_once completes and we don't retry with partial state.
+            warn("failed to get git status for dirty zone detection in '%s': %s; treating all zones as clean", checkoutPath, e.what());
+            return;
+        }
+
+        // Parse NUL-separated output.
+        // Format with -z: XY SP path NUL [orig-path NUL for renames/copies]
+        size_t pos = 0;
+        while (pos < gitStatusOutput.size()) {
+            // Find the next NUL
+            auto nulPos = gitStatusOutput.find('\0', pos);
+            if (nulPos == std::string::npos)
+                break;
+
+            auto entry = gitStatusOutput.substr(pos, nulPos - pos);
+            pos = nulPos + 1;
+
+            // Git porcelain format: "XY PATH" where XY is the 2-char status, then a space, then the path.
+            // The minimum valid entry is "XY P" (4 chars): 2-char status + space + 1-char path.
+            if (entry.size() < 4)
+                continue;
+
+            // XY is first 2 chars, then space, then path
+            char xy0 = entry[0];
+            std::string rawPath = entry.substr(3);
+
+            // Collect paths to check - destination path is always included
+            std::vector<std::string> pathsToCheck;
+            pathsToCheck.push_back("/" + rawPath);
+
+            // For renames (R) and copies (C), also process the original path:
+            // both source and destination zones should be marked dirty.
+            if (xy0 == 'R' || xy0 == 'C') {
+                auto nextNul = gitStatusOutput.find('\0', pos);
+                if (nextNul != std::string::npos) {
+                    auto origPath = gitStatusOutput.substr(pos, nextNul - pos);
+                    pathsToCheck.push_back("/" + origPath);
+                    pos = nextNul + 1;
+                }
+            }
+
+            // Find which zone(s) these files belong to
+            for (const auto & filePath : pathsToCheck) {
+                for (auto & [zonePath, dirty] : tectonixDirtyZones) {
+                    // Normalize zone path for comparison with git status output:
+                    // filePath is "/areas/..." and zonePath is "//areas/..."
+                    auto normalized = "/" + normalizeZonePath(zonePath);
+
+                    if (hasPrefix(filePath, normalized + "/") || filePath == normalized) {
+                        tectonixDirtyZones[zonePath] = true;
+                        break;
+                    }
+                }
+            }
+        }
+
+        size_t dirtyCount = 0;
+        for (const auto & [_, dirty] : tectonixDirtyZones)
+            if (dirty) dirtyCount++;
+        debug("computed dirty zones: %d of %d zones are dirty", dirtyCount, tectonixDirtyZones.size());
+    });
+    return tectonixDirtyZones;
+}
+
+// Path to the tectonix manifest file within the world repository
+static constexpr std::string_view TECTONIX_MANIFEST_PATH = "/.meta/manifest.json";
+
+const std::string & EvalState::getManifestContent() const
+{
+    // Cached for the lifetime of evaluation. This is intentional: evaluation is
+    // bound to a specific git SHA (tectonix-git-sha), so the manifest content is
+    // immutable for this EvalState instance.
+    std::call_once(tectonixManifestFlag, [this]() {
+        auto fullPath = CanonPath(TECTONIX_MANIFEST_PATH);
+
+        // In source-available mode, check checkout first
+        if (isTectonixSourceAvailable()) {
+            auto checkoutAccessor = getWorldCheckoutAccessor();
+            if (checkoutAccessor) {
+                auto checkoutPath = settings.tectonixCheckoutPath.get();
+                auto checkoutFullPath = CanonPath(checkoutPath + fullPath.abs());
+                if ((*checkoutAccessor)->pathExists(checkoutFullPath)) {
+                    tectonixManifestContent = (*checkoutAccessor)->readFile(checkoutFullPath);
+                    debug("loaded manifest from checkout: %s", checkoutFullPath);
+                    return;
+                }
+            }
+        }
+
+        // Fall back to git
+        auto accessor = getWorldGitAccessor();
+        if (!accessor->pathExists(fullPath))
+            throw Error("manifest.json does not exist at %s in world", TECTONIX_MANIFEST_PATH);
+
+        tectonixManifestContent = accessor->readFile(fullPath);
+        debug("loaded manifest from git at %s", fullPath);
+    });
+    return tectonixManifestContent;
+}
+
+const nlohmann::json & EvalState::getManifestJson() const
+{
+    std::call_once(tectonixManifestJsonFlag, [this]() {
+        tectonixManifestJson = std::make_unique<nlohmann::json>(
+            nlohmann::json::parse(getManifestContent()));
+    });
+    return *tectonixManifestJson;
+}
+
+StorePath EvalState::getZoneStorePath(std::string_view zonePath)
+{
+    // Check dirty status using the original zonePath (with // prefix), since
+    // tectonixDirtyZones keys come directly from the manifest with the // prefix
+    bool isDirty = false;
+    if (isTectonixSourceAvailable()) {
+        auto & dirtyZones = getTectonixDirtyZones();
+        auto it = dirtyZones.find(std::string(zonePath));
+        isDirty = it != dirtyZones.end() && it->second;
+    }
+
+    if (isDirty) {
+        debug("getZoneStorePath: %s is dirty, using checkout", zonePath);
+        return getZoneFromCheckout(zonePath);
+    }
+
+    // Clean zone: get tree SHA
+    auto treeSha = getWorldTreeSha(zonePath);
+
+    if (!settings.lazyTrees) {
+        debug("getZoneStorePath: %s clean, eager copy from git (tree %s)", zonePath, treeSha.gitRev());
+        // Eager mode: immediate copy from git ODB
+        auto repo = getWorldRepo();
+        // exportIgnore=true: honor .gitattributes for zone content (unlike world accessor)
+        GitAccessorOptions opts{.exportIgnore = true, .smudgeLfs = false};
+        auto accessor = repo->getAccessor(treeSha, opts, "zone");
+
+        std::string name = "zone-" + sanitizeZoneNameForStore(zonePath);
+        auto storePath = fetchToStore(
+            fetchSettings, *store,
+            SourcePath(accessor, CanonPath::root),
+            FetchMode::Copy, name);
+
+        allowPath(storePath);
+        return storePath;
+    }
+
+    debug("getZoneStorePath: %s clean, lazy mount (tree %s)", zonePath, treeSha.gitRev());
+    return mountZoneByTreeSha(treeSha, zonePath);
+}
+
+StorePath EvalState::mountZoneByTreeSha(const Hash & treeSha, std::string_view zonePath)
+{
+    // Double-checked locking pattern for concurrent zone mounting:
+    // 1. Read lock check (fast path - allows concurrent readers)
+    {
+        auto cache = tectonixZoneCache_.readLock();
+        auto it = cache->find(treeSha);
+        if (it != cache->end()) {
+            debug("zone cache hit for tree %s", treeSha.gitRev());
+            return it->second;
+        }
+    } // Read lock released
+
+    // 2. Write lock check (catch races between read unlock and write lock)
+    {
+        auto cache = tectonixZoneCache_.lock();
+        auto it = cache->find(treeSha);
+        if (it != cache->end()) {
+            debug("zone cache hit for tree %s (after lock upgrade)", treeSha.gitRev());
+            return it->second;
+        }
+    } // Write lock released - expensive work happens without holding lock
+
+    // 3. Perform expensive git operations without holding lock.
+    //    This allows concurrent mounts of different zones. Multiple threads may
+    //    race to mount the same zone, but we check again before inserting.
+ auto repo = getWorldRepo(); + // exportIgnore=true: honor .gitattributes for zone content (unlike world accessor) + GitAccessorOptions opts{.exportIgnore = true, .smudgeLfs = false}; + auto accessor = repo->getAccessor(treeSha, opts, "zone"); + + // Generate name from zone path (sanitized for store path requirements) + std::string name = "zone-" + sanitizeZoneNameForStore(zonePath); + + // Create virtual store path + auto storePath = StorePath::random(name); + + // 4. Re-acquire write lock and check again before mounting + auto cache = tectonixZoneCache_.lock(); + auto it = cache->find(treeSha); + if (it != cache->end()) { + // Another thread mounted while we were working - use their result + debug("zone cache hit for tree %s (after work)", treeSha.gitRev()); + return it->second; + } + + // Mount accessor at this path first, then allow the path. + // This order ensures we don't leave allowed paths without mounts on exception. + storeFS->mount(CanonPath(store->printStorePath(storePath)), accessor); + allowPath(storePath); + + // Insert into cache (we hold the lock, so this will succeed) + cache->emplace(treeSha, storePath); + + debug("mounted zone %s (tree %s) at %s", + zonePath, treeSha.gitRev(), store->printStorePath(storePath)); + + return storePath; +} + +StorePath EvalState::getZoneFromCheckout(std::string_view zonePath) +{ + auto zone = normalizeZonePath(zonePath); + std::string name = "zone-" + sanitizeZoneNameForStore(zonePath); + auto checkoutPath = settings.tectonixCheckoutPath.get(); + auto fullPath = std::filesystem::path(checkoutPath) / zone; + + if (!settings.lazyTrees) { + // Eager mode: immediate copy from checkout + auto checkoutAccessor = getWorldCheckoutAccessor(); + if (!checkoutAccessor) + throw Error("checkout accessor not available for dirty zone '%s'", zonePath); + + auto storePath = fetchToStore( + fetchSettings, *store, + SourcePath(*checkoutAccessor, CanonPath(checkoutPath + "/" + zone)), + FetchMode::Copy, name); + + 
allowPath(storePath); + return storePath; + } + + // Lazy mode: check cache first with read lock (fast path) + { + auto cache = tectonixCheckoutZoneCache_.readLock(); + auto it = cache->find(std::string(zonePath)); + if (it != cache->end()) { + debug("checkout zone cache hit for %s", zonePath); + return it->second; + } + } + + // Not in cache - acquire write lock and check again (double-checked locking) + // This prevents duplicate mounts when multiple threads race + auto cache = tectonixCheckoutZoneCache_.lock(); + auto it = cache->find(std::string(zonePath)); + if (it != cache->end()) { + debug("checkout zone cache hit for %s (after lock)", zonePath); + return it->second; + } + + // Still not cached: create accessor and mount while holding the lock + if (!std::filesystem::exists(fullPath)) + throw Error("zone '%s' not found in checkout at '%s'", zonePath, fullPath.string()); + + // Note: This mounts the live checkout directory, meaning files are read on-demand + // during evaluation. If the checkout is modified mid-evaluation, behavior is + // undefined. This is analogous to normal file reads and acceptable for local + // development workflows where dirty zones are being actively worked on. + debug("mounting live checkout for dirty zone %s - modifications during evaluation may cause undefined behavior", zonePath); + auto accessor = makeFSSourceAccessor(fullPath); + + // Create virtual store path + auto storePath = StorePath::random(name); + + // Mount accessor at this path first, then allow the path. + // This order ensures we don't leave allowed paths without mounts on exception. 
+    storeFS->mount(CanonPath(store->printStorePath(storePath)), accessor);
+    allowPath(storePath);
+
+    // Insert into cache (we hold the lock, so this will succeed)
+    cache->emplace(std::string(zonePath), storePath);
+
+    debug("mounted checkout zone %s at %s", zonePath, store->printStorePath(storePath));
+    return storePath;
+}
+
 inline static bool isJustSchemePrefix(std::string_view prefix)
 {
     return !prefix.empty() && prefix[prefix.size() - 1] == ':'
diff --git a/src/libexpr/include/nix/expr/eval-settings.hh b/src/libexpr/include/nix/expr/eval-settings.hh
index f367541ec2f6..1e06cab9987e 100644
--- a/src/libexpr/include/nix/expr/eval-settings.hh
+++ b/src/libexpr/include/nix/expr/eval-settings.hh
@@ -399,6 +399,43 @@ struct EvalSettings : Config
       Note that enabling the debugger (`--debugger`) disables multi-threaded
       evaluation.
     )"};
+
+    Setting<std::string> tectonixGitDir{
+        this,
+        "~/world/git",
+        "tectonix-git-dir",
+        R"(
+          Path to the git directory for tectonix builtins (default: `~/world/git`).
+
+          This enables the tectonix builtins (`builtins.unsafeTectonixInternalTreeSha`, `builtins.unsafeTectonixInternalTree`,
+          `builtins.unsafeTectonixInternalZoneSrc`, `builtins.unsafeTectonixInternalZone`, `builtins.unsafeTectonixInternalManifest`)
+          which provide native access to files from a git repository during Nix evaluation.
+        )"};
+
+    Setting<std::string> tectonixGitSha{
+        this,
+        "",
+        "tectonix-git-sha",
+        R"(
+          Git commit SHA to use for git-backed tectonix builtins.
+
+          This specifies the commit to read from when using tectonix builtins that
+          access the world repository by commit (manifest, tree SHA, zone access).
+          It is optional unless those builtins are invoked. Typically set to HEAD
+          of the repository.
+        )"};
+
+    Setting<std::string> tectonixCheckoutPath{
+        this,
+        "",
+        "tectonix-checkout-path",
+        R"(
+          Path to checkout directory for source-available mode.
+
+          When set, uncommitted files in the checkout are preferred over git content
+          for tectonix builtins.
This enables local development workflows where changes
+          are visible before committing.
+        )"};
 };
 
 /**
diff --git a/src/libexpr/include/nix/expr/eval.hh b/src/libexpr/include/nix/expr/eval.hh
index c9cfb1a573bf..754c62d925ec 100644
--- a/src/libexpr/include/nix/expr/eval.hh
+++ b/src/libexpr/include/nix/expr/eval.hh
@@ -24,8 +24,12 @@
 #include
 #include
+#include
+
 #include
+#include
 #include
+#include
 #include
 
 namespace nix {
@@ -52,6 +56,7 @@ enum RepairFlag : bool;
 struct MemorySourceAccessor;
 struct MountedSourceAccessor;
 struct AsyncPathWriter;
+struct GitRepo;
 
 namespace eval_cache {
 class EvalCache;
@@ -513,6 +518,66 @@ private:
      */
     const ref regexCache;
 
+    /** Lazy-initialized git repository for world builtins (thread-safe via once_flag) */
+    mutable std::once_flag worldRepoFlag;
+    mutable std::optional<ref<GitRepo>> worldRepo;
+
+    /** Lazy-initialized source accessor for world git content (thread-safe via once_flag) */
+    mutable std::once_flag worldGitAccessorFlag;
+    mutable std::optional<ref<SourceAccessor>> worldGitAccessor;
+
+    /** Lazy-initialized source accessor for world checkout (thread-safe via once_flag) */
+    mutable std::once_flag worldCheckoutAccessorFlag;
+    mutable std::optional<ref<SourceAccessor>> worldCheckoutAccessor;
+
+    /** Cache: world path → tree SHA (lazy computed, cached at each path level) */
+    const ref<boost::concurrent_flat_map<std::string, Hash>> worldTreeShaCache;
+
+    /** Lazy-initialized set of zone IDs in sparse checkout (thread-safe via once_flag) */
+    mutable std::once_flag tectonixSparseCheckoutRootsFlag;
+    mutable std::set<std::string> tectonixSparseCheckoutRoots;
+
+    /** Lazy-initialized map of zone path → dirty status (thread-safe via once_flag) */
+    mutable std::once_flag tectonixDirtyZonesFlag;
+    mutable std::map<std::string, bool> tectonixDirtyZones;
+
+    /** Cached manifest content (thread-safe via once_flag) */
+    mutable std::once_flag tectonixManifestFlag;
+    mutable std::string tectonixManifestContent;
+
+    /** Cached parsed manifest JSON (thread-safe via once_flag) */
+    mutable std::once_flag tectonixManifestJsonFlag;
+    mutable std::unique_ptr<nlohmann::json>
tectonixManifestJson;
+
+    /**
+     * Cache tree SHA → virtual store path for lazy zone mounts.
+     * Thread-safe for eval-cores > 1.
+     */
+    mutable SharedSync<std::map<Hash, StorePath>> tectonixZoneCache_;
+
+    /**
+     * Cache zone path → virtual store path for lazy checkout zone mounts.
+     * Thread-safe for eval-cores > 1.
+     */
+    mutable SharedSync<std::map<std::string, StorePath>> tectonixCheckoutZoneCache_;
+
+    /**
+     * Mount a zone by tree SHA, returning a (potentially virtual) store path.
+     * Caches by tree SHA for deduplication across world revisions.
+     */
+    StorePath mountZoneByTreeSha(const Hash & treeSha, std::string_view zonePath);
+
+    /**
+     * Get zone store path from checkout (for dirty zones).
+     * With lazy-trees enabled, mounts lazily and caches by zone path.
+     */
+    StorePath getZoneFromCheckout(std::string_view zonePath);
+
+    /**
+     * Return the configured tectonix git SHA, or throw if unset.
+     */
+    const std::string & requireTectonixGitSha() const;
+
 public:
 
     /**
@@ -544,6 +609,52 @@ public:
         return lookupPath;
     }
 
+    /** Get the world git repository, initializing lazily */
+    ref<GitRepo> getWorldRepo() const;
+
+    /**
+     * Get accessor for world git content at the configured tectonix git SHA.
+     *
+     * exportIgnore policy for tectonix accessors:
+     * - World accessor (getWorldGitAccessor): exportIgnore=false
+     *   Used for path validation and tree SHA computation; needs to see all files
+     * - Zone accessors (mountZoneByTreeSha, getZoneStorePath): exportIgnore=true
+     *   Used for actual zone content; honors .gitattributes for filtered output
+     * - Raw tree accessor (__unsafeTectonixInternalTree): exportIgnore=false
+     *   Low-level access by SHA; provides unfiltered content
+     */
+    ref<SourceAccessor> getWorldGitAccessor() const;
+
+    /** Get accessor for world checkout (only in source-available mode) */
+    std::optional<ref<SourceAccessor>> getWorldCheckoutAccessor() const;
+
+    /** Get tree SHA for a world path, with lazy caching */
+    Hash getWorldTreeSha(std::string_view worldPath) const;
+
+    /** Check if we're in source-available mode */
+    bool isTectonixSourceAvailable() const;
+
+    /** Get set of zone IDs in sparse checkout (source-available mode only) */
+    const std::set<std::string> & getTectonixSparseCheckoutRoots() const;
+
+    /** Get map of zone path → dirty status (only for sparse-checked-out zones) */
+    const std::map<std::string, bool> & getTectonixDirtyZones() const;
+
+    /** Get cached manifest content (thread-safe, lazy-loaded) */
+    const std::string & getManifestContent() const;
+
+    /** Get cached parsed manifest JSON (thread-safe, lazy-loaded) */
+    const nlohmann::json & getManifestJson() const;
+
+    /**
+     * Get a zone's store path, handling dirty detection and lazy mounting.
+     *
+     * For clean zones with lazy-trees enabled: mounts accessor lazily
+     * For dirty zones: reads from the checkout (mounted lazily when lazy-trees is enabled)
+     * With lazy-trees disabled: eager-copies from git
+     */
+    StorePath getZoneStorePath(std::string_view zonePath);
+
     /**
      * Return a `SourcePath` that refers to `path` in the root
      * filesystem.
diff --git a/src/libexpr/primops/meson.build b/src/libexpr/primops/meson.build
index b8abc6409af9..5d948a49c3f9 100644
--- a/src/libexpr/primops/meson.build
+++ b/src/libexpr/primops/meson.build
@@ -9,4 +9,5 @@ sources += files(
   'fetchMercurial.cc',
   'fetchTree.cc',
   'fromTOML.cc',
+  'tectonix.cc',
 )
diff --git a/src/libexpr/primops/tectonix.cc b/src/libexpr/primops/tectonix.cc
new file mode 100644
index 000000000000..a5d2f7a97288
--- /dev/null
+++ b/src/libexpr/primops/tectonix.cc
@@ -0,0 +1,350 @@
+#include "nix/expr/primops.hh"
+#include "nix/expr/eval-inline.hh"
+#include "nix/expr/eval-settings.hh"
+#include "nix/fetchers/git-utils.hh"
+#include "nix/store/store-api.hh"
+#include "nix/fetchers/fetch-to-store.hh"
+
+#include <nlohmann/json.hpp>
+#include <set>
+
+namespace nix {
+
+// Helper to get cached manifest JSON (avoids repeated parsing)
+static const nlohmann::json & getManifest(EvalState & state)
+{
+    return state.getManifestJson();
+}
+
+// Helper to validate that a zone path exists in the manifest
+static void validateZonePath(EvalState & state, const PosIdx pos, std::string_view zonePath)
+{
+    auto & manifest = getManifest(state);
+    if (!manifest.contains(std::string(zonePath)))
+        state.error<EvalError>("'%s' is not a zone root (must be an exact path from the manifest)", zonePath)
+            .atPos(pos).debugThrow();
+}
+
+// ============================================================================
+// builtins.worldManifest
+// Returns path -> zone metadata mapping from //.meta/manifest.json
+// ============================================================================
+static void prim_worldManifest(EvalState & state, const PosIdx pos, Value ** args, Value & v)
+{
+    auto & json = getManifest(state);
+
+    auto attrs = state.buildBindings(json.size());
+    for (auto & [path, value] : json.items()) {
+        if (!value.contains("id") || !value.at("id").is_string())
+            throw Error("zone '%s' in manifest has missing or non-string 'id' field", path);
+        auto idStr = value.at("id").get<std::string>();
+
+        auto zoneAttrs =
state.buildBindings(1);
+        zoneAttrs.alloc("id").mkString(idStr, state.mem);
+        attrs.alloc(state.symbols.create(path)).mkAttrs(zoneAttrs);
+    }
+    v.mkAttrs(attrs);
+}
+
+static RegisterPrimOp primop_worldManifest({
+    .name = "__unsafeTectonixInternalManifest",
+    .args = {},
+    .doc = R"(
+      Get the world manifest as a Nix attrset mapping zone paths to zone metadata.
+
+      Example: `builtins.unsafeTectonixInternalManifest."//areas/tools/dev".id` returns `"W-123456"`.
+
+      Uses `--tectonix-git-dir` (defaults to `~/world/git`) and requires
+      `--tectonix-git-sha` to be set.
+    )",
+    .fun = prim_worldManifest,
+});
+
+// ============================================================================
+// builtins.worldManifestInverted
+// Returns zoneId -> path mapping (inverse of worldManifest)
+// ============================================================================
+static void prim_worldManifestInverted(EvalState & state, const PosIdx pos, Value ** args, Value & v)
+{
+    auto & json = getManifest(state);
+
+    // Track seen IDs to detect duplicates
+    std::set<std::string> seenIds;
+
+    auto attrs = state.buildBindings(json.size());
+    for (auto & [path, value] : json.items()) {
+        if (!value.contains("id") || !value.at("id").is_string())
+            throw Error("zone '%s' in manifest has missing or non-string 'id' field", path);
+        auto idStr = value.at("id").get<std::string>();
+
+        if (!seenIds.insert(idStr).second)
+            throw Error("duplicate zone ID '%s' in manifest (zone '%s')", idStr, path);
+
+        attrs.alloc(state.symbols.create(idStr)).mkString(path, state.mem);
+    }
+    v.mkAttrs(attrs);
+}
+
+static RegisterPrimOp primop_worldManifestInverted({
+    .name = "__unsafeTectonixInternalManifestInverted",
+    .args = {},
+    .doc = R"(
+      Get the inverted world manifest as a Nix attrset mapping zone IDs to zone paths.
+
+      Example: `builtins.unsafeTectonixInternalManifestInverted."W-123456"` returns `"//areas/tools/dev"`.
+
+      Uses `--tectonix-git-dir` (defaults to `~/world/git`) and requires
+      `--tectonix-git-sha` to be set.
+ )", + .fun = prim_worldManifestInverted, +}); + +// ============================================================================ +// builtins.unsafeTectonixInternalTreeSha worldPath +// Returns the git tree SHA for a world path +// ============================================================================ +static void prim_unsafeTectonixInternalTreeSha(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto worldPath = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'worldPath' argument to builtins.unsafeTectonixInternalTreeSha"); + + auto sha = state.getWorldTreeSha(worldPath); + v.mkString(sha.gitRev(), state.mem); +} + +static RegisterPrimOp primop_unsafeTectonixInternalTreeSha({ + .name = "__unsafeTectonixInternalTreeSha", + .args = {"worldPath"}, + .doc = R"( + Get the git tree SHA for a path in the world repository. + + Example: `builtins.unsafeTectonixInternalTreeSha "//areas/tools/tec"` returns the tree SHA + for that zone. + + Uses `--tectonix-git-dir` (defaults to `~/world/git`) and requires + `--tectonix-git-sha` to be set. + )", + .fun = prim_unsafeTectonixInternalTreeSha, +}); + +// ============================================================================ +// builtins.unsafeTectonixInternalTree treeSha +// Returns a store path containing the tree contents +// ============================================================================ +static void prim_unsafeTectonixInternalTree(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto treeSha = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'treeSha' argument to builtins.unsafeTectonixInternalTree"); + + auto repo = state.getWorldRepo(); + auto hash = Hash::parseNonSRIUnprefixed(treeSha, HashAlgorithm::SHA1); + + if (!repo->hasObject(hash)) + state.error("tree SHA '%s' not found in world repository", treeSha) + .atPos(pos).debugThrow(); + + // exportIgnore=false: This is raw tree access by SHA, used for low-level operations. 
+ // Unlike zone accessors (which use exportIgnore=true to honor .gitattributes for + // filtered zone content), this provides unfiltered access to exact tree contents. + GitAccessorOptions opts{.exportIgnore = false, .smudgeLfs = false}; + auto accessor = repo->getAccessor(hash, opts, "world-tree"); + + auto storePath = fetchToStore( + state.fetchSettings, + *state.store, + SourcePath(accessor, CanonPath::root), + FetchMode::Copy, + "world-tree-" + std::string(treeSha).substr(0, 12)); + + state.allowAndSetStorePathString(storePath, v); +} + +static RegisterPrimOp primop_unsafeTectonixInternalTree({ + .name = "__unsafeTectonixInternalTree", + .args = {"treeSha"}, + .doc = R"( + Fetch a git tree by SHA from the world repository and return it as a store path. + + Example: `builtins.unsafeTectonixInternalTree "abc123..."` returns `/nix/store/...-world-tree-abc123`. + + Uses `--tectonix-git-dir` (defaults to `~/world/git`). + )", + .fun = prim_unsafeTectonixInternalTree, +}); + +// ============================================================================ +// builtins.unsafeTectonixInternalZoneSrc zonePath +// Returns a store path containing the zone source +// With lazy-trees enabled, returns a virtual store path that is only +// materialized when used as a derivation input. +// ============================================================================ +static void prim_unsafeTectonixInternalZoneSrc(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto zonePath = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'zonePath' argument to builtins.unsafeTectonixInternalZoneSrc"); + + validateZonePath(state, pos, zonePath); + + auto storePath = state.getZoneStorePath(zonePath); + state.allowAndSetStorePathString(storePath, v); +} + +static RegisterPrimOp primop_unsafeTectonixInternalZoneSrc({ + .name = "__unsafeTectonixInternalZoneSrc", + .args = {"zonePath"}, + .doc = R"( + Get the source of a zone as a store path. 
+ + With `lazy-trees = true`, returns a virtual store path that is only + materialized when used as a derivation input (devirtualized). + + In source-available mode with uncommitted changes, uses checkout content + (always eager for dirty zones). + + Example: `builtins.unsafeTectonixInternalZoneSrc "//areas/tools/tec"` + + Uses `--tectonix-git-dir` (defaults to `~/world/git`) and requires + `--tectonix-git-sha` to be set. + )", + .fun = prim_unsafeTectonixInternalZoneSrc, +}); + +// ============================================================================ +// builtins.unsafeTectonixInternalSparseCheckoutRoots +// Returns list of zone IDs in sparse checkout +// ============================================================================ +static void prim_unsafeTectonixInternalSparseCheckoutRoots(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto & roots = state.getTectonixSparseCheckoutRoots(); + + auto list = state.buildList(roots.size()); + size_t i = 0; + for (const auto & root : roots) { + (list[i++] = state.allocValue())->mkString(root, state.mem); + } + v.mkList(list); +} + +static RegisterPrimOp primop_unsafeTectonixInternalSparseCheckoutRoots({ + .name = "__unsafeTectonixInternalSparseCheckoutRoots", + .args = {}, + .doc = R"( + Get the list of zone IDs that are in the sparse checkout. + + Returns an empty list if not in source-available mode or if no + sparse-checkout-roots file exists. + + Example: `builtins.unsafeTectonixInternalSparseCheckoutRoots` returns `["W-000000" "W-1337af" ...]`. + + Requires `--tectonix-checkout-path` to be set. 
+ )", + .fun = prim_unsafeTectonixInternalSparseCheckoutRoots, +}); + +// ============================================================================ +// builtins.unsafeTectonixInternalDirtyZones +// Returns map of zone paths to dirty status +// ============================================================================ +static void prim_unsafeTectonixInternalDirtyZones(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto & dirtyZones = state.getTectonixDirtyZones(); + + auto attrs = state.buildBindings(dirtyZones.size()); + for (const auto & [zonePath, dirty] : dirtyZones) { + attrs.alloc(state.symbols.create(zonePath)).mkBool(dirty); + } + v.mkAttrs(attrs); +} + +static RegisterPrimOp primop_unsafeTectonixInternalDirtyZones({ + .name = "__unsafeTectonixInternalDirtyZones", + .args = {}, + .doc = R"( + Get the dirty status of zones in the sparse checkout. + + Returns an attrset mapping zone paths to booleans indicating whether + the zone has uncommitted changes. + + Only includes zones that are in the sparse checkout. + + Example: `builtins.unsafeTectonixInternalDirtyZones."//areas/tools/dev"` returns `true` or `false`. + + Requires `--tectonix-checkout-path` to be set. 
+ )", + .fun = prim_unsafeTectonixInternalDirtyZones, +}); + +// ============================================================================ +// builtins.__unsafeTectonixInternalZone zonePath +// Returns an attrset with zone info (flake-like interface) +// ============================================================================ +static void prim_unsafeTectonixInternalZone(EvalState & state, const PosIdx pos, Value ** args, Value & v) +{ + auto zonePath = state.forceStringNoCtx(*args[0], pos, + "while evaluating the 'zonePath' argument to builtins.__unsafeTectonixInternalZone"); + + validateZonePath(state, pos, zonePath); + + // Get tree SHA before we potentially fetch + auto treeSha = state.getWorldTreeSha(zonePath); + + // Check dirty status + bool isDirty = false; + if (state.isTectonixSourceAvailable()) { + auto & dirtyZones = state.getTectonixDirtyZones(); + auto it = dirtyZones.find(std::string(zonePath)); + isDirty = it != dirtyZones.end() && it->second; + } + + auto storePath = state.getZoneStorePath(zonePath); + auto storePathStr = state.store->printStorePath(storePath); + + // Build result attrset (like fetchTree) + auto attrs = state.buildBindings(5); + + // outPath: string with context (for use as derivation src) + attrs.alloc("outPath").mkString(storePathStr, { + NixStringContextElem::Opaque{storePath} + }, state.mem); + + // root: path value (for reading files without devirtualization) + attrs.alloc("root").mkPath( + state.rootPath(CanonPath(storePathStr)), state.mem); + + attrs.alloc("treeSha").mkString(treeSha.gitRev(), state.mem); + attrs.alloc("zonePath").mkString(zonePath, state.mem); + attrs.alloc("dirty").mkBool(isDirty); + + v.mkAttrs(attrs); +} + +static RegisterPrimOp primop_unsafeTectonixInternalZone({ + .name = "__unsafeTectonixInternalZone", + .args = {"zonePath"}, + .doc = R"( + Get a zone from the world repository. 
+ + Returns an attrset with: + - outPath: Store path string with context (for use as derivation src) + - root: Path value for reading files (no devirtualization) + - treeSha: Git tree SHA for this zone + - zonePath: The zone path argument + - dirty: Whether the zone has uncommitted changes + + With `lazy-trees = true`, the zone is mounted lazily. Use `root` to + read files without triggering a copy to the store: + + let zone = builtins.__unsafeTectonixInternalZone "//areas/tools/tec"; + in import (zone.root + "/zone.nix") + + Use `outPath` as derivation src (triggers copy at build time): + + mkDerivation { src = zone.outPath; } + + Uses `--tectonix-git-dir` (defaults to `~/world/git`) and requires + `--tectonix-git-sha` to be set. + )", + .fun = prim_unsafeTectonixInternalZone, +}); + +} // namespace nix diff --git a/src/libfetchers-tests/git-utils.cc b/src/libfetchers-tests/git-utils.cc index 0b21fd0c67d5..3f3474670ac5 100644 --- a/src/libfetchers-tests/git-utils.cc +++ b/src/libfetchers-tests/git-utils.cc @@ -14,6 +14,7 @@ #include #include +#include namespace nix { @@ -174,6 +175,259 @@ TEST_F(GitUtilsTest, peel_reference) git_repository_free(rawRepo); } +// ============================================================================ +// Tests for odbOnly mode (Phase 2) +// ============================================================================ + +TEST_F(GitUtilsTest, odbOnly_opens_repository) +{ + // First create some content in the repo + git_repository * rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + // Create a blob + git_oid blob_oid; + const char * blob_content = "test content"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + + git_repository_free(rawRepo); + + // Now open with odbOnly=true (must use .git directory) + auto gitDir = tmpDir / ".git"; + auto repo = GitRepo::openRepo(gitDir, {.odbOnly = true}); + ASSERT_NE(&*repo, nullptr); + + // Should 
be able to check if object exists + char sha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(sha, sizeof(sha), &blob_oid); + auto hash = Hash::parseNonSRIUnprefixed(sha, HashAlgorithm::SHA1); + ASSERT_TRUE(repo->hasObject(hash)); +} + +TEST_F(GitUtilsTest, odbOnly_fails_without_objects_dir) +{ + // Create a path that doesn't have a git objects directory + auto nonExistentPath = tmpDir / "nonexistent"; + + ASSERT_THROW( + GitRepo::openRepo(nonExistentPath, {.odbOnly = true}), + Error); +} + +TEST_F(GitUtilsTest, odbOnly_accesses_objects_directly) +{ + // Create a repo and verify odbOnly can access its objects directly + git_repository * rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + // Create a blob + git_oid blob_oid; + const char * blob_content = "test content for odbOnly direct access"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + + // Create a tree with the blob + git_treebuilder * builder = nullptr; + ASSERT_EQ(git_treebuilder_new(&builder, rawRepo, nullptr), 0); + ASSERT_EQ(git_treebuilder_insert(nullptr, builder, "test.txt", &blob_oid, GIT_FILEMODE_BLOB), 0); + + git_oid treeOid; + ASSERT_EQ(git_treebuilder_write(&treeOid, builder), 0); + git_treebuilder_free(builder); + + git_repository_free(rawRepo); + + // Open with odbOnly and verify we can access objects + auto gitDir = tmpDir / ".git"; + auto repo = GitRepo::openRepo(gitDir, {.odbOnly = true}); + ASSERT_NE(&*repo, nullptr); + + // Verify blob exists + char blobSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(blobSha, sizeof(blobSha), &blob_oid); + auto blobHash = Hash::parseNonSRIUnprefixed(blobSha, HashAlgorithm::SHA1); + ASSERT_TRUE(repo->hasObject(blobHash)); + + // Verify tree exists + char treeSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(treeSha, sizeof(treeSha), &treeOid); + auto treeHash = Hash::parseNonSRIUnprefixed(treeSha, HashAlgorithm::SHA1); + ASSERT_TRUE(repo->hasObject(treeHash)); +} + +// 
============================================================================ +// Tests for getSubtreeSha (Phase 2) +// ============================================================================ + +TEST_F(GitUtilsTest, getSubtreeSha_finds_entry) +{ + git_repository * rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + // Create a blob for file content + git_oid blob_oid; + const char * blob_content = "file content"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + + // Create inner tree (subdir) + git_treebuilder * innerBuilder = nullptr; + ASSERT_EQ(git_treebuilder_new(&innerBuilder, rawRepo, nullptr), 0); + ASSERT_EQ(git_treebuilder_insert(nullptr, innerBuilder, "file.txt", &blob_oid, GIT_FILEMODE_BLOB), 0); + + git_oid innerTreeOid; + ASSERT_EQ(git_treebuilder_write(&innerTreeOid, innerBuilder), 0); + git_treebuilder_free(innerBuilder); + + // Create outer tree with subdir + git_treebuilder * outerBuilder = nullptr; + ASSERT_EQ(git_treebuilder_new(&outerBuilder, rawRepo, nullptr), 0); + ASSERT_EQ(git_treebuilder_insert(nullptr, outerBuilder, "subdir", &innerTreeOid, GIT_FILEMODE_TREE), 0); + + git_oid outerTreeOid; + ASSERT_EQ(git_treebuilder_write(&outerTreeOid, outerBuilder), 0); + git_treebuilder_free(outerBuilder); + + git_repository_free(rawRepo); + + // Now test getSubtreeSha + auto repo = openRepo(); + + char outerSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(outerSha, sizeof(outerSha), &outerTreeOid); + auto outerHash = Hash::parseNonSRIUnprefixed(outerSha, HashAlgorithm::SHA1); + + char innerSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(innerSha, sizeof(innerSha), &innerTreeOid); + auto expectedInnerHash = Hash::parseNonSRIUnprefixed(innerSha, HashAlgorithm::SHA1); + + auto resultHash = repo->getSubtreeSha(outerHash, "subdir"); + ASSERT_EQ(resultHash, expectedInnerHash); +} + +TEST_F(GitUtilsTest, getSubtreeSha_missing_entry_throws) +{ + git_repository * 
rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + // Create empty tree + git_treebuilder * builder = nullptr; + ASSERT_EQ(git_treebuilder_new(&builder, rawRepo, nullptr), 0); + + // Add a dummy entry so tree isn't empty + git_oid blob_oid; + const char * blob_content = "x"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + ASSERT_EQ(git_treebuilder_insert(nullptr, builder, "existing", &blob_oid, GIT_FILEMODE_BLOB), 0); + + git_oid treeOid; + ASSERT_EQ(git_treebuilder_write(&treeOid, builder), 0); + git_treebuilder_free(builder); + git_repository_free(rawRepo); + + auto repo = openRepo(); + + char sha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(sha, sizeof(sha), &treeOid); + auto treeHash = Hash::parseNonSRIUnprefixed(sha, HashAlgorithm::SHA1); + + ASSERT_THROW(repo->getSubtreeSha(treeHash, "nonexistent"), Error); +} + +// ============================================================================ +// Tests for getCommitTree (Phase 2) +// ============================================================================ + +TEST_F(GitUtilsTest, getCommitTree_returns_root_tree) +{ + git_repository * rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + // Create a blob + git_oid blob_oid; + const char * blob_content = "content"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + + // Create a tree + git_treebuilder * builder = nullptr; + ASSERT_EQ(git_treebuilder_new(&builder, rawRepo, nullptr), 0); + ASSERT_EQ(git_treebuilder_insert(nullptr, builder, "file.txt", &blob_oid, GIT_FILEMODE_BLOB), 0); + + git_oid treeOid; + ASSERT_EQ(git_treebuilder_write(&treeOid, builder), 0); + git_treebuilder_free(builder); + + git_tree * tree = nullptr; + ASSERT_EQ(git_tree_lookup(&tree, rawRepo, &treeOid), 0); + + // Create a commit + git_signature * sig = nullptr; + ASSERT_EQ(git_signature_now(&sig, 
"test", "test@example.com"), 0); + + git_oid commitOid; + ASSERT_EQ(git_commit_create_v(&commitOid, rawRepo, "HEAD", sig, sig, nullptr, "test commit", tree, 0), 0); + + git_signature_free(sig); + git_tree_free(tree); + git_repository_free(rawRepo); + + // Now test getCommitTree + auto repo = openRepo(); + + char commitSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(commitSha, sizeof(commitSha), &commitOid); + auto commitHash = Hash::parseNonSRIUnprefixed(commitSha, HashAlgorithm::SHA1); + + char treeSha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(treeSha, sizeof(treeSha), &treeOid); + auto expectedTreeHash = Hash::parseNonSRIUnprefixed(treeSha, HashAlgorithm::SHA1); + + auto resultHash = repo->getCommitTree(commitHash); + ASSERT_EQ(resultHash, expectedTreeHash); +} + +TEST_F(GitUtilsTest, getCommitTree_invalid_sha_throws) +{ + auto repo = openRepo(); + + // Use a SHA that doesn't exist + auto invalidHash = Hash::parseNonSRIUnprefixed( + "0000000000000000000000000000000000000000", HashAlgorithm::SHA1); + + ASSERT_THROW(repo->getCommitTree(invalidHash), Error); +} + +// ============================================================================ +// Tests for hasObject +// ============================================================================ + +TEST_F(GitUtilsTest, hasObject_returns_true_for_existing) +{ + git_repository * rawRepo = nullptr; + ASSERT_EQ(git_repository_open(&rawRepo, tmpDir.string().c_str()), 0); + + git_oid blob_oid; + const char * blob_content = "test"; + ASSERT_EQ(git_blob_create_from_buffer(&blob_oid, rawRepo, blob_content, strlen(blob_content)), 0); + git_repository_free(rawRepo); + + auto repo = openRepo(); + + char sha[GIT_OID_SHA1_HEXSIZE + 1]; + git_oid_tostr(sha, sizeof(sha), &blob_oid); + auto hash = Hash::parseNonSRIUnprefixed(sha, HashAlgorithm::SHA1); + + ASSERT_TRUE(repo->hasObject(hash)); +} + +TEST_F(GitUtilsTest, hasObject_returns_false_for_missing) +{ + auto repo = openRepo(); + + auto missingHash = Hash::parseNonSRIUnprefixed( 
+ "0000000000000000000000000000000000000000", HashAlgorithm::SHA1); + + ASSERT_FALSE(repo->hasObject(missingHash)); +} + TEST(GitUtils, isLegalRefName) { ASSERT_TRUE(isLegalRefName("A/b")); diff --git a/src/libfetchers/git-utils.cc b/src/libfetchers/git-utils.cc index f21313a10404..840f0ae9f7ec 100644 --- a/src/libfetchers/git-utils.cc +++ b/src/libfetchers/git-utils.cc @@ -108,6 +108,13 @@ static void initLibGit2() std::call_once(initialized, []() { if (git_libgit2_init() < 0) throw Error("initialising libgit2: %s", git_error_last()->message); + + // Register support for additional git extensions. + // This allows opening repos with extensions that libgit2 doesn't natively support, + // as long as we don't actually need the extension's functionality. + // "refstorage" is used by reftables - we can ignore it since we only access objects by SHA. + const char * extensions[] = { "refstorage" }; + git_libgit2_opts(GIT_OPT_SET_EXTENSIONS, extensions, 1); }); } @@ -265,6 +272,29 @@ struct GitRepoImpl : GitRepo, std::enable_shared_from_this<GitRepoImpl> { initLibGit2(); + if (options.odbOnly) { + /* Open only the object database, bypassing full repository validation. This is useful for repositories with unsupported extensions like reftables. We create a fake repository wrapping the ODB for API compatibility. */ + + git_odb * rawOdb = nullptr; + if (git_odb_open(&rawOdb, (path / "objects").string().c_str())) + throw Error("opening Git object database %s: %s", path / "objects", git_error_last()->message); + + // Use RAII to ensure cleanup on any exception path + ObjectDb odb(rawOdb); + + if (git_repository_wrap_odb(Setter(repo), odb.get())) + throw Error("wrapping Git object database: %s", git_error_last()->message); + + // wrap_odb took ownership on success, release from unique_ptr to prevent double-free + odb.release(); + + // odbOnly mode is strictly read-only: no mempack backend, no write support. + // Attempting to write objects in this mode will fail.
+ return; + } + initRepoAtomically(path, options); if (git_repository_open(Setter(repo), path.string().c_str())) throw Error("opening Git repository %s: %s", path, git_error_last()->message); @@ -595,6 +625,34 @@ struct GitRepoImpl : GitRepo, std::enable_shared_from_this<GitRepoImpl> return true; } + Hash getSubtreeSha(const Hash & treeSha, const std::string & entryName) override + { + git_tree * tree = nullptr; + auto oid = hashToOID(treeSha); + + if (git_tree_lookup(&tree, *this, &oid)) + throw Error("looking up tree %s: %s", treeSha.gitRev(), git_error_last()->message); + + Finally freeTree([&]() { git_tree_free(tree); }); + + auto entry = git_tree_entry_byname(tree, entryName.c_str()); + if (!entry) + throw Error("entry '%s' not found in tree %s", entryName, treeSha.gitRev()); + + if (git_tree_entry_type(entry) != GIT_OBJECT_TREE) + throw Error("'%s' in tree %s is not a directory", entryName, treeSha.gitRev()); + + return toHash(*git_tree_entry_id(entry)); + } + + Hash getCommitTree(const Hash & commitSha) override + { + auto oid = hashToOID(commitSha); + auto obj = lookupObject(*this, oid); + auto tree = peelObject(obj.get(), GIT_OBJECT_TREE); + return toHash(*git_object_id(tree.get())); + } + /** * A 'GitSourceAccessor' with no regard for export-ignore. */ @@ -856,7 +914,9 @@ struct GitSourceAccessor : SourceAccessor return Stat{.type = tSymlink}; else if (mode == GIT_FILEMODE_COMMIT) - // Treat submodules as an empty directory. + // Submodules appear as commits (GIT_FILEMODE_COMMIT) in the parent tree. + // We report them as directories so listing works, but they appear empty + // since we don't recursively fetch submodule content.
return Stat{.type = tDirectory}; else @@ -876,8 +936,18 @@ struct GitSourceAccessor : SourceAccessor for (size_t n = 0; n < count; ++n) { auto entry = git_tree_entry_byindex(tree.get(), n); + auto mode = git_tree_entry_filemode(entry); + std::optional<Type> type; + if (mode == GIT_FILEMODE_TREE) + type = Type::tDirectory; + else if (mode == GIT_FILEMODE_BLOB || mode == GIT_FILEMODE_BLOB_EXECUTABLE) + type = Type::tRegular; + else if (mode == GIT_FILEMODE_LINK) + type = Type::tSymlink; + else if (mode == GIT_FILEMODE_COMMIT) + type = Type::tDirectory; // submodule (appears empty, see lstat() comment) // FIXME: add to cache - res.emplace(std::string(git_tree_entry_name(entry)), DirEntry{}); + res.emplace(std::string(git_tree_entry_name(entry)), type); } return res; diff --git a/src/libfetchers/git.cc b/src/libfetchers/git.cc index 7f33d9d8c606..dd9cb5c307de 100644 --- a/src/libfetchers/git.cc +++ b/src/libfetchers/git.cc @@ -795,6 +795,51 @@ struct GitInputScheme : InputScheme }; } + /** + * Try to serve a git input from the cache using (rev, url) as the + * cache key. Returns nullopt on cache miss. + */ + std::optional<std::pair<ref<SourceAccessor>, Input>> + getAccessorFromCache(const Settings & settings, Store & store, const Input & input) const + { + auto rev = input.getRev(); + if (!rev) return std::nullopt; + + auto url = getStrAttr(input.attrs, "url"); + + Cache::Key cacheKey{"gitRevUrl", { + {"rev", rev->gitRev()}, + {"url", url}, + {"submodules", getSubmodulesAttr(input) ? "1" : "0"}, + {"exportIgnore", getExportIgnoreAttr(input) ? "1" : "0"}, + {"lfs", getLfsAttr(input) ?
"1" : "0"}, + }}; + + auto cached = settings.getCache()->lookupStorePath(cacheKey, store); + if (!cached) + return std::nullopt; + + debug("using cached store path for git input '%s'", input.to_string()); + + auto accessor = store.requireStoreObjectAccessor(cached->storePath); + + auto options = getGitAccessorOptions(input); + auto fp = options.makeFingerprint(*rev); + if (options.submodules) + fp += ";s"; + accessor->fingerprint = fp; + accessor->setPathDisplay("«" + input.to_string(true) + "»"); + + Input result(input); + + if (auto lm = maybeGetIntAttr(cached->value, "lastModified")) + result.attrs.insert_or_assign("lastModified", *lm); + if (auto rc = maybeGetIntAttr(cached->value, "revCount")) + result.attrs.insert_or_assign("revCount", *rc); + + return std::make_pair(accessor, std::move(result)); + } + /** * Get a `SourceAccessor` for the given Git revision using Nix < 2.20 semantics, i.e. using `git archive` or `git * checkout`. @@ -997,20 +1042,30 @@ struct GitInputScheme : InputScheme auto accessor = repo->getAccessor(rev, options, "«" + input.to_string(true) + "»"); + /* Track whether the legacy (git archive) fallback was used. If so, + we must not cache the result in gitRevUrl, because the legacy + store path has different content (exportIgnore/CRLF/export-subst + applied) than what the cache key (exportIgnore=0) implies. Caching + it would poison subsequent modern fetches of the same rev. */ + bool usedLegacyFallback = false; + if (settings.nix219Compat && !options.smudgeLfs && accessor->pathExists(CanonPath(".gitattributes"))) { /* Use Nix 2.19 semantics to generate locks, but if a NAR hash is specified, support Nix >= 2.20 semantics * as well. 
*/ warn("Using Nix 2.19 semantics to export Git repository '%s'.", input.to_string()); auto accessorModern = accessor; accessor = getLegacyGitAccessor(store, repoInfo, repoDir, rev, options); + usedLegacyFallback = true; if (expectedNarHash) { auto narHashLegacy = fetchToStore2(settings, store, {accessor}, FetchMode::DryRun, input.getName()).second; if (expectedNarHash != narHashLegacy) { auto narHashModern = fetchToStore2(settings, store, {accessorModern}, FetchMode::DryRun, input.getName()).second; - if (expectedNarHash == narHashModern) + if (expectedNarHash == narHashModern) { accessor = accessorModern; + usedLegacyFallback = false; + } } } } else { @@ -1032,6 +1087,7 @@ struct GitInputScheme : InputScheme expectedNarHash->to_string(HashFormat::SRI, true), narHashNew.to_string(HashFormat::SRI, true)); accessor = accessorLegacy; + usedLegacyFallback = true; } } } @@ -1084,6 +1140,32 @@ struct GitInputScheme : InputScheme } } + // Cache for future rev+url lookups. + // Skip caching when the legacy fallback was used, because the + // legacy store path has exportIgnore/CRLF/export-subst applied + // and would not match the cache key (which says exportIgnore=0). + if (!usedLegacyFallback) { + auto url = getStrAttr(input.attrs, "url"); + auto [storePath, _narHash] = fetchToStore2( + settings, store, {accessor}, FetchMode::DryRun, input.getName()); + + Cache::Key cacheKey{"gitRevUrl", { + {"rev", rev.gitRev()}, + {"url", url}, + {"submodules", getSubmodulesAttr(input) ? "1" : "0"}, + {"exportIgnore", getExportIgnoreAttr(input) ? "1" : "0"}, + {"lfs", getLfsAttr(input) ? 
"1" : "0"}, + }}; + + Attrs cacheValue; + if (auto lm = input.getLastModified()) + cacheValue.insert_or_assign("lastModified", uint64_t(*lm)); + if (auto rc = input.getRevCount()) + cacheValue.insert_or_assign("revCount", uint64_t(*rc)); + + settings.getCache()->upsert(cacheKey, store, cacheValue, storePath); + } + assert(!origRev || origRev == rev); return {accessor, std::move(input)}; @@ -1189,6 +1271,19 @@ struct GitInputScheme : InputScheme throw UnimplementedError("exportIgnore and submodules are not supported together yet"); } + // Try to serve from cache (rev + url) before doing any network fetch. + // Skip the cache when nix219Compat is enabled, because it changes + // which content is returned (legacy git-archive behavior) and the + // cache doesn't distinguish between the two modes. + if (input.getRev() && !settings.nix219Compat) { + try { + if (auto cached = getAccessorFromCache(settings, store, input)) + return *cached; + } catch (Error & e) { + debug("cache lookup failed for git input '%s': %s", input.to_string(), e.what()); + } + } + auto [accessor, final] = input.getRef() || input.getRev() || !repoInfo.getPath() ? getAccessorFromCommit(settings, store, repoInfo, std::move(input)) : getAccessorFromWorkdir(settings, store, repoInfo, std::move(input)); @@ -1203,7 +1298,12 @@ struct GitInputScheme : InputScheme if (auto rev = input.getRev()) // FIXME: this can return a wrong fingerprint for the legacy (`git archive`) case, since we don't know here // whether to append the `;legacy` suffix or not. 
- return options.makeFingerprint(*rev); + { + auto fp = options.makeFingerprint(*rev); + if (options.submodules) + fp += ";s"; + return fp; + } else { auto repoInfo = getRepoInfo(input); if (auto repoPath = repoInfo.getPath(); repoPath && repoInfo.workdirInfo.submodules.empty()) { diff --git a/src/libfetchers/include/nix/fetchers/git-utils.hh b/src/libfetchers/include/nix/fetchers/git-utils.hh index eada8745c3eb..f8c749481028 100644 --- a/src/libfetchers/include/nix/fetchers/git-utils.hh +++ b/src/libfetchers/include/nix/fetchers/git-utils.hh @@ -40,6 +40,11 @@ struct GitRepo bool create = false; bool bare = false; bool packfilesOnly = false; + /** + * Open only the object database, bypassing full repository validation. + * Useful for repos with unsupported extensions (e.g., reftables). + */ + bool odbOnly = false; }; static ref<GitRepo> openRepo(const std::filesystem::path & path, Options options); @@ -104,6 +109,12 @@ struct GitRepo virtual bool hasObject(const Hash & oid) = 0; + /** Get the SHA of a subtree entry within a tree object */ + virtual Hash getSubtreeSha(const Hash & treeSha, const std::string & entryName) = 0; + + /** Get the root tree SHA from a commit SHA */ + virtual Hash getCommitTree(const Hash & commitSha) = 0; + virtual ref<SourceAccessor> getAccessor(const Hash & rev, const GitAccessorOptions & options, std::string displayPrefix) = 0; diff --git a/src/libmain/shared.cc b/src/libmain/shared.cc index cac9e38ad857..4909000f6719 100644 --- a/src/libmain/shared.cc +++ b/src/libmain/shared.cc @@ -294,7 +294,7 @@ void parseCmdLine( std::string version() { - return fmt("(Determinate Nix %s) %s", determinateNixVersion, nixVersion); + return fmt("(Tecnix %s) %s", determinateNixVersion, nixVersion); } void printVersion(const std::string & programName) diff --git a/src/libstore-tests/gcs-url.cc b/src/libstore-tests/gcs-url.cc new file mode 100644 index 000000000000..5e07949c97ba --- /dev/null +++ b/src/libstore-tests/gcs-url.cc @@ -0,0 +1,210 @@ +#include "nix/store/gcs-url.hh"
+#include "nix/util/tests/gmock-matchers.hh" + +#include <gtest/gtest.h> +#include <gmock/gmock.h> + +namespace nix { + +// ============================================================================= +// ParsedGcsURL Tests +// ============================================================================= + +struct ParsedGcsURLTestCase +{ + std::string url; + ParsedGcsURL expected; + std::string description; +}; + +class ParsedGcsURLTest : public ::testing::WithParamInterface<ParsedGcsURLTestCase>, public ::testing::Test +{}; + +TEST_P(ParsedGcsURLTest, parseGcsURLSuccessfully) +{ + const auto & testCase = GetParam(); + auto parsed = ParsedGcsURL::parse(parseURL(testCase.url)); + ASSERT_EQ(parsed, testCase.expected); +} + +INSTANTIATE_TEST_SUITE_P( + ValidUrls, + ParsedGcsURLTest, + ::testing::Values( + ParsedGcsURLTestCase{ + "gs://my-bucket/my-key.txt", + { + .bucket = "my-bucket", + .key = {"my-key.txt"}, + .writable = false, + }, + "basic_gcs_bucket", + }, + ParsedGcsURLTestCase{ + "gs://nix-cache/nix/store/abc123.nar.xz", + { + .bucket = "nix-cache", + .key = {"nix", "store", "abc123.nar.xz"}, + .writable = false, + }, + "nested_path", + }, + ParsedGcsURLTestCase{ + "gs://my-bucket/path/to/deep/file.txt", + { + .bucket = "my-bucket", + .key = {"path", "to", "deep", "file.txt"}, + .writable = false, + }, + "deeply_nested_path", + }, + ParsedGcsURLTestCase{ + "gs://bucket-with-dashes/key", + { + .bucket = "bucket-with-dashes", + .key = {"key"}, + .writable = false, + }, + "bucket_with_dashes", + }, + ParsedGcsURLTestCase{ + "gs://bucket123/file-with-special_chars.tar.gz", + { + .bucket = "bucket123", + .key = {"file-with-special_chars.tar.gz"}, + .writable = false, + }, + "key_with_special_chars", + }, + ParsedGcsURLTestCase{ + "gs://my-bucket/key?write=true", + { + .bucket = "my-bucket", + .key = {"key"}, + .writable = true, + }, + "with_write_true", + }, + ParsedGcsURLTestCase{ + "gs://my-bucket/key?write=false", + { + .bucket = "my-bucket", + .key = {"key"}, + .writable = false, + }, + "with_write_false", + },
ParsedGcsURLTestCase{ + "gs://cache/path/to/nar.xz?write=true", + { + .bucket = "cache", + .key = {"path", "to", "nar.xz"}, + .writable = true, + }, + "nested_path_with_write", + }), + [](const ::testing::TestParamInfo<ParsedGcsURLTest::ParamType> & info) { return info.param.description; }); + +// Parameterized test for invalid GCS URLs +struct InvalidGcsURLTestCase +{ + std::string url; + std::string expectedErrorSubstring; + std::string description; +}; + +class InvalidParsedGcsURLTest : public ::testing::WithParamInterface<InvalidGcsURLTestCase>, public ::testing::Test +{}; + +TEST_P(InvalidParsedGcsURLTest, parseGcsURLErrors) +{ + const auto & testCase = GetParam(); + + ASSERT_THAT( + [&testCase]() { ParsedGcsURL::parse(parseURL(testCase.url)); }, + ::testing::ThrowsMessage<BadURL>(testing::HasSubstrIgnoreANSIMatcher(testCase.expectedErrorSubstring))); +} + +INSTANTIATE_TEST_SUITE_P( + InvalidUrls, + InvalidParsedGcsURLTest, + ::testing::Values( + InvalidGcsURLTestCase{"gs:///key", "error: URI has a missing or invalid bucket name", "empty_bucket"}, + InvalidGcsURLTestCase{"gs://127.0.0.1/key", "error: URI has a missing or invalid bucket name", "ip_address_bucket"}, + InvalidGcsURLTestCase{"gs://", "error: URI has a missing or invalid bucket name", "completely_empty"}, + InvalidGcsURLTestCase{"gs://bucket", "error: URI has a missing or invalid key", "missing_key"}), + [](const ::testing::TestParamInfo<InvalidParsedGcsURLTest::ParamType> & info) { return info.param.description; }); + +// ============================================================================= +// GCS URL to HTTPS Conversion Tests +// ============================================================================= + +struct GcsToHttpsConversionTestCase +{ + ParsedGcsURL input; + ParsedURL expected; + std::string expectedRendered; + std::string description; +}; + +class GcsToHttpsConversionTest : public ::testing::WithParamInterface<GcsToHttpsConversionTestCase>, + public ::testing::Test +{}; + +TEST_P(GcsToHttpsConversionTest, ConvertsCorrectly) +{ + const auto & testCase = GetParam(); + auto result = 
testCase.input.toHttpsUrl(); + EXPECT_EQ(result, testCase.expected) << "Failed for: " << testCase.description; + EXPECT_EQ(result.to_string(), testCase.expectedRendered); +} + +INSTANTIATE_TEST_SUITE_P( + GcsToHttpsConversion, + GcsToHttpsConversionTest, + ::testing::Values( + GcsToHttpsConversionTestCase{ + ParsedGcsURL{ + .bucket = "my-bucket", + .key = {"my-key.txt"}, + .writable = false, + }, + ParsedURL{ + .scheme = "https", + .authority = ParsedURL::Authority{.host = "storage.googleapis.com"}, + .path = {"", "my-bucket", "my-key.txt"}, + }, + "https://storage.googleapis.com/my-bucket/my-key.txt", + "basic_conversion", + }, + GcsToHttpsConversionTestCase{ + ParsedGcsURL{ + .bucket = "nix-cache", + .key = {"nix", "store", "abc123.nar.xz"}, + .writable = false, + }, + ParsedURL{ + .scheme = "https", + .authority = ParsedURL::Authority{.host = "storage.googleapis.com"}, + .path = {"", "nix-cache", "nix", "store", "abc123.nar.xz"}, + }, + "https://storage.googleapis.com/nix-cache/nix/store/abc123.nar.xz", + "nested_path_conversion", + }, + GcsToHttpsConversionTestCase{ + ParsedGcsURL{ + .bucket = "bucket", + .key = {"path", "to", "deep", "object.txt"}, + .writable = true, // writable doesn't affect HTTPS URL conversion + }, + ParsedURL{ + .scheme = "https", + .authority = ParsedURL::Authority{.host = "storage.googleapis.com"}, + .path = {"", "bucket", "path", "to", "deep", "object.txt"}, + }, + "https://storage.googleapis.com/bucket/path/to/deep/object.txt", + "deeply_nested_path_conversion", + }), + [](const ::testing::TestParamInfo<GcsToHttpsConversionTest::ParamType> & info) { return info.param.description; }); + +} // namespace nix diff --git a/src/libstore-tests/meson.build b/src/libstore-tests/meson.build index 58f624611a40..a4586a40c60d 100644 --- a/src/libstore-tests/meson.build +++ b/src/libstore-tests/meson.build @@ -63,6 +63,7 @@ sources = files( 'derived-path.cc', 'downstream-placeholder.cc', 'dummy-store.cc', + 'gcs-url.cc', 'http-binary-cache-store.cc', 'legacy-ssh-store.cc', 
'local-binary-cache-store.cc', diff --git a/src/libstore/filetransfer.cc b/src/libstore/filetransfer.cc index 7be3389e073b..3b21b88dcbdf 100644 --- a/src/libstore/filetransfer.cc +++ b/src/libstore/filetransfer.cc @@ -9,6 +9,8 @@ #include "store-config-private.hh" #include "nix/store/s3-url.hh" +#include "nix/store/gcs-url.hh" +#include "nix/store/gcs-creds.hh" #include #if NIX_WITH_AWS_AUTH # include "nix/store/aws-creds.hh" #endif @@ -131,7 +133,13 @@ struct curlFileTransfer : public FileTransfer { result.urls.push_back(request.uri.to_string()); - requestHeaders = curl_slist_append(requestHeaders, "Accept-Encoding: zstd, br, gzip, deflate, bzip2, xz"); + // Don't set Accept-Encoding for AWS-signed requests. Services like GCS + // modify this header (adding "gzip(gfe)"), which breaks SigV4 signature + // validation since the header value no longer matches what was signed. +#if NIX_WITH_AWS_AUTH + if (!request.awsSigV4Provider) +#endif + requestHeaders = curl_slist_append(requestHeaders, "Accept-Encoding: zstd, br, gzip, deflate, bzip2, xz"); if (!request.expectedETag.empty()) requestHeaders = curl_slist_append(requestHeaders, ("If-None-Match: " + request.expectedETag).c_str()); if (!request.mimeType.empty()) @@ -139,6 +147,12 @@ struct curlFileTransfer : public FileTransfer for (auto it = request.headers.begin(); it != request.headers.end(); ++it) { requestHeaders = curl_slist_append(requestHeaders, fmt("%s: %s", it->first, it->second).c_str()); } + + // Set up bearer token authentication if provided (for OAuth2, e.g., GCS) + if (request.bearerToken) { + requestHeaders = + curl_slist_append(requestHeaders, fmt("Authorization: Bearer %s", *request.bearerToken).c_str()); + } } ~TransferItem() @@ -915,6 +929,13 @@ struct curlFileTransfer : public FileTransfer return enqueueItem(make_ref<TransferItem>(*this, std::move(modifiedRequest), std::move(callback))); } + /* Handle gs:// URIs by converting to HTTPS and adding OAuth2 bearer token */ + if (request.uri.scheme() == "gs") { + auto 
modifiedRequest = request; + modifiedRequest.setupForGcs(); + return enqueueItem(make_ref<TransferItem>(*this, std::move(modifiedRequest), std::move(callback))); + } + return enqueueItem(make_ref<TransferItem>(*this, request, std::move(callback))); } @@ -988,6 +1009,23 @@ void FileTransferRequest::setupForS3() #endif } +void FileTransferRequest::setupForGcs() +{ + auto parsedGcs = ParsedGcsURL::parse(uri.parsed()); + // Update the request URI to use HTTPS + uri = parsedGcs.toHttpsUrl(); + + // Get OAuth2 bearer token from Application Default Credentials + // Use read-only scope by default, read-write if ?write=true is specified + if (auto token = getGcsCredentialsProvider()->maybeGetAccessToken(parsedGcs.writable)) { + bearerToken = std::move(*token); + debug("Using GCS OAuth2 bearer token for request (writable=%s)", parsedGcs.writable ? "true" : "false"); + } else { + // No credentials - try as public bucket + debug("GCS request without authentication (no credentials found)"); + } +} + std::future<FileTransferResult> FileTransfer::enqueueFileTransfer(const FileTransferRequest & request) { auto promise = std::make_shared<std::promise<FileTransferResult>>(); diff --git a/src/libstore/gcs-creds.cc b/src/libstore/gcs-creds.cc new file mode 100644 index 000000000000..5ca7c46d8425 --- /dev/null +++ b/src/libstore/gcs-creds.cc @@ -0,0 +1,481 @@ +#include "nix/store/gcs-creds.hh" +#include "nix/store/filetransfer.hh" +#include "nix/util/base-n.hh" +#include "nix/util/environment-variables.hh" +#include "nix/util/json-utils.hh" +#include "nix/util/logging.hh" +#include "nix/util/users.hh" + +#include <chrono> +#include <filesystem> +#include <fstream> +#include <mutex> +#include <optional> + +#include <openssl/bio.h> +#include <openssl/err.h> +#include <openssl/evp.h> +#include <openssl/pem.h> + +namespace nix { + +namespace { + +// RAII wrappers for OpenSSL resources +struct BioDeleter +{ + void operator()(BIO * bio) const + { + if (bio) + BIO_free(bio); + } +}; +using UniqueBio = std::unique_ptr<BIO, BioDeleter>; + +struct EvpPkeyDeleter +{ + void operator()(EVP_PKEY * pkey) const + { + if (pkey) + EVP_PKEY_free(pkey); + } +}; +using UniqueEvpPkey = std::unique_ptr<EVP_PKEY, EvpPkeyDeleter>; + +struct 
EvpMdCtxDeleter +{ + void operator()(EVP_MD_CTX * ctx) const + { + if (ctx) + EVP_MD_CTX_free(ctx); + } +}; +using UniqueEvpMdCtx = std::unique_ptr<EVP_MD_CTX, EvpMdCtxDeleter>; + +// Google's OAuth2 token endpoint +constexpr std::string_view TOKEN_URI = "https://oauth2.googleapis.com/token"; + +// GCE metadata server for instances running on Google Cloud +constexpr std::string_view GCE_METADATA_TOKEN_URL = + "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"; + +// JWT grant type for service accounts +constexpr std::string_view JWT_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"; + +// GCS scopes +constexpr std::string_view GCS_SCOPE_READ_ONLY = "https://www.googleapis.com/auth/devstorage.read_only"; +constexpr std::string_view GCS_SCOPE_READ_WRITE = "https://www.googleapis.com/auth/devstorage.read_write"; + +/** + * URL-encode a string for use in application/x-www-form-urlencoded bodies. + */ +std::string urlEncode(std::string_view s) +{ + std::string result; + result.reserve(s.size()); + for (char c : s) { + if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '-' || c == '_' + || c == '.' || c == '~') { + result += c; + } else { + result += '%'; + result += "0123456789ABCDEF"[(c >> 4) & 0xF]; + result += "0123456789ABCDEF"[c & 0xF]; + } + } + return result; +} + +/** + * Base64url encode (URL-safe alphabet, no padding). + * Used for JWT encoding. + */ +std::string base64urlEncode(std::string_view data) +{ + auto encoded = base64::encode(std::as_bytes(std::span{data.data(), data.size()})); + + // Convert to URL-safe alphabet and remove padding + for (char & c : encoded) { + if (c == '+') + c = '-'; + else if (c == '/') + c = '_'; + } + + // Remove trailing '=' padding + while (!encoded.empty() && encoded.back() == '=') { + encoded.pop_back(); + } + + return encoded; +} + +/** + * Check if running on GCE by probing the metadata server. + * Uses a single attempt with no retries to fail quickly on non-GCE. 
+ */ +bool isRunningOnGce() +{ + static std::once_flag flag; + static bool result = false; + + std::call_once(flag, []() { + try { + FileTransferRequest req(VerbatimURL{std::string(GCE_METADATA_TOKEN_URL)}); + req.headers.emplace_back("Metadata-Flavor", "Google"); + req.tries = 1; // Single attempt, no retries - fail fast on non-GCE + + getFileTransfer()->download(req); + result = true; + debug("GCE metadata server detected"); + } catch (...) { + // Not on GCE, or metadata server not accessible + } + }); + + return result; +} + +/** + * Find the ADC credential file path. + * Order: + * 1. GOOGLE_APPLICATION_CREDENTIALS env var + * 2. ~/.config/gcloud/application_default_credentials.json + */ +std::optional<std::filesystem::path> findCredentialFile() +{ + // 1. Check GOOGLE_APPLICATION_CREDENTIALS + if (auto envPath = getEnv("GOOGLE_APPLICATION_CREDENTIALS")) { + auto path = std::filesystem::path(*envPath); + if (std::filesystem::exists(path)) { + debug("Using GCS credentials from GOOGLE_APPLICATION_CREDENTIALS: %s", path.string()); + return path; + } + warn("GOOGLE_APPLICATION_CREDENTIALS set but file not found: %s", *envPath); + } + + // 2. Check well-known ADC location + auto adcPath = getHome() / ".config" / "gcloud" / "application_default_credentials.json"; + if (std::filesystem::exists(adcPath)) { + debug("Using GCS credentials from default location: %s", adcPath.string()); + return adcPath; + } + + return std::nullopt; +} + +/** + * Load and parse a credential JSON file. + */ +nlohmann::json loadCredentialFile(const std::filesystem::path & path) +{ + std::ifstream file(path); + if (!file.is_open()) { + throw GcsAuthError("Cannot open credential file: %s", path.string()); + } + + try { + return nlohmann::json::parse(file); + } catch (nlohmann::json::parse_error & e) { + throw GcsAuthError("Invalid JSON in credential file %s: %s", path.string(), e.what()); + } +} + +/** + * Sign data with RSA-SHA256 using the given PEM private key. 
+ */ +std::string rsaSha256Sign(std::string_view data, const std::string & privateKeyPem) +{ + // Load the private key + UniqueBio bio(BIO_new_mem_buf(privateKeyPem.data(), static_cast<int>(privateKeyPem.size()))); + if (!bio) { + throw GcsAuthError("Failed to create BIO for private key"); + } + + UniqueEvpPkey pkey(PEM_read_bio_PrivateKey(bio.get(), nullptr, nullptr, nullptr)); + if (!pkey) { + unsigned long err = ERR_get_error(); + char errBuf[256]; + ERR_error_string_n(err, errBuf, sizeof(errBuf)); + throw GcsAuthError("Failed to parse service account private key: %s", errBuf); + } + + // Create signing context + UniqueEvpMdCtx ctx(EVP_MD_CTX_new()); + if (!ctx) { + throw GcsAuthError("Failed to create EVP_MD_CTX"); + } + + if (EVP_DigestSignInit(ctx.get(), nullptr, EVP_sha256(), nullptr, pkey.get()) != 1) { + throw GcsAuthError("EVP_DigestSignInit failed"); + } + + if (EVP_DigestSignUpdate(ctx.get(), data.data(), data.size()) != 1) { + throw GcsAuthError("EVP_DigestSignUpdate failed"); + } + + // Determine signature length + size_t sigLen = 0; + if (EVP_DigestSignFinal(ctx.get(), nullptr, &sigLen) != 1) { + throw GcsAuthError("EVP_DigestSignFinal (length query) failed"); + } + + // Allocate and get signature + std::vector<unsigned char> sigBuf(sigLen); + if (EVP_DigestSignFinal(ctx.get(), sigBuf.data(), &sigLen) != 1) { + throw GcsAuthError("EVP_DigestSignFinal failed"); + } + + return std::string(reinterpret_cast<const char *>(sigBuf.data()), sigLen); +} + +/** + * Create a signed JWT for service account authentication. 
+ */ +std::string createServiceAccountJwt(const std::string & clientEmail, const std::string & privateKey, std::string_view scope) +{ + auto now = std::chrono::system_clock::now(); + auto iat = std::chrono::duration_cast<std::chrono::seconds>(now.time_since_epoch()).count(); + auto exp = iat + 3600; // 1 hour + + // JWT Header + nlohmann::json header = {{"alg", "RS256"}, {"typ", "JWT"}}; + + // JWT Claims + nlohmann::json claims = { + {"iss", clientEmail}, + {"sub", clientEmail}, + {"aud", TOKEN_URI}, + {"iat", iat}, + {"exp", exp}, + {"scope", scope}, + }; + + auto headerB64 = base64urlEncode(header.dump()); + auto claimsB64 = base64urlEncode(claims.dump()); + auto signatureInput = headerB64 + "." + claimsB64; + + // Sign with RSA-SHA256 + auto signature = rsaSha256Sign(signatureInput, privateKey); + auto signatureB64 = base64urlEncode(signature); + + return signatureInput + "." + signatureB64; +} + +} // anonymous namespace + +class GcsCredentialProviderImpl : public GcsCredentialProvider +{ +public: + std::string getAccessToken(bool writable) override + { + // First, check cache under lock + { + std::lock_guard lock(mutex); + + auto & cachedToken = writable ? cachedTokenReadWrite : cachedTokenReadOnly; + if (cachedToken && !cachedToken->isExpired()) { + return cachedToken->token; + } + + // Load credentials if not yet loaded (this is fast, file I/O only) + if (!credentialsLoaded) { + loadCredentials(); + } + } + + // Refresh token without holding the lock (HTTP call can be slow) + auto scope = writable ? GCS_SCOPE_READ_WRITE : GCS_SCOPE_READ_ONLY; + auto newToken = refreshToken(scope); + + // Store the new token under lock + { + std::lock_guard lock(mutex); + auto & cachedToken = writable ? cachedTokenReadWrite : cachedTokenReadOnly; + + // Another thread may have refreshed while we were waiting. + // Use the newer token (longer expiry). 
+ if (!cachedToken || cachedToken->expiresAt < newToken.expiresAt) { + cachedToken = newToken; + } + return cachedToken->token; + } + } + +private: + std::mutex mutex; + std::optional<GcsAccessToken> cachedTokenReadOnly; + std::optional<GcsAccessToken> cachedTokenReadWrite; + bool credentialsLoaded = false; + + // Credential type: "authorized_user", "service_account", or "gce_metadata" + std::string credentialType; + + // For authorized_user + std::string clientId; + std::string clientSecret; + std::string refreshTokenValue; + + // For service_account + std::string clientEmail; + std::string privateKey; + + void loadCredentials() + { + // First, check for credential file (highest priority) + auto credPath = findCredentialFile(); + if (credPath) { + auto json = loadCredentialFile(*credPath); + auto & obj = getObject(json); + + credentialType = getString(valueAt(obj, "type")); + + if (credentialType == "authorized_user") { + clientId = getString(valueAt(obj, "client_id")); + clientSecret = getString(valueAt(obj, "client_secret")); + refreshTokenValue = getString(valueAt(obj, "refresh_token")); + debug("Loaded authorized_user credentials"); + } else if (credentialType == "service_account") { + clientEmail = getString(valueAt(obj, "client_email")); + privateKey = getString(valueAt(obj, "private_key")); + debug("Loaded service_account credentials for %s", clientEmail); + } else { + throw GcsAuthError("Unsupported GCS credential type: %s", credentialType); + } + + credentialsLoaded = true; + return; + } + + // Fall back to GCE metadata server if running on Google Cloud + if (isRunningOnGce()) { + credentialType = "gce_metadata"; + debug("Using GCE metadata server for credentials"); + credentialsLoaded = true; + return; + } + + throw GcsAuthError( + "No GCS credentials found. 
Run 'gcloud auth application-default login', " + "set GOOGLE_APPLICATION_CREDENTIALS, or run on GCE/Cloud Run/GKE"); + } + + GcsAccessToken refreshToken(std::string_view scope) + { + if (credentialType == "authorized_user") { + return refreshAuthorizedUserToken(); + } else if (credentialType == "service_account") { + return refreshServiceAccountToken(scope); + } else if (credentialType == "gce_metadata") { + return refreshGceMetadataToken(); + } + throw GcsAuthError("Unknown credential type: %s", credentialType); + } + + GcsAccessToken refreshAuthorizedUserToken() + { + debug("Refreshing GCS access token using authorized_user refresh token"); + + // Note: authorized_user tokens use the scopes granted at `gcloud auth` time, + // not per-request scopes. The scope parameter is not used here. + + // POST to token endpoint with refresh_token grant + std::string body = "client_id=" + urlEncode(clientId) + "&client_secret=" + urlEncode(clientSecret) + + "&refresh_token=" + urlEncode(refreshTokenValue) + "&grant_type=refresh_token"; + + FileTransferRequest req(VerbatimURL{std::string(TOKEN_URI)}); + req.method = HttpMethod::Post; + req.mimeType = "application/x-www-form-urlencoded"; + + StringSource bodySource(body); + req.data = FileTransferRequest::UploadData(bodySource); + + auto result = getFileTransfer()->upload(req); + return parseTokenResponse(result.data); + } + + GcsAccessToken refreshServiceAccountToken(std::string_view scope) + { + debug("Refreshing GCS access token using service_account JWT (scope: %s)", scope); + + // Create and sign JWT assertion with the requested scope + auto jwt = createServiceAccountJwt(clientEmail, privateKey, scope); + + std::string body = "grant_type=" + urlEncode(std::string(JWT_GRANT_TYPE)) + "&assertion=" + urlEncode(jwt); + + FileTransferRequest req(VerbatimURL{std::string(TOKEN_URI)}); + req.method = HttpMethod::Post; + req.mimeType = "application/x-www-form-urlencoded"; + + StringSource bodySource(body); + req.data = 
FileTransferRequest::UploadData(bodySource); + + auto result = getFileTransfer()->upload(req); + return parseTokenResponse(result.data); + } + + GcsAccessToken refreshGceMetadataToken() + { + debug("Refreshing GCS access token from GCE metadata server"); + + // GCE metadata server provides tokens for the instance's service account. + // Scopes are determined by the instance configuration, not per-request. + FileTransferRequest req(VerbatimURL{std::string(GCE_METADATA_TOKEN_URL)}); + req.headers.emplace_back("Metadata-Flavor", "Google"); + + auto result = getFileTransfer()->download(req); + return parseTokenResponse(result.data); + } + + GcsAccessToken parseTokenResponse(const std::string & response) + { + try { + auto json = nlohmann::json::parse(response); + auto & obj = getObject(json); + + // Check for error response + if (auto * errPtr = optionalValueAt(obj, "error")) { + auto error = getString(*errPtr); + auto description = optionalValueAt(obj, "error_description"); + throw GcsAuthError( + "OAuth2 token request failed: %s%s", + error, + description ? 
(" - " + getString(*description)) : ""); + } + + auto accessToken = getString(valueAt(obj, "access_token")); + auto expiresIn = getInteger(valueAt(obj, "expires_in")); + + debug("Obtained GCS access token, expires in %d seconds", expiresIn); + + return GcsAccessToken{ + .token = accessToken, + .expiresAt = std::chrono::steady_clock::now() + std::chrono::seconds(expiresIn)}; + } catch (nlohmann::json::exception & e) { + throw GcsAuthError("Failed to parse OAuth2 token response: %s\nResponse: %s", e.what(), response); + } + } +}; + +std::optional<std::string> GcsCredentialProvider::maybeGetAccessToken(bool writable) +{ + try { + return getAccessToken(writable); + } catch (GcsAuthError & e) { + debug("GCS credential lookup failed: %s", e.what()); + return std::nullopt; + } +} + +ref<GcsCredentialProvider> makeGcsCredentialsProvider() +{ + return make_ref<GcsCredentialProviderImpl>(); +} + +ref<GcsCredentialProvider> getGcsCredentialsProvider() +{ + static auto instance = makeGcsCredentialsProvider(); + return instance; +} + +} // namespace nix diff --git a/src/libstore/gcs-url.cc b/src/libstore/gcs-url.cc new file mode 100644 index 000000000000..9d41d4588e06 --- /dev/null +++ b/src/libstore/gcs-url.cc @@ -0,0 +1,51 @@ +#include "nix/store/gcs-url.hh" +#include "nix/util/error.hh" + +namespace nix { + +ParsedGcsURL ParsedGcsURL::parse(const ParsedURL & parsed) +try { + if (parsed.scheme != "gs") + throw BadURL("URI scheme '%s' is not 'gs'", parsed.scheme); + + if (!parsed.authority || parsed.authority->host.empty() + || parsed.authority->hostType != ParsedURL::Authority::HostType::Name) + throw BadURL("URI has a missing or invalid bucket name"); + + if (parsed.path.size() <= 1 || !parsed.path.front().empty()) + throw BadURL("URI has a missing or invalid key"); + + // Skip the first empty path segment (from leading /) + std::vector<std::string> key(parsed.path.begin() + 1, parsed.path.end()); + + // Check for write=true query parameter + bool writable = false; + auto it = parsed.query.find("write"); + if (it != parsed.query.end() && it->second == "true") { + writable 
= true; + } + + return ParsedGcsURL{ + .bucket = parsed.authority->host, + .key = std::move(key), + .writable = writable, + }; +} catch (BadURL & e) { + e.addTrace({}, "while parsing GCS URI: '%s'", parsed.to_string()); + throw; +} + +ParsedURL ParsedGcsURL::toHttpsUrl() const +{ + std::vector<std::string> path{""}; + path.push_back(bucket); + path.insert(path.end(), key.begin(), key.end()); + + return ParsedURL{ + .scheme = "https", + .authority = ParsedURL::Authority{.host = "storage.googleapis.com"}, + .path = std::move(path), + }; +} + +} // namespace nix diff --git a/src/libstore/include/nix/store/filetransfer.hh b/src/libstore/include/nix/store/filetransfer.hh index fa8a649e2b36..47db4a93058e 100644 --- a/src/libstore/include/nix/store/filetransfer.hh +++ b/src/libstore/include/nix/store/filetransfer.hh @@ -154,6 +154,12 @@ struct FileTransferRequest * When provided, these credentials will be used with curl's CURLOPT_USERNAME/PASSWORD option. */ std::optional<UsernameAuth> usernameAuth; + + /** + * Optional bearer token for OAuth2 authentication (e.g., Google Cloud Storage). + * When provided, adds "Authorization: Bearer <token>" header to requests. + */ + std::optional<std::string> bearerToken; #if NIX_WITH_AWS_AUTH /** * Pre-resolved AWS session token for S3 requests. 
@@ -204,6 +210,7 @@ struct FileTransferRequest } void setupForS3(); + void setupForGcs(); private: friend struct curlFileTransfer; diff --git a/src/libstore/include/nix/store/gcs-creds.hh b/src/libstore/include/nix/store/gcs-creds.hh new file mode 100644 index 000000000000..5085381db276 --- /dev/null +++ b/src/libstore/include/nix/store/gcs-creds.hh @@ -0,0 +1,79 @@ +#pragma once +///@file + +#include "nix/util/error.hh" +#include "nix/util/ref.hh" + +#include <chrono> +#include <optional> +#include <string> + +namespace nix { + +/** + * GCS access token with expiration tracking + */ +struct GcsAccessToken +{ + std::string token; + std::chrono::steady_clock::time_point expiresAt; + + bool isExpired() const + { + // Refresh 60 seconds before actual expiry for safety margin + return std::chrono::steady_clock::now() >= (expiresAt - std::chrono::seconds(60)); + } +}; + +class GcsAuthError : public Error +{ +public: + using Error::Error; +}; + +/** + * Provider for Google Cloud Storage credentials. + * Implements Application Default Credentials (ADC) discovery: + * 1. GOOGLE_APPLICATION_CREDENTIALS environment variable + * 2. ~/.config/gcloud/application_default_credentials.json + * 3. GCE metadata server (when running on Google Cloud) + * + * Supports credential types: + * - authorized_user: Uses refresh token (from `gcloud auth application-default login`) + * - service_account: Uses JWT signed with private key + * - gce_metadata: Fetches tokens from GCE metadata server (automatic on GCE/GKE/Cloud Run) + */ +class GcsCredentialProvider +{ +public: + /** + * Get an access token for GCS requests. + * Automatically refreshes expired tokens. + * + * @param writable If true, request read/write scope; otherwise read-only + * @return Access token string + * @throws GcsAuthError if credentials cannot be resolved + */ + virtual std::string getAccessToken(bool writable = false) = 0; + + /** + * Try to get an access token, returning nullopt on failure. 
+ * + * @param writable If true, request read/write scope; otherwise read-only + */ + std::optional<std::string> maybeGetAccessToken(bool writable = false); + + virtual ~GcsCredentialProvider() { } +}; + +/** + * Create a new GCS credential provider. + */ +ref<GcsCredentialProvider> makeGcsCredentialsProvider(); + +/** + * Get a reference to the global GCS credential provider. + */ +ref<GcsCredentialProvider> getGcsCredentialsProvider(); + +} // namespace nix diff --git a/src/libstore/include/nix/store/gcs-url.hh b/src/libstore/include/nix/store/gcs-url.hh new file mode 100644 index 000000000000..1eed256ff0e6 --- /dev/null +++ b/src/libstore/include/nix/store/gcs-url.hh @@ -0,0 +1,41 @@ +#pragma once +///@file + +#include "nix/util/url.hh" + +#include <string> +#include <vector> + +namespace nix { + +/** + * Parsed gs:// URL for Google Cloud Storage + */ +struct ParsedGcsURL +{ + std::string bucket; + std::vector<std::string> key; + /** + * Whether write access is requested (via ?write=true query param). + * Defaults to false (read-only). + */ + bool writable = false; + + /** + * Parse a gs:// URL. 
+ * + * @param parsed The parsed URL to convert + * @return ParsedGcsURL with bucket and key extracted + * @throws BadURL if the URL is not a valid gs:// URL + */ + static ParsedGcsURL parse(const ParsedURL & parsed); + + /** + * Convert to HTTPS URL for storage.googleapis.com + */ + ParsedURL toHttpsUrl() const; + + auto operator<=>(const ParsedGcsURL & other) const = default; +}; + +} // namespace nix diff --git a/src/libstore/include/nix/store/meson.build b/src/libstore/include/nix/store/meson.build index 91bce9ba9b92..fad5bcbec003 100644 --- a/src/libstore/include/nix/store/meson.build +++ b/src/libstore/include/nix/store/meson.build @@ -43,6 +43,8 @@ headers = [ config_pub_h ] + files( 'export-import.hh', 'filetransfer.hh', 'gc-store.hh', + 'gcs-creds.hh', + 'gcs-url.hh', 'globals.hh', 'http-binary-cache-store.hh', 'indirect-root-store.hh', diff --git a/src/libstore/meson.build b/src/libstore/meson.build index 0a0d2b8cac65..1005ea159947 100644 --- a/src/libstore/meson.build +++ b/src/libstore/meson.build @@ -160,6 +160,14 @@ deps_public += nlohmann_json sqlite = dependency('sqlite3', 'sqlite', version : '>=3.6.19') deps_private += sqlite +# OpenSSL for GCS service account JWT signing +openssl = dependency( + 'libcrypto', + 'openssl', + version : '>= 1.1.1', +) +deps_private += openssl + s3_aws_auth = get_option('s3-aws-auth') aws_crt_cpp = cxx.find_library('aws-crt-cpp', required : s3_aws_auth) @@ -357,6 +365,12 @@ if s3_aws_auth.enabled() sources += files('aws-creds.cc') endif +# GCS (Google Cloud Storage) support +sources += files( + 'gcs-creds.cc', + 'gcs-url.cc', +) + subdir('include/nix/store') if host_machine.system() == 'linux' diff --git a/tests/functional/meson.build b/tests/functional/meson.build index d917d91c3f34..547708b657db 100644 --- a/tests/functional/meson.build +++ b/tests/functional/meson.build @@ -220,6 +220,7 @@ subdir('flakes') subdir('git') subdir('git-hashing') subdir('local-overlay-store') +subdir('tectonix') foreach suite : suites 
workdir = suite['workdir'] diff --git a/tests/functional/tectonix/basic.sh b/tests/functional/tectonix/basic.sh new file mode 100644 index 000000000000..8b76d14efb7e --- /dev/null +++ b/tests/functional/tectonix/basic.sh @@ -0,0 +1,75 @@ +#!/usr/bin/env bash +# Basic tectonix functionality tests + +source "$(dirname "${BASH_SOURCE[0]}")/common.sh" + +# Create test world +TEST_WORLD="$TEST_ROOT/world" +create_test_world "$TEST_WORLD" +HEAD_SHA=$(get_head_sha "$TEST_WORLD") + +echo "Testing basic zone access..." + +# Test: Manifest access +manifest=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \ + 'builtins.unsafeTectonixInternalManifest') +echo "Manifest: $manifest" + +# Verify manifest contains expected zones +echo "$manifest" | grepQuiet "//areas/tools/dev" +echo "$manifest" | grepQuiet "W-000001" + +# Test: Inverted manifest +inverted=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \ + 'builtins.unsafeTectonixInternalManifestInverted') +echo "Inverted manifest: $inverted" + +echo "$inverted" | grepQuiet "W-000001" +echo "$inverted" | grepQuiet "//areas/tools/dev" + +# Test: Tree SHA access +tree_sha=$(tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \ + 'builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev"') +echo "Tree SHA for //areas/tools/dev: $tree_sha" + +# Verify SHA is 40 hex characters +if [[ ! "$tree_sha" =~ ^[0-9a-f]{40}$ ]]; then + fail "Tree SHA should be 40 hex characters, got: $tree_sha" +fi + +# Test: Zone source access +zone_src=$(tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \ + 'builtins.unsafeTectonixInternalZoneSrc "//areas/tools/dev"') +echo "Zone source path: $zone_src" + +# Verify it's a store path +if [[ ! 
"$zone_src" =~ ^${NIX_STORE_DIR:-/nix/store}/ ]]; then + fail "Zone source should be a store path, got: $zone_src" +fi + +# Test: Zone attribute set - verify individual attributes +zone_outpath=$(tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \ + '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").outPath') +echo "Zone outPath: $zone_outpath" +[[ -n "$zone_outpath" ]] || fail "Zone should have outPath" + +zone_treeSha=$(tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \ + '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").treeSha') +echo "Zone treeSha: $zone_treeSha" +[[ -n "$zone_treeSha" ]] || fail "Zone should have treeSha" + +zone_zonePath=$(tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \ + '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").zonePath') +echo "Zone zonePath: $zone_zonePath" +[[ "$zone_zonePath" == "//areas/tools/dev" ]] || fail "Zone zonePath should be //areas/tools/dev, got: $zone_zonePath" + +zone_dirty=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \ + '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").dirty') +echo "Zone dirty: $zone_dirty" + +# Verify dirty is false (clean repo) +if [[ "$zone_dirty" == "true" ]]; then + fail "Zone should not be dirty in clean repo" +fi + +echo "Basic tests passed!" 
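Reviewer note (not part of the diff): the `gs://` handling added in `src/libstore/gcs-url.cc` above boils down to a small URL rewrite plus a `?write=true` flag that only selects the OAuth2 scope. A minimal Python sketch of that mapping, with an illustrative `gs_to_https` helper name:

```python
# Sketch of the gs:// -> https://storage.googleapis.com mapping
# implemented by ParsedGcsURL::parse / toHttpsUrl in this diff.
from urllib.parse import urlsplit, parse_qs

def gs_to_https(url: str) -> tuple[str, bool]:
    parts = urlsplit(url)
    if parts.scheme != "gs":
        raise ValueError(f"URI scheme '{parts.scheme}' is not 'gs'")
    bucket = parts.netloc
    if not bucket:
        raise ValueError("URI has a missing or invalid bucket name")
    key = parts.path.lstrip("/")
    if not key:
        raise ValueError("URI has a missing or invalid key")
    # ?write=true only requests the read/write OAuth2 scope; it does not
    # change the resulting HTTPS URL.
    writable = parse_qs(parts.query).get("write", ["false"])[0] == "true"
    return f"https://storage.googleapis.com/{bucket}/{key}", writable
```

This mirrors the C++ test expectations earlier in the diff, e.g. `gs://cache/path/to/nar.xz?write=true` maps to `https://storage.googleapis.com/cache/path/to/nar.xz` with the writable flag set.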
diff --git a/tests/functional/tectonix/common.sh b/tests/functional/tectonix/common.sh
new file mode 100644
index 000000000000..63946cdae7d9
--- /dev/null
+++ b/tests/functional/tectonix/common.sh
@@ -0,0 +1,110 @@
+# shellcheck shell=bash
+
+# Common setup for tectonix functional tests
+
+set -eu -o pipefail
+
+if [[ -z "${TECTONIX_COMMON_SH_SOURCED-}" ]]; then
+
+TECTONIX_COMMON_SH_SOURCED=1
+
+# Source the main test framework
+source "$(dirname "${BASH_SOURCE[0]}")/../common.sh"
+
+requireGit
+
+# Create a test world repository with zones
+create_test_world() {
+    local dir="$1"
+
+    # Initialize git repo
+    git init "$dir"
+    cd "$dir"
+
+    # Create directory structure
+    mkdir -p .meta
+    mkdir -p areas/tools/dev
+    mkdir -p areas/tools/tec
+    mkdir -p areas/platform/core
+
+    # Create manifest
+    cat > .meta/manifest.json << 'MANIFEST_EOF'
+{
+  "//areas/tools/dev": { "id": "W-000001" },
+  "//areas/tools/tec": { "id": "W-000002" },
+  "//areas/platform/core": { "id": "W-000003" }
+}
+MANIFEST_EOF
+
+    # Create zone files
+    echo '{ }' > areas/tools/dev/zone.nix
+    echo 'Dev zone README' > areas/tools/dev/README.md
+    echo '{ }' > areas/tools/tec/zone.nix
+    echo '{ }' > areas/platform/core/zone.nix
+    echo 'Test World' > README.md
+
+    # Configure git
+    git config user.email "test@example.com"
+    git config user.name "Test User"
+
+    # Create sparse-checkout-roots so dirty zone detection works
+    mkdir -p .git/info
+    cat > .git/info/sparse-checkout-roots << 'SPARSE_EOF'
+W-000001
+W-000002
+W-000003
+SPARSE_EOF
+
+    # Commit everything
+    git add -A
+    git commit -m "Initial commit"
+
+    cd - > /dev/null
+}
+
+# Get the HEAD SHA of a repo
+get_head_sha() {
+    local dir="$1"
+    git -C "$dir" rev-parse HEAD
+}
+
+# Evaluate a nix expression with tectonix settings
+tectonix_eval() {
+    local git_dir="$1"
+    local git_sha="$2"
+    local expr="$3"
+    shift 3
+
+    nix eval --raw \
+        --extra-experimental-features 'nix-command' \
+        --option tectonix-git-dir "$git_dir" \
+        --option tectonix-git-sha "$git_sha" \
+        "$@" \
+        --expr "$expr"
+}
+
+# Evaluate with JSON output
+tectonix_eval_json() {
+    local git_dir="$1"
+    local git_sha="$2"
+    local expr="$3"
+    shift 3
+
+    nix eval --json \
+        --extra-experimental-features 'nix-command' \
+        --option tectonix-git-dir "$git_dir" \
+        --option tectonix-git-sha "$git_sha" \
+        "$@" \
+        --expr "$expr"
+}
+
+# Expect a command to fail
+expect_failure() {
+    if "$@" 2>/dev/null; then
+        echo "Expected command to fail: $*" >&2
+        return 1
+    fi
+    return 0
+}
+
+fi # TECTONIX_COMMON_SH_SOURCED
diff --git a/tests/functional/tectonix/deduplication.sh b/tests/functional/tectonix/deduplication.sh
new file mode 100644
index 000000000000..c09c9eb73b5f
--- /dev/null
+++ b/tests/functional/tectonix/deduplication.sh
@@ -0,0 +1,55 @@
+#!/usr/bin/env bash
+# Test that same tree SHA across commits returns same path (deduplication)
+
+source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
+
+# Create test world
+TEST_WORLD="$TEST_ROOT/world"
+create_test_world "$TEST_WORLD"
+
+cd "$TEST_WORLD"
+
+# Get first commit SHA
+SHA1=$(git rev-parse HEAD)
+echo "First commit: $SHA1"
+
+# Make a commit that doesn't touch //areas/tools/dev
+echo "Other content" >> README.md
+git add README.md
+git commit -m "Update README only"
+
+# Get second commit SHA
+SHA2=$(git rev-parse HEAD)
+echo "Second commit: $SHA2"
+
+cd - > /dev/null
+
+# Get tree SHAs for both commits
+tree_sha1=$(tectonix_eval "$TEST_WORLD/.git" "$SHA1" \
+    'builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev"')
+echo "Tree SHA at commit 1: $tree_sha1"
+
+tree_sha2=$(tectonix_eval "$TEST_WORLD/.git" "$SHA2" \
+    'builtins.unsafeTectonixInternalTreeSha "//areas/tools/dev"')
+echo "Tree SHA at commit 2: $tree_sha2"
+
+# Tree SHAs should be identical since we didn't modify that zone
+if [[ "$tree_sha1" != "$tree_sha2" ]]; then
+    fail "Tree SHA should be same across commits for unchanged zone"
+fi
+
+# Get zone paths for both commits
+path1=$(tectonix_eval "$TEST_WORLD/.git" "$SHA1" \
+    'builtins.unsafeTectonixInternalZoneSrc "//areas/tools/dev"')
+echo "Zone path at commit 1: $path1"
+
+path2=$(tectonix_eval "$TEST_WORLD/.git" "$SHA2" \
+    'builtins.unsafeTectonixInternalZoneSrc "//areas/tools/dev"')
+echo "Zone path at commit 2: $path2"
+
+# Should be same path due to tree SHA deduplication
+if [[ "$path1" != "$path2" ]]; then
+    fail "Expected same path for unchanged zone across commits"
+fi
+
+echo "Deduplication test passed!"
diff --git a/tests/functional/tectonix/dirty-zones.sh b/tests/functional/tectonix/dirty-zones.sh
new file mode 100644
index 000000000000..7670b128a1cc
--- /dev/null
+++ b/tests/functional/tectonix/dirty-zones.sh
@@ -0,0 +1,57 @@
+#!/usr/bin/env bash
+# Test dirty zone detection
+
+source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
+
+# Create test world
+TEST_WORLD="$TEST_ROOT/world"
+create_test_world "$TEST_WORLD"
+HEAD_SHA=$(get_head_sha "$TEST_WORLD")
+
+echo "Testing dirty zone detection..."
+
+# First, verify zone is clean
+dirty_status=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \
+    '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").dirty' \
+    --option tectonix-checkout-path "$TEST_WORLD")
+echo "Clean zone dirty status: $dirty_status"
+
+if [[ "$dirty_status" == "true" ]]; then
+    fail "Zone should be clean before modification"
+fi
+
+# Modify a file in the zone
+echo "Modified content" >> "$TEST_WORLD/areas/tools/dev/zone.nix"
+
+# Now check dirty status
+dirty_status_after=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \
+    '(builtins.unsafeTectonixInternalZone "//areas/tools/dev").dirty' \
+    --option tectonix-checkout-path "$TEST_WORLD")
+echo "Dirty zone dirty status: $dirty_status_after"
+
+if [[ "$dirty_status_after" != "true" ]]; then
+    fail "Zone should be dirty after modification"
+fi
+
+# Check dirtyZones builtin
+dirty_zones=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \
+    'builtins.unsafeTectonixInternalDirtyZones' \
+    --option tectonix-checkout-path "$TEST_WORLD")
+echo "Dirty zones: $dirty_zones"
+
+# Verify the modified zone appears as dirty
+if ! echo "$dirty_zones" | grepQuiet "//areas/tools/dev"; then
+    fail "Modified zone should appear in dirtyZones"
+fi
+
+# Verify an unmodified zone is not dirty
+clean_dirty_status=$(tectonix_eval_json "$TEST_WORLD/.git" "$HEAD_SHA" \
+    '(builtins.unsafeTectonixInternalZone "//areas/tools/tec").dirty' \
+    --option tectonix-checkout-path "$TEST_WORLD")
+echo "Unmodified zone dirty status: $clean_dirty_status"
+
+if [[ "$clean_dirty_status" == "true" ]]; then
+    fail "Unmodified zone should not be dirty"
+fi
+
+echo "Dirty zone tests passed!"
diff --git a/tests/functional/tectonix/errors.sh b/tests/functional/tectonix/errors.sh
new file mode 100644
index 000000000000..e698b0e1e629
--- /dev/null
+++ b/tests/functional/tectonix/errors.sh
@@ -0,0 +1,72 @@
+#!/usr/bin/env bash
+# Error handling tests for tectonix builtins
+
+source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
+
+# Create test world
+TEST_WORLD="$TEST_ROOT/world"
+create_test_world "$TEST_WORLD"
+HEAD_SHA=$(get_head_sha "$TEST_WORLD")
+
+echo "Testing error handling..."
+
+# Test: Missing git-dir setting
+echo "Testing missing git-dir..."
+expect_failure nix eval --json \
+    --extra-experimental-features 'nix-command' \
+    --option tectonix-git-sha "$HEAD_SHA" \
+    --expr 'builtins.unsafeTectonixInternalManifest'
+
+# Test: Invalid SHA
+echo "Testing invalid SHA..."
+expect_failure nix eval --json \
+    --extra-experimental-features 'nix-command' \
+    --option tectonix-git-dir "$TEST_WORLD/.git" \
+    --option tectonix-git-sha "0000000000000000000000000000000000000000" \
+    --expr 'builtins.unsafeTectonixInternalManifest'
+
+# Test: Missing git-sha
+echo "Testing missing git-sha..."
+expect_failure nix eval --json \
+    --extra-experimental-features 'nix-command' \
+    --option tectonix-git-dir "$TEST_WORLD/.git" \
+    --expr 'builtins.unsafeTectonixInternalManifest'
+
+# Test: Non-zone path (parent of zone)
+echo "Testing non-zone path (parent)..."
+expect_failure tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \
+    'builtins.unsafeTectonixInternalZoneSrc "//areas/tools"'
+
+# Test: Non-zone path (subpath of zone)
+echo "Testing non-zone path (subpath)..."
+expect_failure tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \
+    'builtins.unsafeTectonixInternalZoneSrc "//areas/tools/dev/subdir"'
+
+# Test: Non-existent path
+echo "Testing non-existent path..."
+expect_failure tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \
+    'builtins.unsafeTectonixInternalZoneSrc "//does/not/exist"'
+
+# Test: Invalid tree SHA for __unsafeTectonixInternalTree
+echo "Testing invalid tree SHA..."
+expect_failure tectonix_eval "$TEST_WORLD/.git" "$HEAD_SHA" \
+    'builtins.unsafeTectonixInternalTree "0000000000000000000000000000000000000000"'
+
+# Test: Tree access works without git SHA
+echo "Testing tree access without git SHA..."
+TREE_SHA=$(git -C "$TEST_WORLD" rev-parse 'HEAD^{tree}')
+nix eval --raw \
+    --extra-experimental-features 'nix-command' \
+    --option tectonix-git-dir "$TEST_WORLD/.git" \
+    --expr "builtins.unsafeTectonixInternalTree \"$TREE_SHA\"" > /dev/null
+
+# Test: Non-existent git directory
+echo "Testing non-existent git directory..."
+expect_failure nix eval --json \
+    --extra-experimental-features 'nix-command' \
+    --option tectonix-git-dir "/nonexistent/path/.git" \
+    --option tectonix-git-sha "$HEAD_SHA" \
+    --expr 'builtins.unsafeTectonixInternalManifest'
+
+echo "Error handling tests passed!"
diff --git a/tests/functional/tectonix/meson.build b/tests/functional/tectonix/meson.build
new file mode 100644
index 000000000000..72fd3f982471
--- /dev/null
+++ b/tests/functional/tectonix/meson.build
@@ -0,0 +1,11 @@
+suites += {
+  'name' : 'tectonix',
+  'deps' : [],
+  'tests' : [
+    'basic.sh',
+    'errors.sh',
+    'deduplication.sh',
+    'dirty-zones.sh',
+  ],
+  'workdir' : meson.current_source_dir(),
+}