Agent install serves a STALE manifest from the cached main.tar.gz when a manifest change leaves registry-index.json untouched (residual of #254) #270

Description

@pawellisowski

Summary

A manifest change made inside the rolling main.tar.gz (e.g. flipping a status: field) does not bust the agent-install tarball cache, because the cache key only rotates when registry-index.json changes. So users who cached main.tar.gz before the manifest change keep installing the stale manifest indefinitely (no TTL on the tarball cache). This is the residual case that #254's content-fingerprint fix does not cover.

Concrete repro (reproduced live)

#269 flipped 20-agents/_core/file/manifest.yaml status: planned → available (2026-06-25) but did not touch registry-index.json (last changed by #257, 2026-06-18). On a machine whose ~/.aware/cache/agents/tarball-<sha>.tar.gz was cached on 2026-06-18:

$ aware agent install file
✓ installed file
$ grep '^status:' ~/.aware/agents/file/manifest.yaml
status: planned            # ← STALE (main has `available`)
$ aware app run <app-with-a-file/write-node>
error: node "w" references agent "file", which is declared but not yet runnable
       (no shipped/installable transport binary)
error: validation failed: [E_APP_AGENT_UNAVAILABLE]

# purge the URL+fingerprint-keyed tarball cache, reinstall:
$ rm ~/.aware/cache/agents/tarball-*.tar.gz && aware agent uninstall file && aware agent install file
$ grep '^status:' ~/.aware/agents/file/manifest.yaml
status: available          # ← correct; app run then works

The file/write/write-csv verbs are wired in-process (#268) and aware agent invoke file write-csv works — only aware app run's validate gate rejects the stale-planned manifest. So the binary is fine; the manifest delivery is stale.

Root cause (file:line)

  • cli/src/install/registry.rs:58 — tarball cache file = tarball-{sha256(tarball_url + index.snapshot_fingerprint())}.tar.gz.
  • cli/src/registry/index.rs, snapshot_fingerprint(): hashes the serialized index (registry-index.json). A change inside main.tar.gz that leaves registry-index.json byte-identical produces the same fingerprint → same cache filename → the stale archive is reused (registry.rs:63 if cache_file.is_file() { copy }).
  • The tarball cache has no TTL (unlike the index/catalog, which carry a 1h TTL in cli/src/registry/fetch.rs:19).

The code comment at registry.rs:48-57 already anticipates this: updated-at is "the registry's manual refresh lever" — but nothing bumps it when a manifest inside the tarball changes, so the lever is never pulled.
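
To make the root cause concrete, here is a minimal Rust sketch of the key derivation described above. It is not the code in registry.rs: the function bodies are reconstructed from this description, and std's DefaultHasher stands in for sha256 so the snippet has no dependencies. What it illustrates is that the tarball's contents are never an input to the cache filename.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in digest (ASSUMPTION: the real code uses sha256; this uses std's
/// DefaultHasher only to keep the sketch dependency-free).
fn digest_hex(parts: &[&str]) -> String {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        part.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}

/// Hypothetical mirror of snapshot_fingerprint(): it depends only on the
/// serialized registry-index.json, not on anything inside main.tar.gz.
fn snapshot_fingerprint(index_json: &str) -> String {
    digest_hex(&[index_json])
}

/// Hypothetical mirror of the cache filename built at registry.rs:58:
/// tarball-{digest(tarball_url + fingerprint)}.tar.gz
fn cache_file_name(tarball_url: &str, index_json: &str) -> String {
    let fingerprint = snapshot_fingerprint(index_json);
    let key = digest_hex(&[tarball_url, fingerprint.as_str()]);
    format!("tarball-{}.tar.gz", key)
}
```

With the same URL and a byte-identical index string, cache_file_name returns the same filename no matter what changed inside the archive. Only an index change (such as a bumped updated-at) rotates it.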

Impact

Any manifest edit that ships via the rolling main.tar.gz without an accompanying registry-index.json change (status flips, description/keyword fixes, input/output schema tweaks) silently fails to reach users with a warm cache. The file agent is a live example — it breaks file/write apps with a confusing E_APP_AGENT_UNAVAILABLE even though the runtime supports the verb.

Suggested fixes (pick one)

  1. Auto-bump registry-index.json updated-at in CI/release whenever any 20-agents/**/manifest.yaml changes (or on every release tag) — pulls the documented lever automatically.
  2. Add a TTL to the tarball cache mirroring the index's 1h TTL (fetch.rs:19), so a warm cache self-refreshes (bounded re-download, not per-agent).
  3. Key the cache on the archive's real content (ETag / commit SHA of refs/heads/main) instead of the index fingerprint.
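
Fixes 2 and 3 could look roughly like the following. This is a sketch under assumptions, not a patch: TARBALL_TTL, is_fresh, cached_tarball_is_fresh, and cache_file_name_for_revision are hypothetical names, and how the revision (ETag or commit SHA) is fetched is left out.

```rust
use std::fs;
use std::path::Path;
use std::time::Duration;

/// ASSUMPTION: mirrors the 1h TTL the issue cites for the index/catalog.
const TARBALL_TTL: Duration = Duration::from_secs(60 * 60);

/// Fix 2 (sketch): a cached tarball is reusable only while younger than the TTL.
fn is_fresh(age: Duration, ttl: Duration) -> bool {
    age < ttl
}

/// Fix 2 (sketch): derive the age from the cache file's mtime.
/// A missing file, unreadable metadata, or an mtime in the future all count
/// as "not fresh", which forces a re-download.
fn cached_tarball_is_fresh(cache_file: &Path, ttl: Duration) -> bool {
    fs::metadata(cache_file)
        .and_then(|meta| meta.modified())
        .ok()
        .and_then(|mtime| mtime.elapsed().ok())
        .map_or(false, |age| is_fresh(age, ttl))
}

/// Fix 3 (sketch): name the cache file after the archive's own revision
/// (an ETag or the commit SHA of refs/heads/main) instead of the index
/// fingerprint. Non-alphanumeric characters are dropped so quoted ETags
/// still yield a safe filename.
fn cache_file_name_for_revision(revision: &str) -> String {
    let safe: String = revision
        .chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .collect();
    format!("tarball-{}.tar.gz", safe)
}
```

With either approach the existing if cache_file.is_file() { copy } shortcut would be taken only when the cached archive is still trustworthy. Fix 2 bounds staleness to the TTL, while fix 3 removes it entirely at the cost of one revision lookup per install.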

Mitigation already applied

PR # bumps updated-at now, so the currently stale file caches are busted on the next install once the cached index refreshes (1h TTL). That is a one-off; this issue tracks the systemic fix so the problem cannot recur.

Cross-ref: #254 (the prior fingerprint fix this is a residual of), #268/#269 (the file agent), #243 (the per-agent re-download optimization the shared cache serves).

Labels: bug (Something isn't working), qa-ready (Fixed and merged, awaiting QA verification)