feat(verify): add provider enumeration mode to ans-verify #32
New `ans-verify list -provider <host>` subcommand walks the log's
entry tiles via the tlog-tiles spec, decodes V1/V2 producer
envelopes, and reports every agent whose `agent.host` falls under
the given suffix. Optional flags collapse to currently-live agents,
verify each match's SCITT receipt, and bound concurrency. No
server-side changes; this is purely a new CLI surface on the
existing cmd/ans-verify binary.
The walker fails closed on every checkable property:
* Checkpoint signature verified against /root-keys via
logstore.VerifyC2SPECDSA before any tile fetch (omission-attack
guard against a tampered logSize).
* Receipt payload cross-checked byte-for-byte against the tile
leaf bytes during -verify (leaf-substitution guard).
* agentId interpolated into URLs only after passing a UUID guard.
* All HTTP bodies read through io.LimitReader at 32 MiB.
* TL-supplied strings printed via %q so embedded ANSI/newlines
cannot spoof CLI output.
The existing single-agent ans-verify [-agent] <uuid> form is
unchanged.
Signed-off-by: Layer8 <NWillAU900@gmail.com>
Pull request overview
Adds a new ans-verify list -provider <host> subcommand to the existing offline verifier CLI, enabling client-side enumeration of agents under a provider suffix by walking transparency-log entry tiles (optionally collapsing to “live” state and verifying SCITT receipts).
Changes:
- Introduces a tile-walker that fetches/decodes entry bundles, filters by provider host suffix, and optionally reduces results to “live” agents.
- Adds checkpoint-note parsing + signature verification against /root-keys, and optional per-match receipt verification with a leaf-substitution cross-check.
- Adds comprehensive unit/integration-style tests for helper logic and the walker/verification flows, plus minimal subcommand dispatch in main.
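The provider-suffix filter in the first bullet can be sketched as follows, assuming the rule stated in the PR description: exact match or strict subdomain, case-insensitive, trailing-dot tolerant. `matchesProvider` is a hypothetical helper name; the real function in `walk.go` may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesProvider reports whether host equals suffix or is a strict
// subdomain of it. The comparison is case-insensitive and ignores one
// trailing dot on either argument. Hypothetical helper for illustration.
func matchesProvider(host, suffix string) bool {
	h := strings.ToLower(strings.TrimSuffix(host, "."))
	s := strings.ToLower(strings.TrimSuffix(suffix, "."))
	if h == "" || s == "" {
		return false
	}
	// Requiring the "." label boundary is what rejects evilsuffix.com
	// when the provider is suffix.com.
	return h == s || strings.HasSuffix(h, "."+s)
}

func main() {
	for _, h := range []string{"suffix.com", "a.Suffix.com.", "evilsuffix.com"} {
		fmt.Printf("%q -> %v\n", h, matchesProvider(h, "suffix.com"))
	}
}
```

A bare `strings.HasSuffix(host, suffix)` would accept `evilsuffix.com`; prepending the dot restricts matches to whole DNS labels.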
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cmd/ans-verify/walk.go | Implements provider enumeration (list), tile walking, live-reduction, checkpoint verification, and optional receipt verification. |
| cmd/ans-verify/walk_test.go | Adds tests for provider matching, tile walking, checkpoint note verification, response caps, and receipt verification behaviors. |
| cmd/ans-verify/main.go | Adds subcommand dispatch to route ans-verify list ... to the new implementation without changing the existing single-agent path. |
    entries, err := decodeEntryBundle(raw)
    if err != nil {
        wrapped := fmt.Errorf("decode %s: %w", path, err)
        results[tileIdx] = tileResult{err: wrapped}
        recordErr(wrapped)
        continue
    }
    results[tileIdx] = tileResult{entries: entries}

    for tileIdx := range nTiles {
        jobs <- tileIdx
    }
    close(jobs)
    wg.Wait()

    fs := flag.NewFlagSet("list", flag.ExitOnError)
    var (
        baseURL     string
        provider    string
        live        bool
        doVerify    bool
        concurrency int
    )
    fs.StringVar(&baseURL, "url", "http://localhost:18081",
        "Base URL of the transparency log")
    fs.StringVar(&provider, "provider", "",
        "Provider host suffix to filter on (e.g. darknetian.com)")
    fs.BoolVar(&live, "live", true,
        "Collapse to one row per agent and drop revoked/deprecated agents")
    fs.BoolVar(&doVerify, "verify", false,
        "After listing, fetch and verify each matched agent's SCITT receipt")
    fs.IntVar(&concurrency, "concurrency", 8,
        "Number of parallel HTTP workers (1-64)")
    if err := fs.Parse(args); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
…ware producer, flag handling
Three findings from the automated review on PR godaddy#32:
1. Tile-size validation. After decoding an entry bundle, assert the leaf count matches the expected width: EntryBundleWidth (256) for a full tile, or the path's `.p/<N>` width for a partial tile. The checkpoint signature binds the tree shape but not the contents of any individual tile; without this guard a hostile or buggy TL can serve a truncated bundle (omitting leaves) or an oversized one (injecting extras) and the walker would silently accept it, undermining the "fail closed" property even after the checkpoint passes. Two new regression tests pin both directions.
2. Producer respects cancellation. The producer loop now selects on `wctx.Done()` between sends, so once a worker records the first error and cancels the context, the producer breaks out of the enqueue loop instead of pushing every remaining tile index. For a large log this avoids significant churn after the first failure. Workers still drain whatever is already in the channel via their existing wctx.Err() check, so close(jobs) never deadlocks.
3. Flag-parse handling. flag.NewFlagSet was constructed with flag.ExitOnError, which calls os.Exit(2) internally before Parse returns, so the `if err := fs.Parse(...); err != nil` block was dead code. Switched to flag.ContinueOnError with an explicit fs.Usage so a parse failure prints consistent usage text and exits 1, matching the rest of the binary's error handling.
Signed-off-by: Layer8 <NWillAU900@gmail.com>
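The tile-size check from the first finding can be sketched as below. This assumes the tlog-tiles convention that a partial tile's path ends in `.p/<N>` with N between 1 and 255; `expectedWidth`, `checkWidth`, and the example paths are hypothetical and not taken from `walk.go`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// entryBundleWidth is the full-tile width named in the commit message.
const entryBundleWidth = 256

// expectedWidth returns the leaf count a tile path promises: the <N> of a
// trailing ".p/<N>" for a partial tile, otherwise the full width.
func expectedWidth(path string) (int, error) {
	i := strings.LastIndex(path, ".p/")
	if i < 0 {
		return entryBundleWidth, nil
	}
	n, err := strconv.Atoi(path[i+len(".p/"):])
	if err != nil || n < 1 || n >= entryBundleWidth {
		return 0, fmt.Errorf("bad partial width in %q", path)
	}
	return n, nil
}

// checkWidth fails closed when a decoded bundle holds fewer or more leaves
// than its path promises.
func checkWidth(path string, got int) error {
	want, err := expectedWidth(path)
	if err != nil {
		return err
	}
	if got != want {
		return fmt.Errorf("tile %q: got %d leaves, want %d", path, got, want)
	}
	return nil
}

func main() {
	fmt.Println(checkWidth("tile/entries/000", 256))     // full tile, correct
	fmt.Println(checkWidth("tile/entries/001.p/5", 4))   // truncated partial
	fmt.Println(checkWidth("tile/entries/001.p/5", 6))   // oversized partial
}
```

Checking for equality rather than an upper bound covers both directions the commit mentions: omitted leaves and injected extras.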
Thanks for the review; all three are good catches. Addressed:
1. Tile-size validation
2. Producer respects cancellation
3. Flag handling
Diff: nicknacnic:feat/ans-verify-list-provider → d7c288f
Thanks for the contribution! This PR actually surfaced a bug in the checkpoint signing: the emitted signatures differed from the intended signature format. Could you please update your PR with the latest changes from main and update the parsing to handle the correct format?
Updates signTestCheckpoint in walk_test.go to produce ASN.1 DER ECDSA signatures, matching the production wire shape restored in godaddy/ans PR godaddy#38 ("fix(tl): emit DER C2SP checkpoint signatures"). VerifyC2SPECDSA on main still accepts IEEE P1363 r||s as a legacy fallback for older local-dev checkpoints, so the previous P1363 fixture continued to pass, but the tests should pin the format verifiers will see in production, not the deprecated one.
Walker production code is unchanged: verifyCheckpointNote delegates ECDSA verification to logstore.VerifyC2SPECDSA, which after PR godaddy#38 transparently accepts DER (primary) and P1363 (legacy). No other adjustments needed.
Signed-off-by: Layer8 <NWillAU900@gmail.com>
Thanks Connor — pulled in The walker's production code didn't need changes: What did need updating was the test fixture — Verified end-to-end against a fresh build of this branch: brought up Branch tip: |
csnitker-godaddy
left a comment
Thanks for getting that updated! The changes look good and I verified they work against the local and production transparency log.
I think we probably need to come back to this CLI sooner rather than later and implement something like cobra / viper to manage the CLI flags and help text, but I can circle back on that when I get a bit of time.
Summary
Adds `ans-verify list -provider <host>`, a client-side enumeration mode that walks the log's entry tiles, decodes V1/V2 producer envelopes, and reports every agent whose `agent.host` falls under the given provider suffix. Optional flags collapse to currently-live agents, verify each match's SCITT receipt, and bound concurrency. No server-side changes; this is purely a new CLI surface on the existing `cmd/ans-verify` binary.
Use case: an operator running a provider domain (an enterprise, a demo zone, a CTF) wants to see every agent the TL has logged under that domain without registering each `agentId` out-of-band. The reference verifier already exposes per-agent reads; this lets an offline verifier reconstruct the by-provider view from the log itself.
What's new
CLI:
- `-provider`: host suffix to match. Exact-match plus strict subdomain (`x.suffix`); rejects substring spoofs like `evilsuffix.com` against `suffix.com`. Case-insensitive, trailing-dot tolerant.
- `-live` (default `true`): collapse to one row per `ansName` keeping the latest leaf, drop `AGENT_REVOKED` and `AGENT_DEPRECATED`. Unknown event types are kept (forward-compat against future active states).
- `-verify` (default `false`): for each match, fetch and verify the SCITT receipt.
- `-concurrency N` (default 8, clamped 1–64): parallelism for tile fetches and verify fetches.
The existing single-agent `ans-verify [-agent] <uuid>` form is unchanged.
Security properties
The walker fails closed on every checkable property. The threat model assumes the TL endpoint could be hostile or compromised: the verifier is the trust anchor, not the server.
- Checkpoint signature is verified before any tile fetch. A new `verifiedCheckpoint` fetches `/checkpoint` (raw signed note), parses it, and verifies the C2SP ECDSA-P256 signature against `/root-keys` via `logstore.VerifyC2SPECDSA`. Without this step, a hostile TL could lie about `logSize` and hide tiles containing agents the attacker wants to omit. The pure parser is `verifyCheckpointNote`.
- Leaf-substitution guard. When `-verify` is on, the walker stores the JCS-canonical envelope bytes it read from each tile. After receipt verification, it asserts `bytes.Equal(receipt.ExtractPayload(rec), m.LeafBytes)`. Mismatch is a hard per-match failure with a clear "possible leaf substitution" error. Without this, a TL could serve a forged tile (fake host claim) plus a real receipt for an unrelated agent and the receipt-only check would pass.
- Path-injection guard on `agentId`. The agentId on each match comes from a TL leaf the verifier doesn't fully trust. Anything that isn't a UUID is refused before being interpolated into the receipt URL.
- Response-size cap. All HTTP bodies are read through `io.LimitReader(body, 32 MiB+1)`; oversize responses are rejected, not truncated.
- Log-injection-safe output. All TL-supplied strings (`ansName`, `host`, `eventType`, `agentId`) are printed via `%q`, so ANSI escapes and embedded newlines cannot spoof CLI output.
- External context cancellation surfaces an error. A cancelled or timed-out parent context returns the ctx error rather than a silently truncated match list. The concurrent fetcher captures the first triggering error atomically so the user sees the root cause, not whichever tile happened to be lowest-indexed at the moment of cancel.
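As an illustration of the path-injection guard, a minimal sketch that refuses any `agentId` that is not a canonical UUID before it reaches a URL. The helper name `receiptURL` and the `/v1/agents/%s/receipt` route are placeholders invented for this example, not the real endpoint.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRE matches the canonical 8-4-4-4-12 hex form and nothing else.
// In Go's regexp, ^ and $ anchor to the whole input unless (?m) is set,
// so a trailing newline is rejected too.
var uuidRE = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// receiptURL builds a receipt URL only for IDs that pass the UUID guard.
// The path layout is a placeholder for this sketch.
func receiptURL(base, agentID string) (string, error) {
	if !uuidRE.MatchString(agentID) {
		// %q keeps control characters in the untrusted value visible
		// and inert when the error is printed.
		return "", fmt.Errorf("refusing non-UUID agentId %q", agentID)
	}
	return fmt.Sprintf("%s/v1/agents/%s/receipt", base, agentID), nil
}

func main() {
	fmt.Println(receiptURL("http://localhost:18081", "123e4567-e89b-12d3-a456-426614174000"))
	fmt.Println(receiptURL("http://localhost:18081", "../../root-keys"))
}
```

An allow-list pattern is simpler to reason about than escaping: anything containing `/`, `.`, `?`, or whitespace never gets as far as the URL builder.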
Out of scope
- `ans-verify <uuid>` single-agent path still fetches its receipt directly without first verifying a checkpoint. That's a parallel gap worth closing in a follow-up; this PR doesn't change that path.
- `EQUIVALENCE_LINK` events (PR [AI assisted] feat(event): EquivalenceLink V2 event for cross-anchor binding #20) are deliberately skipped by `extractAgentIdentity` since they carry no `agent` block. A future `-include-links` mode could surface them; not needed today.
Files touched
- `cmd/ans-verify/walk.go` (new): walker, dedup, verifier, checkpoint parser, `list` subcommand.
- `cmd/ans-verify/walk_test.go` (new): unit coverage for every helper plus integration of walker + verifier against `httptest.Server` fixtures.
- `cmd/ans-verify/main.go`: three-line subcommand dispatch at the top of `main()`. No other changes to existing behavior.
Test plan
- `make check` passes (`fmt`, `vet`, `golangci-lint`, `test-cover`).
- `go test -race ./cmd/ans-verify/...` passes.
- `context.Cancel` surfaces a non-nil error rather than partial results.
- `scripts/demo/start.sh` + `register.sh`, ran `list` and `list -verify`, revoked one and confirmed live-mode reduction drops it, confirmed raw mode preserves the full lifecycle. Checkpoint signature verified on every invocation, all receipts verified including leaf-substitution cross-check.
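The live-mode reduction exercised in the last item can be sketched roughly as follows. This assumes "latest leaf" means the highest log index per `ansName` and that an agent is dropped when its latest event is a revocation or deprecation. The `leaf` struct, `reduceLive`, and the `AGENT_REGISTERED` and `FUTURE_EVENT` names are hypothetical; only `AGENT_REVOKED` and `AGENT_DEPRECATED` come from the PR description.

```go
package main

import "fmt"

// leaf is a hypothetical, trimmed-down view of one decoded log entry.
type leaf struct {
	Index     int // position in the log; higher means later
	AnsName   string
	EventType string
}

// reduceLive keeps the latest leaf per AnsName, then drops names whose
// latest event is a revocation or deprecation. Unknown event types are
// kept, so a future "active" state is not hidden by an older client.
func reduceLive(leaves []leaf) []leaf {
	latest := map[string]leaf{}
	var order []string // first-seen order, for stable output
	for _, l := range leaves {
		prev, seen := latest[l.AnsName]
		if !seen {
			order = append(order, l.AnsName)
		}
		if !seen || l.Index > prev.Index {
			latest[l.AnsName] = l
		}
	}
	var live []leaf
	for _, name := range order {
		switch l := latest[name]; l.EventType {
		case "AGENT_REVOKED", "AGENT_DEPRECATED":
			// Latest state is terminal: not live.
		default:
			live = append(live, l)
		}
	}
	return live
}

func main() {
	rows := reduceLive([]leaf{
		{0, "agent-a", "AGENT_REGISTERED"}, // hypothetical event name
		{1, "agent-b", "AGENT_REGISTERED"},
		{2, "agent-a", "AGENT_REVOKED"},
		{3, "agent-c", "FUTURE_EVENT"}, // unknown type, kept
	})
	for _, r := range rows {
		fmt.Printf("%q %q\n", r.AnsName, r.EventType)
	}
}
```

In this example `agent-a` disappears because its latest leaf is a revocation, while `agent-b` and the unknown-typed `agent-c` remain; raw mode would instead return all four leaves.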