docs: add Index Authority Receipts for IFC evidence #270
Maurice Witten (blocksifrdev) wants to merge 1 commit into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a880e2b115
    if float(economics.get("storage_reduction_x", 0)) < float(policy.get("min_storage_reduction_x", 0)):
        return "REQUIRE_DENSE_FALLBACK"

    if float(economics.get("single_query_speedup_x", 0)) < float(policy.get("min_single_query_speedup_x", 0)):
Recompute ratios before applying policy thresholds
When verifying a receipt whose raw bytes/latencies don’t match the derived storage_reduction_x/single_query_speedup_x, this branch authorizes based only on the supplied derived numbers. For example, a receipt can set baseline.bytes_per_vector == candidate_bytes_per_vector and equal latencies, but inflate both *_x fields above the policy thresholds and still get ALLOW_INDEX_FIRST. Since those ratios are included alongside their source values in the receipt, the verifier should recompute or at least cross-check them before using them for authorization.
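A minimal sketch of the cross-check suggested here, assuming the verifier can read the raw baseline and candidate measurements from the receipt. The function names, the argument layout, and the 5% tolerance are illustrative assumptions, not part of the PR; the tolerance only allows for rounding in published benchmark values.

```python
import math


def recompute_ratios(baseline_bytes, candidate_bytes, baseline_latency, candidate_latency):
    """Derive both ratios from raw measurements instead of trusting the receipt."""
    if candidate_bytes <= 0 or candidate_latency <= 0:
        raise ValueError("candidate measurements must be positive")
    return {
        "storage_reduction_x": baseline_bytes / candidate_bytes,
        "single_query_speedup_x": baseline_latency / candidate_latency,
    }


def ratio_errors(claimed, derived, rel_tol=0.05):
    """Return one message per claimed ratio that disagrees with its recomputed value.

    rel_tol is an assumed tolerance for rounded inputs, not a value from the PR.
    """
    errors = []
    for key, expected in derived.items():
        got = float(claimed.get(key, 0))
        if not math.isclose(got, expected, rel_tol=rel_tol):
            errors.append(f"{key}: claimed {got}, recomputed {expected:.3f}")
    return errors
```

With a check like this in place, the threshold comparison could run on the recomputed values, so inflating the `*_x` fields alone would no longer be enough to reach `ALLOW_INDEX_FIRST`.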
    args = parser.parse_args()

    data = load_json(args.receipt)
    errors = shape_errors(data)
Reject schema-invalid receipts before computing decisions
The verifier’s only validation here is the custom shape_errors() subset, so receipts that violate the checked-in JSON schema either crash later or are still authorized. For example, omitting the schema-required decision.policy raises a traceback in compute_decision, while strings for numeric fields or extra properties are accepted even though the schema rejects them. Since this command is the documented verifier for machine-readable receipts, it should run full schema validation or mirror the required/type/additionalProperties checks before computing authorization.
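One way to mirror the required/type/additionalProperties checks without a third-party dependency is sketched below. `check_object`, `NUMBER`, and `SPEC` are hypothetical names, and `SPEC` covers only the fields quoted in this review, not the full checked-in schema; a real verifier would either extend the spec to every schema property or run a full JSON Schema validator.

```python
NUMBER = (int, float)

# Illustrative subset only: the checked-in schema defines more sections.
SPEC = {
    "economics": {
        "storage_reduction_x": NUMBER,
        "single_query_speedup_x": NUMBER,
    },
    "decision": {
        "policy": {
            "min_storage_reduction_x": NUMBER,
            "min_single_query_speedup_x": NUMBER,
        },
    },
}


def check_object(value, spec, path="receipt"):
    """Report missing keys, unexpected keys, and wrong leaf types.

    spec maps each required key to a type (or tuple of types) for leaf
    values, or to another dict for a nested object.
    """
    if not isinstance(value, dict):
        return [f"{path}: expected object"]
    errors = []
    for key in spec:
        if key not in value:
            errors.append(f"{path}.{key}: missing required field")
    for key, item in value.items():
        if key not in spec:
            errors.append(f"{path}.{key}: unexpected property")
            continue
        expected = spec[key]
        if isinstance(expected, dict):
            errors.extend(check_object(item, expected, f"{path}.{key}"))
        elif not isinstance(item, expected) or (isinstance(item, bool) and expected is not bool):
            # bool is a subclass of int in Python, so reject it for numeric fields.
            errors.append(f"{path}.{key}: wrong type")
    return errors
```

Running this before `compute_decision` and exiting on a non-empty error list would turn the missing `decision.policy` case into a reported validation error rather than a traceback.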
    if float(economics.get("single_query_speedup_x", 0)) < float(policy.get("min_single_query_speedup_x", 0)):
        return "REQUIRE_DENSE_FALLBACK"

    return "ALLOW_INDEX_FIRST"
Handle the HNSW-comparison decision state
The schema and docs advertise REQUIRE_HNSW_COMPARISON as a valid decision, but compute_decision has no path that can return it; after the existing checks pass, every receipt falls through to ALLOW_INDEX_FIRST. A receipt for the documented regime where graph/ANN comparison is required will therefore always fail with a decision mismatch, so the verifier needs a policy/scope predicate for that state or the state should not be accepted as valid.
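A sketch of how a reachable `REQUIRE_HNSW_COMPARISON` path might look. The `scope` argument and its `hnsw_comparison_required` and `hnsw_compared` keys are hypothetical; the PR does not show which receipt fields describe the graph/ANN regime, so the predicate here only illustrates where such a check would sit relative to the existing ones.

```python
def compute_decision(economics, policy, scope):
    """Return a decision string; the scope predicate below is an assumed example."""
    if float(economics.get("storage_reduction_x", 0)) < float(policy.get("min_storage_reduction_x", 0)):
        return "REQUIRE_DENSE_FALLBACK"

    if float(economics.get("single_query_speedup_x", 0)) < float(policy.get("min_single_query_speedup_x", 0)):
        return "REQUIRE_DENSE_FALLBACK"

    # Hypothetical predicate: the benchmark scope calls for a graph/ANN
    # comparison that the evidence does not yet include.
    if scope.get("hnsw_comparison_required", False) and not scope.get("hnsw_compared", False):
        return "REQUIRE_HNSW_COMPARISON"

    return "ALLOW_INDEX_FIRST"
```

The alternative named in the review, dropping the state from the schema and docs, avoids the new predicate entirely but also removes the documented regime.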
    ifc = data["ifc"]
    evidence = data["evidence"]
    economics = data["economics"]
    policy = data["decision"]["policy"]
Require verifier-owned acceptance policy
Because the verifier reads the policy thresholds from the receipt being evaluated, a schema-valid receipt can authorize itself by lowering min_storage_reduction_x/min_single_query_speedup_x to zero or disabling the quality requirement, even when the reported speedup and storage reduction are below any meaningful bar. For an authorization verifier, these acceptance rules need to come from the verifier configuration or fixed minimums rather than the untrusted evidence packet itself.
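One possible shape for a verifier-owned policy is sketched below: the verifier holds fixed minimums, and a receipt may tighten a threshold but never loosen it. `VERIFIER_MINIMUMS`, its numeric values, and `effective_policy` are assumptions for illustration; the PR does not define them, and a quality requirement would need the same treatment.

```python
# Hypothetical floors owned by the verifier configuration, not by the receipt.
VERIFIER_MINIMUMS = {
    "min_storage_reduction_x": 2.0,
    "min_single_query_speedup_x": 1.5,
}


def effective_policy(receipt_policy, minimums=VERIFIER_MINIMUMS):
    """Combine receipt thresholds with verifier floors, keeping the stricter one."""
    effective = {}
    for key, floor in minimums.items():
        claimed = float(receipt_policy.get(key, 0))
        effective[key] = max(floor, claimed)
    return effective
```

Passing `effective_policy(data["decision"]["policy"])` to the decision logic, instead of the raw receipt policy, means a receipt that sets both minimums to zero is still held to the verifier's floors.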
Signed-off-by: blocksifrdev <maurice@blocksifr.com>
a880e2b to d302dd7
Thank you for your contribution! We will review the PR as soon as we have the bandwidth. 🙏🏻
Summary
Adds an optional CAIF-style Index Authority Receipt for ordvec benchmark evidence.
The goal is to make ordvec's index-first retrieval evidence machine-readable: quality delta, bytes/vector, latency regime, benchmark scope, limitations, fallback conditions, and a deterministic receipt hash.
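For readers unfamiliar with deterministic receipt hashes, here is a minimal sketch of one common approach: serialize the receipt to canonical JSON (sorted keys, fixed separators) and hash the bytes with SHA-256. This is an assumption about the general technique, not a description of the scheme in this PR; the `receipt_hash` field name and the choice to exclude it from the hashed body are hypothetical.

```python
import hashlib
import json


def receipt_hash(receipt, hash_field="receipt_hash"):
    """Hash a receipt so that key order and whitespace do not affect the result."""
    # Exclude the stored hash itself so the digest can be embedded afterwards.
    body = {k: v for k, v in receipt.items() if k != hash_field}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A verifier following this approach would recompute the digest from the receipt body and compare it with the stored value before evaluating any thresholds.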
Why
ordvec already has a strong index-first compute story: compressed ordinal/sign retrieval can preserve retrieval quality under stated benchmark scopes while reducing storage and latency.
This PR adds a small evidence packet and verifier so downstream systems can answer:
What this includes
- docs/INDEX_AUTHORITY_RECEIPTS.md
- schemas/caif/ordvec-index-authority.v0.1.schema.json
- examples/caif/trec-covid-sign-rq2.index-authority.json
- tools/verify_index_authority.py
What this does not do
- Cargo.toml
Verification
Expected output includes:
Scope
The example uses existing public README benchmark values and preserves the stated limitations around dataset, encoder, corpus size, batch/threading regime, HNSW comparison, and larger-corpus claims.
Framing
Benchmarks should not only report performance.
They should authorize compute paths within a defined evidence envelope.