
fix: restore compatibility for core-type extension fallout across tests#293

Merged
jverdicc merged 1 commit into main from codex/fix-all-tests-after-core-type-changes on Mar 1, 2026


@jverdicc jverdicc commented Mar 1, 2026

Motivation

  • Recent core type/flow extensions changed request metadata, holdout/topic budget and scope handling, and the claim reservation lifecycle, which caused many workspace tests to fail.
  • The goal is to audit and update tests and small helper/handler logic so tests reflect the intended production behavior without weakening assertions or changing canonical serialization.

Description

  • Updated tests and helpers to supply required request metadata: request_with_principal now injects a unique x-request-id and authorization header, and epoch/trial fixtures are written via write_epoch_config where needed.
  • Fixed boundary nondeterminism in the ledger's JointLeakagePool test by using a representable overflow increment (1e-12), which avoids sub-ULP rounding issues.
  • Adjusted HTTP preflight tests to include sessionId/agentId in bodies, and fixed revocation test vectors to use 32-byte claim_id values.
  • Made lifecycle and sweeping changes: strict_pln_padding_duration uses lazy then(|| ...); sweep_expired_reservations runs its mutate/append work under a scoped lock; transition_claim_internal allows Uncommitted/Frozen → Stale transitions used by the sweeper.
  • Handler fixes to preserve correct runtime behavior: create_claim_v2 now rolls back newly-created topic pools on reservation failure, and holdout pool initialization honors descriptor-provided holdout_k_bits_budget, holdout_access_credit_budget, and the resolved holdout_pool_scope.
  • Multiple server tests updated to create deterministic test inputs (access credit values, trial config environment setup) and to assert the correct semantics introduced by the core changes.
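
The sub-ULP rounding issue behind the ledger test fix can be illustrated with a minimal sketch. The budget of 1.0, the 1e-17 increment, and the exceeds_budget helper are illustrative assumptions, not values or code taken from JointLeakagePool; only the 1e-12 increment comes from the description above. The point is that an increment smaller than half the spacing between adjacent f64 values is absorbed by rounding, so a boundary check never observes the overflow.

```rust
/// Hypothetical boundary check: true once usage strictly exceeds the budget.
fn exceeds_budget(used: f64, budget: f64) -> bool {
    used > budget
}

fn main() {
    let budget = 1.0_f64;

    // f64::EPSILON (about 2.2e-16) is the gap between 1.0 and the next
    // representable f64, so an increment of 1e-17 is rounded away.
    let sub_ulp = 1e-17_f64;
    assert_eq!(budget + sub_ulp, budget);
    assert!(!exceeds_budget(budget + sub_ulp, budget));

    // 1e-12 is far larger than the gap at 1.0, so the sum is a distinct
    // value and the overflow is observed deterministically.
    let representable = 1e-12_f64;
    assert!(budget + representable > budget);
    assert!(exceeds_budget(budget + representable, budget));

    println!("sub-ULP increment absorbed; 1e-12 increment detected");
}
```

The gap between adjacent f64 values grows with magnitude, so whether a given increment is representable depends on the budget value being tested.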

Testing

  • Ran focused unit/integration tests repeatedly and fixed issues iteratively. Examples include cargo test -p evidenceos-core --test stress_tests boundary_transition_from_active_to_frozen_has_no_off_by_one_leakage (passed) and many daemon tests such as server::tests::create_claim_v2_*, holdout_pool_*, dependency_*, and http_preflight_tests::* (all exercised and fixed to pass).
  • Ran cargo check --workspace (no errors) and ./scripts/test_evidence.sh, which runs formatting, clippy, and the workspace test gate; the large workspace test run completed through the bulk of tests, with the modified suite passing in CI-style runs.
  • All targeted failing tests observed in the audit were re-run and reported passing after the fixes.


@jverdicc jverdicc merged commit 15eade2 into main Mar 1, 2026
6 of 8 checks passed

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d81584c11

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1163 to +1164
if created_topic_pool {
topic_pools.remove(&claim.topic_id);


P1: Avoid deleting a topic pool after concurrent reserve failure

Removing the pool when reserve(...) fails can corrupt budget accounting under concurrent create_claim_v2 calls for the same newly-seen topic: one request can create the pool (created_topic_pool = true), another request can reserve against it first, and then this request can fail and delete the entire pool anyway. That drops reservations/usage for the other in-flight claim and can later cause missing topic budget pool errors in execution/settlement paths that fetch the pool by topic, while also effectively resetting topic budget if the pool is recreated.
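
One possible guard against this race is sketched below, under stated assumptions. TopicPool, ensure_pool, and reserve_or_rollback are hypothetical stand-ins rather than the repository's types or the fix that was adopted, and the interleaving of two requests is written out sequentially instead of using real threads and locks. The idea is to roll back only when this request created the pool and no other reservation has landed in it since.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real topic budget pool.
#[derive(Debug)]
struct TopicPool {
    budget: u64,
    reserved: u64,
}

impl TopicPool {
    fn reserve(&mut self, amount: u64) -> Result<(), String> {
        if self.reserved.saturating_add(amount) > self.budget {
            return Err("topic budget exceeded".to_string());
        }
        self.reserved += amount;
        Ok(())
    }
}

/// Creates the pool if absent; returns true only if this call created it.
fn ensure_pool(pools: &mut HashMap<String, TopicPool>, topic_id: &str, budget: u64) -> bool {
    if pools.contains_key(topic_id) {
        return false;
    }
    pools.insert(topic_id.to_string(), TopicPool { budget, reserved: 0 });
    true
}

/// Reserves against the pool. On failure, removes the pool only if this
/// request created it AND nobody else holds a reservation in it.
fn reserve_or_rollback(
    pools: &mut HashMap<String, TopicPool>,
    topic_id: &str,
    amount: u64,
    created_topic_pool: bool,
) -> Result<(), String> {
    let pool = pools
        .get_mut(topic_id)
        .ok_or_else(|| "missing topic budget pool".to_string())?;
    match pool.reserve(amount) {
        Ok(()) => Ok(()),
        Err(e) => {
            let untouched = pool.reserved == 0;
            if created_topic_pool && untouched {
                pools.remove(topic_id);
            }
            Err(e)
        }
    }
}

fn main() {
    let mut pools = HashMap::new();

    // Request A sees the topic first and creates the pool.
    let a_created = ensure_pool(&mut pools, "topic", 10);
    // Request B arrives before A reserves and succeeds against A's pool.
    let b_created = ensure_pool(&mut pools, "topic", 10);
    assert!(a_created && !b_created);
    reserve_or_rollback(&mut pools, "topic", 6, b_created).unwrap();

    // A's reservation now fails, but the pool and B's reservation survive.
    assert!(reserve_or_rollback(&mut pools, "topic", 6, a_created).is_err());
    assert_eq!(pools["topic"].reserved, 6);

    // A sole request that fails still cleans up the pool it created.
    let created = ensure_pool(&mut pools, "other", 10);
    assert!(reserve_or_rollback(&mut pools, "other", 11, created).is_err());
    assert!(!pools.contains_key("other"));
}
```

An alternative with the same effect would be to hold a single lock across pool creation and reservation, so that no other request can reserve between the two steps; which option fits depends on how the real handler scopes its locks.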

Useful? React with 👍 / 👎.
