Skip to content

fix: add some stability improvements to crisp_e2e test#936

Merged
ryardley merged 21 commits into
devfrom
ry/fix-crisp-e2e
Oct 30, 2025
Merged

fix: add some stability improvements to crisp_e2e test#936
ryardley merged 21 commits into
devfrom
ry/fix-crisp-e2e

Conversation

@ryardley

@ryardley ryardley commented Oct 29, 2025

Copy link
Copy Markdown
Contributor

I had a few issues running crisp on my local system this also helped fix CI which was also having issues with running in headless mode. This PR helps with fixing these issues.

  1. Displaying a huge key starved my system of resources as it was doing proof processing so I used derivative to not print the proof and keys to the output (probably a "my system" issue but still helpful in general as we are trying to use this to avoid printing large keys in logging)
  2. Hot module reloading was causing the test to stall as there were multiple http://localhost:3000 requests which occasionally caused an error that failed the test so I disabled it in the crisp test by creating a client pnpm dev-static script that does not do hot module reloading.
  3. Made a definitive reset and test script for a clean checkout to aid local testing:
    ./scripts/run-crisp-test.sh --ui and ./scripts/run-crisp-test.sh
  4. Fixed an issue with xvf-run where I finally worked out that not providing a screen size was causing the test to stall both locally on CI.

Summary by CodeRabbit

  • New Features

    • Added dev-static script for static development mode without hot module reloading.
    • Added run-crisp-test script for executing comprehensive cleanup and E2E testing workflow.
  • Improvements

    • Enhanced development script startup logging and synchronization with health checks.
    • Improved E2E test configuration with better display settings and reliability measures.
  • Chores

    • Added workspace dependency for development tooling.

@vercel

vercel Bot commented Oct 29, 2025

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
crisp Ready Ready Preview Comment Oct 30, 2025 0:39am
enclave-docs Ready Ready Preview Comment Oct 30, 2025 0:39am

@coderabbitai

coderabbitai Bot commented Oct 29, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Added the derivative crate to workspace deps and used it to suppress ciphertext/proof in Debug output. Adjusted CRISP dev/test tooling: added dev-static, conditional HMR, readiness waits (TCP), slowMo for Playwright, extra test timing waits, and new orchestration scripts.

Changes

Cohort / File(s) Summary
Workspace dependency updates
crates/program-server/Cargo.toml, examples/CRISP/Cargo.toml, examples/CRISP/server/Cargo.toml
Added derivative as a workspace dependency (version pinned in examples) to enable Derivative derives.
WebhookPayload Debug customization
crates/program-server/src/types.rs, examples/CRISP/server/src/server/models.rs
Switched structs to #[derive(Derivative, Serialize/Deserialize)] with #[derivative(Debug)] and marked ciphertext and proof as #[derivative(Debug = "ignore")].
Client dev / HMR changes
examples/CRISP/client/package.json, examples/CRISP/client/vite.config.ts, examples/CRISP/scripts/dev_client.sh
Added dev-static npm script (~NO_HOT=1), Vite now logs when HMR is disabled and toggles HMR via NO_HOT, and client startup script changed to pnpm dev-static.
Dev/test orchestration scripts
examples/CRISP/scripts/dev.sh, examples/CRISP/scripts/dev_services.sh, examples/CRISP/scripts/setup.sh, examples/CRISP/scripts/test_e2e.sh, scripts/run-crisp-test.sh
Added startup echoes, replaced fixed sleeps with wait-on tcp:13151/3000 readiness checks, reduced concurrent process names (and colors), added HEADLESS env for tests, and added a top-level test runner script.
E2E / Playwright test edits
examples/CRISP/test/crisp.spec.ts, examples/CRISP/playwright.config.ts
Inserted small timing waits (page.waitForTimeout), some minor formatting commas, and set slowMo: 100 for Chromium in Playwright config.

Sequence Diagram(s)

sequenceDiagram
  participant DevScript as dev.sh
  participant Services as dev_services.sh
  participant Server as CRISP server
  participant WaitOn as wait-on (tcp)
  participant Client as client (pnpm dev-static)
  note right of DevScript #DDEEFF: Start orchestration
  DevScript->>Services: run dev_services.sh (deploy + services)
  Services->>Server: start server
  Services->>WaitOn: wait-on tcp:13151
  WaitOn-->>Services: port ready
  Services->>Client: start client via pnpm dev-static
  Client->>Client: Vite reads NO_HOT -> HMR off (logs)
  note right of Client #EEFFEE: Dev environment running
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Files needing extra attention:
    • crates/program-server/src/types.rs and examples/CRISP/server/src/server/models.rs — ensure Derivative usage doesn't affect serde behavior or trait bounds.
    • Scripts: examples/CRISP/scripts/dev_services.sh, test_e2e.sh, and dev.sh — verify wait-on targets and concurrency names align with CI/dev environment.
    • Playwright/test timing changes: examples/CRISP/test/crisp.spec.ts — check for flakiness introduced by added waits.

Possibly related PRs

Suggested labels

bug

Suggested reviewers

  • cedoor
  • ctrlc03

Poem

🐇 I nibble at crates and tidy the trail,
Hiding proofs and ciphers from debug's pale veil.
I wait on a port, then hop — tests start to sing,
HMR muted, pipelines ready to spring.
A crisp little patch, with a rabbit-sized grin.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "fix: add some stability improvements to crisp_e2e test" is clearly related to the changeset and captures the primary objective. The changes include adding the derivative crate to suppress debug output of large values, disabling HMR for the test client, adding waits and timeouts throughout test scripts, and introducing a new test runner script—all of which are concrete measures to improve test stability. While "stability improvements" is somewhat broad terminology, it accurately describes the intent without being misleading or overly vague. The title is specific enough to identify that the focus is on the crisp_e2e test, and a teammate scanning history would understand that this PR addresses stability issues in that area.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ry/fix-crisp-e2e

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/program-server/src/types.rs (1)

11-15: Consider applying the same Debug customization to ComputeResponse.

The ComputeResponse struct contains ciphertext and proof fields similar to WebhookPayload, but it still uses the standard Debug derive. For consistency and to prevent the same resource starvation issues mentioned in the PR objectives, consider applying the same Derivative pattern here.

Apply this diff to add Derivative support:

-#[derive(Serialize, Deserialize, Debug)]
+#[derive(Derivative, Serialize, Deserialize)]
+#[derivative(Debug)]
 pub struct ComputeResponse {
+    #[derivative(Debug = "ignore")]
     pub ciphertext: Vec<u8>,
+    #[derivative(Debug = "ignore")]
     pub proof: Vec<u8>,
 }
🧹 Nitpick comments (1)
crates/program-server/src/types.rs (1)

142-154: Consider adding a test for the custom Debug behavior.

While test_webhook_payload_serialization validates serialization, there's no test verifying that ciphertext and proof are excluded from Debug output. This would help ensure the intended behavior is preserved.

Add a test like this after the existing serialization test:

#[test]
fn test_webhook_payload_debug_excludes_sensitive_fields() {
    let payload = WebhookPayload {
        e3_id: 12345,
        ciphertext: vec![0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef],
        proof: vec![0xde, 0xad, 0xbe, 0xef],
    };

    let debug_output = format!("{:?}", payload);
    
    // Should include e3_id
    assert!(debug_output.contains("12345"));
    
    // Should NOT include ciphertext or proof hex values
    assert!(!debug_output.contains("01"));
    assert!(!debug_output.contains("de"));
    assert!(!debug_output.contains("ad"));
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09c4e2d and 930537c.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • examples/CRISP/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • crates/program-server/Cargo.toml (1 hunks)
  • crates/program-server/src/types.rs (2 hunks)
  • examples/CRISP/Cargo.toml (1 hunks)
  • examples/CRISP/client/package.json (1 hunks)
  • examples/CRISP/client/vite.config.ts (2 hunks)
  • examples/CRISP/scripts/dev_client.sh (1 hunks)
  • examples/CRISP/server/Cargo.toml (1 hunks)
  • examples/CRISP/server/src/server/models.rs (1 hunks)
  • scripts/run-crisp-test.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-25T10:28:56.174Z
Learnt from: ctrlc03
PR: gnosisguild/enclave#657
File: Cargo.toml:32-34
Timestamp: 2025-08-25T10:28:56.174Z
Learning: The examples/CRISP directory has its own Cargo.toml workspace configuration with members like "server", "wasm-crypto", "program/core", "program/client", etc. The root workspace intentionally excludes "examples/CRISP/server", "examples/CRISP/program", and "examples/CRISP/wasm-crypto" to prevent double workspace membership, which is the correct approach for self-contained example workspaces.

Applied to files:

  • examples/CRISP/server/Cargo.toml
  • crates/program-server/Cargo.toml
  • examples/CRISP/Cargo.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: build_sdk
  • GitHub Check: integration_prebuild
  • GitHub Check: test_net
  • GitHub Check: build_enclave_cli
  • GitHub Check: build_e3_support_risc0
  • GitHub Check: rust_unit
  • GitHub Check: rust_integration
  • GitHub Check: test_contracts
  • GitHub Check: Build & Push Image
  • GitHub Check: Build & Push Image
🔇 Additional comments (9)
examples/CRISP/client/package.json (1)

14-14: LGTM! New static development script added.

The dev-static script correctly sets NO_HOT=1 to disable hot module reloading, which aligns with the PR objective of reducing flakiness in the e2e test environment.

examples/CRISP/client/vite.config.ts (1)

16-16: LGTM! Appropriate logging for NO_HOT mode.

The log message correctly indicates when HMR is disabled, which helps with debugging the development environment configuration.

examples/CRISP/scripts/dev_client.sh (1)

7-7: LGTM! Correctly uses the new dev-static script.

This change aligns with the new dev-static script added to package.json, which disables HMR for improved test stability.

examples/CRISP/server/Cargo.toml (1)

56-56: LGTM! Correctly adds derivative as workspace dependency.

This aligns with the workspace configuration in examples/CRISP/Cargo.toml which defines derivative = "=2.2.0" in the workspace dependencies.

examples/CRISP/Cargo.toml (1)

31-31: LGTM! Workspace dependency added correctly.

The derivative crate is properly added to workspace dependencies with a pinned version =2.2.0, enabling workspace members to use it via derivative.workspace = true.

examples/CRISP/server/src/server/models.rs (2)

8-8: LGTM! Derivative import added for custom Debug implementation.

This import enables the custom Debug behavior that excludes sensitive fields from debug output.


11-21: Excellent! Custom Debug implementation prevents printing large data.

The use of Derivative with #[derivative(Debug = "ignore")] on ciphertext and proof fields effectively prevents these large byte arrays from being printed to output, which aligns perfectly with the PR objective of avoiding resource starvation during proof processing.

This maintains serialization behavior while customizing debug output.

crates/program-server/Cargo.toml (1)

18-18: No action required—derivative is properly defined in workspace.

The verification confirms that derivative = "=2.2.0" is defined in the root Cargo.toml at line 118 within [workspace.dependencies]. The reference derivative.workspace = true in crates/program-server/Cargo.toml is correct and will resolve properly.

crates/program-server/src/types.rs (1)

27-36: LGTM! Debug customization prevents resource starvation.

The Derivative implementation correctly excludes ciphertext and proof fields from Debug output while preserving serialization behavior and keeping e3_id visible. This aligns well with the PR objectives to avoid resource starvation during proof processing.

Comment thread examples/CRISP/client/vite.config.ts Outdated
Comment thread scripts/run-crisp-test.sh Outdated
@vercel vercel Bot temporarily deployed to Preview – enclave-docs October 29, 2025 19:02 Inactive
@vercel vercel Bot temporarily deployed to Preview – crisp October 29, 2025 19:02 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
examples/CRISP/test/crisp.spec.ts (2)

51-51: Approve wait, but consider condition-based alternatives.

The 1-second wait after runCliInit improves test stability by allowing CLI effects to propagate. However, Playwright's condition-based waits (e.g., waitForLoadState, waitForSelector) are more robust as they complete immediately when the condition is met rather than always waiting the full duration.

Consider refactoring to wait for a specific condition if possible:

 await runCliInit();
-await page.waitForTimeout(1000);
 await page.goto("/");
+await page.waitForLoadState('networkidle');

56-56: Approve wait, but consider waiting for MetaMask dialog.

The 1-second wait after clicking MetaMask improves test stability by allowing the MetaMask extension to respond. However, waiting for a specific condition would be more reliable.

Consider waiting for the MetaMask dialog to be ready:

 await page.locator('button:has-text("MetaMask")').click();
-await page.waitForTimeout(1000);
+await metamaskPage.waitForLoadState('domcontentloaded');
 await metamask.connectToDapp();
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 930537c and 458a405.

📒 Files selected for processing (1)
  • examples/CRISP/test/crisp.spec.ts (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ctrlc03
PR: gnosisguild/enclave#657
File: Cargo.toml:32-34
Timestamp: 2025-08-25T10:28:56.174Z
Learning: The examples/CRISP directory has its own Cargo.toml workspace configuration with members like "server", "wasm-crypto", "program/core", "program/client", etc. The root workspace intentionally excludes "examples/CRISP/server", "examples/CRISP/program", and "examples/CRISP/wasm-crypto" to prevent double workspace membership, which is the correct approach for self-contained example workspaces.
Learnt from: ryardley
PR: gnosisguild/enclave#107
File: packages/ciphernode/sortition/src/distance.rs:1-1
Timestamp: 2024-10-08T19:45:18.209Z
Learning: In `packages/ciphernode/core/src/events.rs`, the import statements use the correct and updated `alloy::primitives` module.
Learnt from: ryardley
PR: gnosisguild/enclave#107
File: packages/ciphernode/sortition/src/distance.rs:1-1
Timestamp: 2024-09-26T03:11:29.311Z
Learning: In `packages/ciphernode/core/src/events.rs`, the import statements use the correct and updated `alloy::primitives` module.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: build_sdk
  • GitHub Check: integration_prebuild
  • GitHub Check: test_contracts
  • GitHub Check: build_enclave_cli
  • GitHub Check: test_net
  • GitHub Check: rust_unit
  • GitHub Check: rust_integration
  • GitHub Check: Build & Push Image
🔇 Additional comments (2)
examples/CRISP/test/crisp.spec.ts (2)

18-18: LGTM! Trailing commas improve consistency.

These trailing comma additions follow JavaScript/TypeScript best practices and will make future diffs cleaner.

Also applies to: 33-33, 47-47


67-71: LGTM! Improved test selectors.

The updated selectors use data-test-id attributes, which is a best practice for E2E tests as they're more stable and less coupled to styling changes than CSS class selectors.

@vercel vercel Bot temporarily deployed to Preview – enclave-docs October 29, 2025 21:37 Inactive
@vercel vercel Bot temporarily deployed to Preview – crisp October 29, 2025 21:37 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
examples/CRISP/scripts/dev_server.sh (1)

7-7: Consider whether building the CLI here is appropriate.

The script now builds both the server and cli binaries before running the server. While building the server before running makes sense, building the CLI in the dev_server.sh script seems outside its scope since this script only runs the server.

If the CLI needs to be available for tests, consider:

  • Building it in a separate script or test setup step
  • Documenting why the CLI build is included here
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 458a405 and 8bd4dc1.

📒 Files selected for processing (4)
  • examples/CRISP/playwright.config.ts (1 hunks)
  • examples/CRISP/scripts/dev_server.sh (1 hunks)
  • examples/CRISP/scripts/dev_services.sh (1 hunks)
  • examples/CRISP/test/crisp.spec.ts (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: ctrlc03
PR: gnosisguild/enclave#657
File: Cargo.toml:32-34
Timestamp: 2025-08-25T10:28:56.174Z
Learning: The examples/CRISP directory has its own Cargo.toml workspace configuration with members like "server", "wasm-crypto", "program/core", "program/client", etc. The root workspace intentionally excludes "examples/CRISP/server", "examples/CRISP/program", and "examples/CRISP/wasm-crypto" to prevent double workspace membership, which is the correct approach for self-contained example workspaces.
📚 Learning: 2025-08-25T10:28:56.174Z
Learnt from: ctrlc03
PR: gnosisguild/enclave#657
File: Cargo.toml:32-34
Timestamp: 2025-08-25T10:28:56.174Z
Learning: The examples/CRISP directory has its own Cargo.toml workspace configuration with members like "server", "wasm-crypto", "program/core", "program/client", etc. The root workspace intentionally excludes "examples/CRISP/server", "examples/CRISP/program", and "examples/CRISP/wasm-crypto" to prevent double workspace membership, which is the correct approach for self-contained example workspaces.

Applied to files:

  • examples/CRISP/scripts/dev_server.sh
📚 Learning: 2024-10-08T19:45:18.209Z
Learnt from: ryardley
PR: gnosisguild/enclave#107
File: tests/basic_integration/test.sh:81-83
Timestamp: 2024-10-08T19:45:18.209Z
Learning: In `tests/basic_integration/test.sh`, it's acceptable to wait indefinitely for the EVM node to start without a timeout, as it's unlikely to fail here.

Applied to files:

  • examples/CRISP/scripts/dev_services.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: build_enclave_cli
  • GitHub Check: build_e3_support_dev
  • GitHub Check: build_sdk
  • GitHub Check: test_net
  • GitHub Check: integration_prebuild
  • GitHub Check: test_contracts
  • GitHub Check: rust_integration
  • GitHub Check: rust_unit
  • GitHub Check: Build & Push Image
  • GitHub Check: Build & Push Image
🔇 Additional comments (6)
examples/CRISP/scripts/dev_services.sh (1)

8-8: Verify port 13151 is correct and wait-on behavior is as expected.

Using wait-on tcp:13151 is a solid improvement over hardcoded delays for startup synchronization. However, there are a couple of details to verify:

  1. Port correctness: Confirm that port 13151 is indeed the port on which dev_server.sh listens (it should be listening before or immediately upon startup).
  2. wait-on defaults: Verify that wait-on has reasonable timeout defaults. Per the learnings, indefinite waits can be acceptable for local dev setup, but ensure this doesn't create confusing hangs during troubleshooting.
  3. Port already bound: If another process has bound 13151, wait-on will pass immediately, but dev_server.sh may still fail. Consider whether the script should fail fast in this scenario or if the concurrently -k flag suffices.

To verify these assumptions, you may check:

  • The dev_server.sh script to confirm it listens on 13151
  • The wait-on CLI defaults by running wait-on --help or checking its documentation
  • Whether there are any integration tests that validate this startup sequence

The change aligns well with the PR's stability objectives and is an improvement over the previous approach.

examples/CRISP/playwright.config.ts (2)

28-30: Good addition for test stability.

Adding slowMo: 100 is a reasonable approach to reduce test flakiness by introducing a 100ms delay between browser actions, giving the application more time to settle between interactions.


11-11: Verify the timeout calculation.

The timeout is set to 5 * 60 * 10000, which equals 3,000,000 ms (50 minutes). This seems extremely long. Was this intended to be 5 * 60 * 1000 (5 minutes) instead?

While the test does include long-running operations (220 second wait on line 64 in crisp.spec.ts for proof processing), a 50-minute timeout still seems excessive.

examples/CRISP/test/crisp.spec.ts (3)

14-25: Good error handling addition.

Wrapping the execSync call in a try-catch block improves error handling and debugging by logging command errors before re-throwing them. This helps diagnose CLI initialization failures during test runs.


51-51: Hard-coded waits added for stability.

These 1000ms waits help stabilize the test by allowing time for CLI initialization and MetaMask interaction to settle. While hard-coded waits are generally less ideal than waiting for specific conditions, they can be necessary for handling timing issues in complex e2e scenarios.

Also applies to: 56-56


57-57: Excellent use of condition-based waiting.

Using waitForLoadState("networkidle") is a best practice for e2e tests, as it waits for network activity to settle rather than using arbitrary timeouts. This ensures the page is fully loaded before proceeding with MetaMask connection.

@vercel vercel Bot temporarily deployed to Preview – crisp October 29, 2025 22:06 Inactive
@vercel vercel Bot temporarily deployed to Preview – enclave-docs October 29, 2025 22:06 Inactive
@ryardley ryardley marked this pull request as draft October 29, 2025 23:55
@vercel vercel Bot temporarily deployed to Preview – enclave-docs October 30, 2025 00:30 Inactive
@vercel vercel Bot temporarily deployed to Preview – crisp October 30, 2025 00:30 Inactive
@ryardley

Copy link
Copy Markdown
Contributor Author

Ok this is good to go

@ryardley ryardley enabled auto-merge (squash) October 30, 2025 00:57

@hmzakhalid hmzakhalid left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK
Thanks a lot for this. I had a quick review of the changes. Seems good to merge.

@ryardley ryardley merged commit 8e14a29 into dev Oct 30, 2025
30 checks passed
@github-actions github-actions Bot deleted the ry/fix-crisp-e2e branch November 6, 2025 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants