-
Notifications
You must be signed in to change notification settings - Fork 680
Translate smoketests from Python to Rust #4102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Use OnceLock to build spacetimedb-cli and spacetimedb-standalone once per test process, then run the pre-built binary directly instead of using `cargo run`. This avoids repeated cargo overhead and ensures consistent binary reuse across parallel tests.
Create `crates/smoketests/` to translate Python smoketests to Rust: - Add `Smoketest` struct with builder pattern for test setup - Implement CLI helpers: `spacetime_cmd()`, `call()`, `sql()`, `logs()`, etc. - Translate `smoketests/tests/sql.py` → `tests/sql.rs` - Translate `smoketests/tests/call.py` → `tests/call.rs` - Reuse `ensure_binaries_built()` from guard crate (now public) Also fix Windows process cleanup in `SpacetimeDbGuard`: - Use `taskkill /F /T /PID` to kill entire process tree - Prevents orphaned `spacetimedb-standalone.exe` processes
- Translate 4 more Python smoketests to Rust: auto_inc, describe, module_nested_op, panic - Simplify the call API by removing the generic call<T: Serialize> method and renaming call_raw to call, since CLI args are strings - Remove unused serde dependency
Translate additional Python smoketests to Rust: - dml.rs: DML subscription tests - filtering.rs: Unique/non-unique index filtering tests - namespaces.rs: C# code generation namespace tests - add_remove_index.rs: Index add/remove with subscription tests - schedule_reducer.rs: Scheduled reducer tests Infrastructure improvements: - Add subscribe_background() and SubscriptionHandle for proper background subscription semantics matching Python tests - Add spacetime_local() for commands that don't need --server flag - Add timing instrumentation for debugging test performance
- Separate build and publish timing in lib.rs to identify bottlenecks - Use --bin-path to skip redundant rebuild during publish - Add DEVELOP.md explaining cargo-nextest for faster test runs Timing breakdown per test: - WASM build: ~12s (75%) - Server publish: ~2s (12%) - Server spawn: ~2s (12%) cargo-nextest runs all test binaries in parallel, reducing total runtime from ~265s to ~160s (40% faster).
Translate from Python smoketests: - detect_wasm_bindgen.rs: Tests build rejects wasm_bindgen and getrandom (2 tests) - default_module_clippy.rs: Tests default module passes clippy - delete_database.rs: Tests deleting database stops scheduled reducers - fail_initial_publish.rs: Tests failed publish doesn't corrupt control DB - modules.rs: Tests module update lifecycle and breaking changes (2 tests) Also adds spacetime_build() method to Smoketest for testing build failures. Total: 16 test files translated, 32 tests
Clear CARGO* environment variables (except CARGO_HOME) when spawning child cargo build processes. When running under `cargo test`, cargo sets env vars like CARGO_ENCODED_RUSTFLAGS that differ from a normal build, causing child cargo processes to think they need to recompile. This reduces single-test runtime from ~45s to ~18s by avoiding redundant rebuilds of spacetimedb-standalone and spacetimedb-cli.
Add test translations for: - connect_disconnect_from_cli.rs - client connection callbacks - domains.rs - database rename functionality - client_connection_errors.rs - client_connected error handling - confirmed_reads.rs - --confirmed flag for subscriptions/SQL - create_project.rs - spacetime init command Also fix subscription race condition by waiting for initial update before returning from subscribe_background_*, matching Python behavior.
Translate tests for: - views.rs: st_view_* system tables, namespace collisions, SQL views - auto_migration.rs: schema changes, add table migration
bda7652 to
bea9069
Compare
Add new_identity() method to support multi-identity tests. Translate tests for: - rls.rs: Row-level security filter tests - energy.rs: Energy balance endpoint test - permissions.rs: Private tables, lifecycle reducers, delete protection
Translate tests for: - new_user_flow.rs: Basic publish/call/SQL workflow - servers.rs: Server add/list/edit commands
jdetter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like these changes so far, I'm excited to talk about this more later today 👍
| fn test_add_then_remove_index() { | ||
| let mut test = Smoketest::builder().module_code(MODULE_CODE).autopublish(false).build(); | ||
|
|
||
| let name = format!("test-db-{}", std::process::id()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would have been an issue if we were to parallelize the smoketests but now we're running 1 spacetime server per test right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually kind of weird now that we're appending the process ID to the name - why not just use a static name like "test-db"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same questions. I do feel like the old random-string method was the most trivially "this should be fine" approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdetter one reason to not use a static name is running the tests against a remote server. Now you can run the tests multiple times without conflicting database names.
- Add --config-path to spacetime_local() for test isolation - Fix new_identity() to not pass server arg to logout (matches Python) - Insert --server flag before -- separator in spacetime_cmd() - Update servers.rs to use spacetime_local() for local-only commands - Simplify test files by removing redundant publish_module() calls All 56 smoketests now pass.
Translate smoketests/tests/quickstart.py to Rust. This test validates that the quickstart documentation is correct by extracting code from markdown docs and running it. - Add parse_quickstart() to parse code blocks from markdown with CRLF handling - Add have_pnpm() to check for pnpm availability - Implement QuickstartTest with support for Rust, C#, and TypeScript servers - Rust test passes; C#/TypeScript skip gracefully if dependencies unavailable
| inserts.as_array().unwrap().iter().any(|r| r["name"] == "Cindy"), | ||
| "Expected Cindy in second update: {:?}", | ||
| second | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically, the original tests also check things like the id of the entity, and that deletes is empty. unsure if there's an easy way to do a full structural comparison.
| #[test] | ||
| fn test_spacetimedb_ns_csharp() { | ||
| let test = Smoketest::builder().module_code(MODULE_CODE).autopublish(false).build(); | ||
| let _test = Smoketest::builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity, why _test?
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
| pub fn person(ctx: &ViewContext) -> Option<Person> { | ||
| None | ||
| } | ||
| "#; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a precompiled module for this
| pub fn person(ctx: &ViewContext) -> Option<ABC> { | ||
| None | ||
| } | ||
| "#; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a precompiled module for this
| ----+------- | ||
| 2 | 2"#, | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're missing several tests from the python smoketests/tests/views.py
| output.contains("testnet.spacetimedb.com"), | ||
| "Expected host in output: {}", | ||
| output | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the python version checks the host and protocol
| testnet_re.is_match(&servers), | ||
| "Expected testnet in server list: {}", | ||
| servers | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the python version also tests for localhost
| output.contains("fingerprint") || output.contains("Fingerprint"), | ||
| "Expected fingerprint message: {}", | ||
| output | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the python version tests fingerprinting of ip address and the presence/absence of -y as well. that could maybe be a separate test.
…ler/translate-smoketests' into tyler/translate-smoketests
Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
| invoked_count, | ||
| logs | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing the procedure tests from The Original Python
| invoked_count, 1, | ||
| "Expected scheduled reducer to run exactly once, but it ran {} times. Logs: {:?}", | ||
| invoked_count, logs | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Original Python also has:
row_entry = {
"prev": TIMESTAMP_ZERO,
"scheduled_id": 2,
"sched_at": {"Time": TIMESTAMP_ZERO},
}
# subscription should have 2 updates, first for row insert in scheduled table and second for row deletion.
self.assertEqual(
sub(),
[
{"scheduled_table": {"deletes": [], "inserts": [row_entry]}},
{"scheduled_table": {"deletes": [row_entry], "inserts": []}},
],
)| invoked_count, | ||
| logs | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Original Python also has this at the end:
# scheduling repeated reducer again just to get 2nd subscription update.
self.call("schedule_reducer")
repeated_row_entry = {
"prev": TIMESTAMP_ZERO,
"scheduled_id": 1,
"sched_at": {"Interval": {"__time_duration_micros__": 100000}},
}
row_entry = {
"prev": TIMESTAMP_ZERO,
"scheduled_id": 2,
"sched_at": {"Time": TIMESTAMP_ZERO},
}
# subscription should have 2 updates and should not have any deletes
self.assertEqual(
sub(),
[
{"scheduled_table": {"deletes": [], "inserts": [repeated_row_entry]}},
{"scheduled_table": {"deletes": [], "inserts": [row_entry]}},
],
)| result.contains("yay!") && result.contains("hello"), | ||
| "Expected both 'yay!' and 'hello' in table, got: {}", | ||
| result | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the original code did a background subscription and then checked it after the calls. I'm not sure whether that was important to what was being tested (i.e. if there was a specific worry about volatile_nonatomic_schedule_immediate wasn't updating subscriptions properly)
| "Expected book ISBN in AFTER logs: {:?}", | ||
| logs | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we're missing the AddTableColumns class equivalents from The Original Python
| "Expected Robert in logs: {:?}", | ||
| logs | ||
| ); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't print the log messages that The Original Python prints, unsure if intentional
| test.publish_module_clear(false).unwrap(); | ||
|
|
||
| // Add new data with updated schema | ||
| test.call("add_person", &["Husserl", "Student"]).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Original Python does this after this line:
# If subscription, we should get 4 rows corresponding to 4 reducer calls (including before and after update)
sub = sub();
self.assertEqual(len(sub), 4)| #[test] | ||
| fn test_failures() { | ||
| if !have_psql() { | ||
| eprintln!("Skipping test_failures: psql not available"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this fail rather than quietly-ish skipping?
| ); | ||
|
|
||
| // Connection fails with invalid token - we can't easily test this without | ||
| // modifying the token, so skip this part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this case seems worth testing
| (1 row)"#, | ||
| ); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing a test_sql_conn equivalent to The Original Python
| #[test] | ||
| fn test_build_csharp_module() { | ||
| if !have_dotnet() { | ||
| eprintln!("Skipping test_build_csharp_module: dotnet 8.0+ not available"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if this failed instead of skipping, forcing the user to pass a --skip-dotnet or similar like The Original Python did. That way, we can tell the difference between the "tests not failing because not run" and "tests not failing because behavior is correct" cases.
| // CLI is pre-built by artifact dependencies during compilation | ||
| let cli_path = ensure_binaries_built(); | ||
|
|
||
| // Install wasi-experimental workload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python did this before this step:
run_cmd("dotnet", "nuget", "locals", "all", "--clear", cwd=bindings, capture_stderr=True)
unsure if the change is intentional
| .unwrap() | ||
| .strip_suffix(".wasm") | ||
| .unwrap() | ||
| .replace('_', "-"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: perhaps we filter out lines 87-98 into a function, and then call that function in the test?
| /// | ||
| /// This checks if the modules workspace target directory exists and contains | ||
| /// at least one WASM file. | ||
| pub fn precompiled_modules_available() -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this used?
|
|
||
| - name: Fail if Python smoketests were modified | ||
| run: | | ||
| PYTHON_SMOKETEST_CHANGES=$(git diff --name-only origin/${{ github.base_ref }} HEAD -- 'smoketests/**.py') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heads up, I realized that this will start failing in this PR if anyone merges changes into master that touch the smoketests, even if you don't merge master into this PR. I don't know that that's a bad thing, just wanted to mention it before it's confusing.
if we wanted only-changed-in-this-PR logic, it would have to be:
MERGE_BASE=$(git merge-base origin/${{ github.base_ref }} HEAD)
PYTHON_SMOKETEST_CHANGES="$(git diff --name-only $MERGE_BASE HEAD -- 'smoketests/**.py')"also we probably want to make this check required just before merging - I suggest leaving this comment unresolved until then.
| for (key, _) in env::vars() { | ||
| let should_remove = (key.starts_with("CARGO") && key != "CARGO_HOME" && key != "CARGO_TARGET_DIR") | ||
| || key.starts_with("RUST") | ||
| || key == "__CARGO_FIX_YOLO"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
| // Set remote server environment variable if specified | ||
| if let Some(ref server_url) = server { | ||
| cmd.env("SPACETIME_REMOTE_SERVER", server_url); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this could be moved out of the if entirely
Description of Changes
This PR translates all of our Python smoketests into Rust tests which can be run from
cargo runMotivation
The purpose of this fivefold:
cargo testinterfaceSpacetimeDbGuardandcargo test/cargo nextestwe can easily parallelize the smoke testsbsatnetc.)IMPORTANT NOTE!
There are several ways to implement the smoke tests in Rust (none are great):
guardcratebuild.rsscript inguardwhich first builds the executables as a compile step for compiling guardspacetimedb-cliandspacetimedb-standaloneas an artifact dependency of the guard crateNone of these are good.
xtaskis not bad, but doesn't enable us to run other integration tests in other crates (e.g. the CLI)(3) is the correct solution and has the best user experience, but it requires nightly and I don't want to introduce that for all of our tests.
I have chosen to do a combination of (1) and (4). You will now run the smoketests with
cargo smoketest. If you runcargo test --all(or useguard) without doingcargo smoketestit will fall back to (4) which compiles the executables at runtime. Runningcargo buildis the only way to ensure that the executables are not stale because of the internal fingerprint checking. Everything else is fragile not robust.NOTE! There is no way to avoid cargo within cargo and have the smoke tests be run as cargo tests because the modules under test must be compiled with cargo.
API and ABI breaking changes
Note that this is a BREAKING CHANGE to
cargo test --all. The smoketests are now part ofcargo test --allunless you specifically exclude them.Expected complexity level and risk
3, this is partially AI translated. We need to carefully review to ensure the semantics have not regressed.
Testing