Force-pushed from 077a822 to a59c35f
❌ 1 test failed.
❌ Test Failure. Analysis: All failures are TestPeerFlowE2ETestSuiteMSSQL_CH tests deterministically failing with "connection refused" on localhost:1433 in the maria and mysql-gtid CI matrix jobs, which don't provision SQL Server — this is a real CI configuration bug, not a flaky test.
🔄 Flaky Test Detected. Analysis: All MSSQL e2e tests failed due to "connection refused" on localhost:1433 — the SQL Server service did not start (or crashed) in this CI runner, causing every TestPeerFlowE2ETestSuiteMSSQL_CH test to fail during setup/teardown. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All MSSQL_CH e2e tests failed because the MSSQL service (port 1433) was not reachable in the CI runner — "connection refused" on setup/teardown, indicating an infrastructure failure rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All failing tests are … ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All MSSQL_CH tests failed because the MSSQL service was not available at localhost:1433 (connection refused), indicating a CI infrastructure issue where the SQL Server container failed to start rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: e2e TestApiPg/TestResyncFailed timed out after 138.9s waiting for async flow error messages to appear in peerdb_stats.flow_errors, indicating a race condition in the WaitFor polling loop rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All MSSQL_CH test failures stem from "connection refused" on localhost:1433, indicating the MSSQL service failed to start in the CI environment — a classic infrastructure flake unrelated to code changes. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite failed because … ✅ Automatically retrying the workflow
❌ Test Failure. Analysis: All three matrix jobs fail deterministically because TestPeerFlowE2ETestSuiteMSSQL_CH attempts to connect to MSSQL at localhost:1433, but MSSQL is never started as a service in these CI job variants (maria, mysql-gtid, mysql-pos).
🔄 Flaky Test Detected. Analysis: All TestPeerFlowE2ETestSuiteMSSQL_CH tests failed because the MSSQL service never became available on localhost:1433 in the CI runner (connection refused), a classic infrastructure flakiness pattern unrelated to code changes. ✅ Automatically retrying the workflow
🔄 Possible Flaky Test. Analysis: All 3 matrix jobs failed simultaneously in the long-running e2e test suite (~681s) without specific test failure details visible in logs, and the merged commit (iostream error handling for object storage) is unrelated to the MySQL/MariaDB e2e paths that failed.
#4052 would help include SQL Server as a generic source
🔄 Flaky Test Detected. Analysis: All MSSQL_CH e2e tests failed because the MSSQL service was not reachable at localhost:1433 (connection refused), indicating the SQL Server container failed to start in this CI run rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Test_Types_CH failed due to a sub-microsecond time precision mismatch (4h41m27.466801s vs 4h41m27.4668s), a timing-dependent comparison between PostgreSQL and ClickHouse that varies per run. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All TestPeerFlowE2ETestSuiteMSSQL_CH tests failed due to "connection refused" on localhost:1433 — the MSSQL service failed to start or became unreachable in the CI environment, not a code regression. ✅ Automatically retrying the workflow
Force-pushed from 0fe9620 to b9de044
🔄 Flaky Test Detected. Analysis: The SQL Server Docker container failed to start within the 60-second timeout during CI environment setup — a transient infrastructure issue unrelated to any code change. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The SQL Server Docker container failed to become ready within the 60-second health check timeout (30 attempts × 2s), a transient CI infrastructure issue unrelated to any code change. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: SQL Server failed to start in CI because the Linux kernel's async I/O context limit (fs.aio-max-nr) was exhausted on the runner, causing an infrastructure-level crash unrelated to any code change. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The SQL Server Docker container crashed during startup (exited and was removed by --rm before the 30-retry health check completed), which is a transient infrastructure failure unrelated to any code change. ✅ Automatically retrying the workflow
❌ Test Failure. Analysis: The MSSQL→ClickHouse e2e tests consistently show 0 records reaching the destination despite 2 records in the source across many polling retries, indicating a real replication bug in the new SQL Server connector rather than a transient timing issue.
🔄 Flaky Test Detected. Analysis: The e2e test suite failed only in the mysql-pos matrix variant (running for 802s near the 900s timeout) while identical tests passed in two other MySQL matrix jobs, indicating a timing-sensitive or environment-specific flake rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test … ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite failed only on the mysql-pos/PG17/CH7.0 matrix job (787s runtime near the 900s timeout) while identical tests passed on two other matrix configurations, indicating a timing-sensitive flaky failure rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite timed out at ~892s (900s limit) with multiple tests stuck in "SetupCDCFlowStatusQuery stuck in snapshot somehow", indicating a flaky CDC snapshot hang rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Three MSSQL→ClickHouse e2e tests failed due to an SQL Server deadlock in the CI container causing CDC data delivery issues, with the failing commit being completely unrelated to MSSQL type handling. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: MSSQL CDC setup hit a transient deadlock (SQL Server error 1205) in … ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Two e2e test failures due to timing issues: TestApiPg/TestQRep hit a ClickHouse "unknown table" race condition (table not yet created when queried), and the MSSQL_CH tests timed out polling for CDC record propagation. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: All failing tests show WaitFor polling loops timing out waiting for CDC/snapshot data to sync (never receiving expected records), a classic flaky pattern in distributed integration tests under CI resource pressure, unrelated to the last commit which touched only object storage error handling. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Multiple e2e tests timed out waiting for CDC snapshot completion ("SetupCDCFlowStatusQuery stuck in snapshot somehow"), a known transient condition under high test parallelism (-p 32), not caused by any code change. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite ran for 837s against a 900s timeout with no specific assertion failures logged, indicating the mysql-pos e2e tests timed out due to slow execution rather than a deterministic code bug. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Multiple unrelated e2e test suites (BigQuery, MSSQL→ClickHouse at ~195s timeout, PG→ClickHouse) failed simultaneously across both matrix configurations, pointing to external service connectivity issues rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: TestMySQLSSHKeepaliveWithToxiproxy failed due to a race between SSH tunnel teardown and long-running query termination — a timing-sensitive Toxiproxy network simulation test where the keepalive detection succeeded but the query cancellation didn't propagate in time. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Test_Types_CH fails with a 1-microsecond QValueTime precision mismatch (9h31m57.385762s vs 9h31m57.385761s) caused by a runtime-generated timestamp whose specific microsecond value triggers a rounding discrepancy in the serialization pipeline — not reproducible on every run. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite timed out after exactly 900s (the configured limit). ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite hit the 900s timeout limit exactly (900.612s), indicating the tests were killed by a timeout rather than a specific assertion failure, which is a classic flaky/infrastructure issue. ✅ Automatically retrying the workflow
❌ Test Failure. Analysis: Deterministic type mapping bug in Test_Types_MsSQL: entry 20 is consistently mapped as QValueTimestamp(1900-01-01) instead of the expected QValueDate(1753-01-01), indicating a real MSSQL date type handling regression.
🔄 Flaky Test Detected. Analysis: All three MSSQL_CH tests timed out in WaitFor polling loops (194–211s) waiting for CDC records to land in ClickHouse, consistent with resource contention in a 32-parallel-test CI run rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Multiple e2e tests timed out waiting for async CDC normalization/sync operations across BQ and MSSQL_CH suites, with ~200s WaitFor loops expiring before data arrived — classic distributed system flakiness rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: The e2e test suite timed out after exactly 900 seconds (the configured limit). ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: Only the mysql-pos (MySQL 7.0) matrix job failed while identical mysql-gtid and MariaDB jobs passed, and the e2e suite ran for 808s out of a 900s timeout, indicating environment-level timing/resource flakiness rather than a code regression. ✅ Automatically retrying the workflow
🔄 Flaky Test Detected. Analysis: TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table hit an "UNEXPECTED TIMEOUT" waiting for BigQuery normalization to complete within its 2-minute window, a classic transient failure in E2E tests dependent on external service latency. ✅ Automatically retrying the workflow
@serprex We're not going to be able to pick this up at this time. Internal discussions are underway about how best to approach this.
Doesn't handle ADD COLUMN