
SQL Server connector #4161

Draft
serprex wants to merge 9 commits into main from mssql

Conversation


serprex (Member) commented Apr 9, 2026

Doesn't handle ADD COLUMN

serprex force-pushed the mssql branch 2 times, most recently from 077a822 to a59c35f on April 9, 2026 03:15
codecov bot commented Apr 9, 2026

❌ 1 Tests Failed:

Tests completed: 2195 | Failed: 1 | Passed: 2194 | Skipped: 263
View the top 2 failed test(s) by shortest run time
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteBQ
Stack Traces | 0s run time
=== RUN   TestPeerFlowE2ETestSuiteBQ
=== PAUSE TestPeerFlowE2ETestSuiteBQ
=== CONT  TestPeerFlowE2ETestSuiteBQ
--- FAIL: TestPeerFlowE2ETestSuiteBQ (0.00s)
2026/04/09 11:52:44 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/04/09 11:52:44 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table
Stack Traces | 140s run time
=== RUN   TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table
=== PAUSE TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table
=== CONT  TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table
    bigquery_test.go:596: Executed an insert on two tables
    bigquery_test.go:598: WaitFor normalize both tables 2026-04-09 11:50:28.865432449 +0000 UTC m=+105.632236368
    bigquery_test.go:598: UNEXPECTED TIMEOUT normalize both tables 2026-04-09 11:52:30.38892454 +0000 UTC m=+227.155728449
    bigquery.go:86: begin tearing down postgres schema bq_q2xzojvd_20260409115024
--- FAIL: TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table (139.92s)


github-actions bot commented Apr 9, 2026

❌ Test Failure

Analysis: All failures are TestPeerFlowE2ETestSuiteMSSQL_CH tests deterministically failing with "connection refused" on localhost:1433 in the maria and mysql-gtid CI matrix jobs, which don't provision SQL Server — this is a real CI configuration bug, not a flaky test.
Confidence: 0.95

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All MSSQL e2e tests failed due to "connection refused" on localhost:1433 — the SQL Server service did not start (or crashed) in this CI runner, causing every TestPeerFlowE2ETestSuiteMSSQL_CH test to fail during setup/teardown.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All MSSQL_CH e2e tests failed because the MSSQL service (port 1433) was not reachable in the CI runner — "connection refused" on setup/teardown, indicating an infrastructure failure rather than a code regression.
Confidence: 0.97

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All failing tests are TestPeerFlowE2ETestSuiteMSSQL_CH subtests that couldn't connect to MSSQL on localhost:1433 (connection refused), indicating the SQL Server service failed to start on the CI runner rather than a code regression.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All MSSQL_CH tests failed because the MSSQL service was not available at localhost:1433 (connection refused), indicating a CI infrastructure issue where the SQL Server container failed to start rather than a code regression.
Confidence: 0.92

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: e2e TestApiPg/TestResyncFailed timed out after 138.9s waiting for async flow error messages to appear in peerdb_stats.flow_errors, indicating a race condition in the WaitFor polling loop rather than a code regression.
Confidence: 0.82

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All MSSQL_CH test failures stem from "connection refused" on localhost:1433, indicating the MSSQL service failed to start in the CI environment — a classic infrastructure flake unrelated to code changes.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite failed because SetupCDCFlowStatusQuery got stuck in snapshot state (clickhouse_test.go:466), a known timing/state race condition in CDC flow setup that caused the test to run nearly to the 900s timeout limit.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

❌ Test Failure

Analysis: All three matrix jobs fail deterministically because TestPeerFlowE2ETestSuiteMSSQL_CH attempts to connect to MSSQL at localhost:1433, but MSSQL is never started as a service in these CI job variants (maria, mysql-gtid, mysql-pos).
Confidence: 0.92

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

serprex requested a review from heavycrystal on April 9, 2026 04:14
github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All TestPeerFlowE2ETestSuiteMSSQL_CH tests failed because the MSSQL service never became available on localhost:1433 in the CI runner (connection refused), a classic infrastructure flakiness pattern unrelated to code changes.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

serprex requested a review from ilidemi on April 9, 2026 04:22
github-actions bot commented Apr 9, 2026

🔄 Possible Flaky Test

Analysis: All 3 matrix jobs failed simultaneously in the long-running e2e test suite (~681s) without specific test failure details visible in logs, and the merged commit (iostream error handling for object storage) is unrelated to the MySQL/MariaDB e2e paths that failed.
Confidence: 0.65

⚠️ Confidence too low (0.65) to retry automatically - manual review recommended

View workflow run

serprex commented Apr 9, 2026

#4052 would help include SQL Server as a generic source

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All MSSQL_CH e2e tests failed because the MSSQL service was not reachable at localhost:1433 (connection refused), indicating the SQL Server container failed to start in this CI run rather than a code regression.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Test_Types_CH failed due to a sub-microsecond time precision mismatch (4h41m27.466801s vs 4h41m27.4668s), a timing-dependent comparison between PostgreSQL and ClickHouse that varies per run.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All TestPeerFlowE2ETestSuiteMSSQL_CH tests failed due to "connection refused" on localhost:1433 — the MSSQL service failed to start or became unreachable in the CI environment, not a code regression.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

serprex force-pushed the mssql branch 3 times, most recently from 0fe9620 to b9de044 on April 9, 2026 05:09
github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The SQL Server Docker container failed to start within the 60-second timeout during CI environment setup — a transient infrastructure issue unrelated to any code change.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The SQL Server Docker container failed to become ready within the 60-second health check timeout (30 attempts × 2s), a transient CI infrastructure issue unrelated to any code change.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: SQL Server failed to start in CI because the Linux kernel's async I/O context limit (fs.aio-max-nr) was exhausted on the runner, causing an infrastructure-level crash unrelated to any code change.
Confidence: 0.97

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The SQL Server Docker container crashed during startup (exited and was removed by --rm before the 30-retry health check completed), which is a transient infrastructure failure unrelated to any code change.
Confidence: 0.92

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

❌ Test Failure

Analysis: The MSSQL→ClickHouse e2e tests consistently show 0 records reaching the destination despite 2 records in the source across many polling retries, indicating a real replication bug in the new SQL Server connector rather than a transient timing issue.
Confidence: 0.8

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite failed only in the mysql-pos matrix variant (running for 802s near the 900s timeout) while identical tests passed in two other MySQL matrix jobs, indicating a timing-sensitive or environment-specific flake rather than a code regression.
Confidence: 0.82

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test TestPeerFlowE2ETestSuiteMySQL_CH_Cluster timed out after 783s with "SetupCDCFlowStatusQuery stuck in snapshot somehow", indicating a non-deterministic state transition failure in the CDC snapshot phase rather than a code regression.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite failed only on the mysql-pos/PG17/CH7.0 matrix job (787s runtime near the 900s timeout) while identical tests passed on two other matrix configurations, indicating a timing-sensitive flaky failure rather than a code regression.
Confidence: 0.82

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite timed out at ~892s (900s limit) with multiple tests stuck in "SetupCDCFlowStatusQuery stuck in snapshot somehow", indicating a flaky CDC snapshot hang rather than a code regression.
Confidence: 0.9

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Three MSSQL→ClickHouse e2e tests failed due to an SQL Server deadlock in the CI container causing CDC data delivery issues, with the failing commit being completely unrelated to MSSQL type handling.
Confidence: 0.8

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: MSSQL CDC setup hit a transient deadlock (SQL Server error 1205) in sp_cdc_add_job when enabling Change Data Capture on a table concurrently with other CDC operations, causing Test_Simple_MsSQL to fail and downstream MSSQL tests to time out.
Confidence: 0.95

✅ Automatically retrying the workflow

View workflow run
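The deadlock diagnosed above (SQL Server error 1205 out of sp_cdc_add_job) is a retryable condition: the server picks one session as the victim and rolls it back. A sketch of retry-on-deadlock in Go; `mssqlError` here is a stand-in for the driver's error type (go-mssqldb exposes the server error number on its error), not PeerDB's actual handling:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// mssqlError stands in for the driver error carrying the server's
// error number; illustrative only.
type mssqlError struct{ Number int32 }

func (e mssqlError) Error() string { return fmt.Sprintf("mssql error %d", e.Number) }

const deadlockVictim = 1205 // "chosen as deadlock victim" server error

// withDeadlockRetry retries op when SQL Server picks it as a deadlock
// victim, as can happen when CDC is enabled on a table concurrently with
// the CDC jobs. Any other error (or success) returns immediately.
func withDeadlockRetry(attempts int, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		var me mssqlError
		if err == nil || !errors.As(err, &me) || me.Number != deadlockVictim {
			return err
		}
		time.Sleep(time.Duration(i+1) * 100 * time.Millisecond) // linear backoff
	}
	return err
}

func main() {
	calls := 0
	err := withDeadlockRetry(3, func() error {
		calls++
		if calls < 3 {
			return mssqlError{Number: deadlockVictim}
		}
		return nil
	})
	fmt.Println(calls, err) // prints: 3 <nil>
}
```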

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Two e2e test failures due to timing issues: TestApiPg/TestQRep hit a ClickHouse "unknown table" race condition (table not yet created when queried), and the MSSQL_CH tests timed out polling for CDC record propagation.
Confidence: 0.82

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All failing tests show WaitFor polling loops timing out waiting for CDC/snapshot data to sync (never receiving expected records), a classic flaky pattern in distributed integration tests under CI resource pressure, unrelated to the last commit which touched only object storage error handling.
Confidence: 0.78

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Multiple e2e tests timed out waiting for CDC snapshot completion ("SetupCDCFlowStatusQuery stuck in snapshot somehow"), a known transient condition under high test parallelism (-p 32), not caused by any code change.
Confidence: 0.93

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite ran for 837s against a 900s timeout with no specific assertion failures logged, indicating the mysql-pos e2e tests timed out due to slow execution rather than a deterministic code bug.
Confidence: 0.78

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Multiple unrelated e2e test suites (BigQuery, MSSQL→ClickHouse at ~195s timeout, PG→ClickHouse) failed simultaneously across both matrix configurations, pointing to external service connectivity issues rather than a code regression.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: TestMySQLSSHKeepaliveWithToxiproxy failed due to a race between SSH tunnel teardown and long-running query termination — a timing-sensitive Toxiproxy network simulation test where the keepalive detection succeeded but the query cancellation didn't propagate in time.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Test_Types_CH fails with a 1-microsecond QValueTime precision mismatch (9h31m57.385762s vs 9h31m57.385761s) caused by a runtime-generated timestamp whose specific microsecond value triggers a rounding discrepancy in the serialization pipeline — not reproducible on every run.
Confidence: 0.78

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite timed out after exactly 900s (the configured -timeout 900s limit), indicating a slow CI environment rather than a code regression.
Confidence: 0.92

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite hit the 900s timeout limit exactly (900.612s), indicating the tests were killed by a timeout rather than a specific assertion failure, which is a classic flaky/infrastructure issue.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

❌ Test Failure

Analysis: Deterministic type mapping bug in Test_Types_MsSQL: entry 20 is consistently mapped as QValueTimestamp(1900-01-01) instead of the expected QValueDate(1753-01-01), indicating a real MSSQL date type handling regression.
Confidence: 0.88

⚠️ This appears to be a real bug - manual intervention needed

View workflow run
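The regression above shows a SQL Server `date` value (1753-01-01, the type's minimum) surfacing as QValueTimestamp. Drivers commonly hand back `time.Time` for both `date` and `datetime` columns, so the column's declared SQL type has to drive the mapping. A hypothetical sketch of that dispatch; `kindFor` and its return strings echo the kind names in the log but are not PeerDB's actual API:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// kindFor picks a destination value kind from the declared SQL Server
// column type. Without this, a date-only column falls through to the
// timestamp path, which is the bug seen in Test_Types_MsSQL.
func kindFor(sqlType string) string {
	switch strings.ToLower(sqlType) {
	case "date":
		return "QValueDate" // date-only, no time component
	case "datetime", "datetime2", "smalldatetime", "datetimeoffset":
		return "QValueTimestamp"
	default:
		return "QValueString"
	}
}

func main() {
	v := time.Date(1753, 1, 1, 0, 0, 0, 0, time.UTC) // SQL Server datetime minimum
	fmt.Println(kindFor("date"), v.Format("2006-01-02")) // prints: QValueDate 1753-01-01
}
```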

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: All three MSSQL_CH tests timed out in WaitFor polling loops (194–211s) waiting for CDC records to land in ClickHouse, consistent with resource contention in a 32-parallel-test CI run rather than a code regression.
Confidence: 0.88

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Multiple e2e tests timed out waiting for async CDC normalization/sync operations across BQ and MSSQL_CH suites, with ~200s WaitFor loops expiring before data arrived — classic distributed system flakiness rather than a code regression.
Confidence: 0.85

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: The e2e test suite timed out after exactly 900 seconds (the configured -timeout 900s limit), indicating a slow/flaky infrastructure issue rather than a real code bug.
Confidence: 0.92

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: Only the mysql-pos (MySQL 7.0) matrix job failed while identical mysql-gtid and MariaDB jobs passed, and the e2e suite ran for 808s out of a 900s timeout, indicating environment-level timing/resource flakiness rather than a code regression.
Confidence: 0.75

✅ Automatically retrying the workflow

View workflow run

github-actions bot commented Apr 9, 2026

🔄 Flaky Test Detected

Analysis: TestPeerFlowE2ETestSuiteBQ/Test_Multi_Table hit an "UNEXPECTED TIMEOUT" waiting for BigQuery normalization to complete within its 2-minute window, a classic transient failure in E2E tests dependent on external service latency.
Confidence: 0.92

✅ Automatically retrying the workflow

View workflow run

Jeremyyang920 (Contributor) commented:

@serprex We're not going to be able to pick this up at this time. Internal discussions are underway about how best to approach this.
