Skip to content

Add error-comparison engine to the comparator (PR 1/7, default off)#1524

Merged
sreekanth-db merged 1 commit into
comparator-v2from
comparator-error-engine
Jul 2, 2026
Merged

Add error-comparison engine to the comparator (PR 1/7, default off)#1524
sreekanth-db merged 1 commit into
comparator-v2from
comparator-error-engine

Conversation

@sreekanth-db

@sreekanth-db sreekanth-db commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

What

PR 1 of a 7-PR series that adds error / negative-case comparison to the JDBC comparator. This PR lands the engine only — it is gated off by default, so no existing suite behaves differently.

New com.databricks.jdbc.comparator.error package (test scope):

  • CapturedOutcome (THREW / RETURNED / NULL) + Captures.capture — capture a JDBC call's outcome instead of letting a thrown error propagate and abort the case.
  • ErrorFacts — NPE-safe read of exception class, SQLState, vendor code, and message. Fields come from real accessors; no assumptions about message format.
  • ErrorComparator / ErrorComparison — verdicts MATCH / MISMATCH / ONE_SIDED / NOT_APPLICABLE over the four raw fields. One-sided diffs carry a stable Error one-sided: prefix so they always survive to the CSV summary, regardless of what the non-throwing side renders to (class, scalar, or null).
  • ErrorPolicy — the ERROR_COMPARISON_MODE gate (off | shadow | authoritative, default off). An unrecognized value fails safe to off with a warning rather than aborting the run.

Changes to existing files:

  • ResultSetComparator — exception branch delegates to ErrorComparator only when the gate is on; default off keeps today's isAssignableFrom behavior byte-identical. describeResult() is now null-safe.
  • ComparisonResult.csvSummary() — keeps the one-sided prefix and counts Error <field> mismatch entries. Additive; existing counters untouched.

Deliberate scope decisions

  • Message normalization / tolerance baselines are deferred — we compare raw fields now and will add tolerance in a later PR only if observed shadow-run data shows it is needed.
  • Exception-class comparison is intentionally strict (exact match), not the legacy isAssignableFrom subclass tolerance.

Testing

ErrorComparatorTest — 13/13 passing: field extraction, NPE-safety (null message, non-SQL throwable), all four verdicts, one-sided prefix in both orderings + throw-vs-null, csvSummary survival of throw-vs-null, and the fail-safe mode gate. Reviewed with Isaac Review (/review --uncommitted) across four rounds.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior change).

This pull request and its description were written by Isaac.

This PR is assisted by Isaac Autopilot: View the full session spec, test plan, and artifacts.

Introduces the com.databricks.jdbc.comparator.error package — the pure
capture/compare logic for treating a thrown JDBC error as a comparable
result, using only fields read from real accessors (no assumptions about
message format):

- CapturedOutcome (THREW/RETURNED/NULL) + Captures.capture helper
- ErrorFacts: NPE-safe read of exception class, SQLState, vendor code,
  and message
- ErrorComparator/ErrorComparison: MATCH/MISMATCH/ONE_SIDED/NOT_APPLICABLE
  comparing the four raw fields. One-sided diffs carry a stable
  "Error one-sided: " prefix so they always survive to the CSV summary
  regardless of what the non-throwing side renders to (class, scalar, null)
- ErrorPolicy: ERROR_COMPARISON_MODE gate (off|shadow|authoritative,
  default off); an unrecognized value fails safe to off with a warning
  rather than aborting the run

ResultSetComparator's exception branch delegates to ErrorComparator when
the gate is on; with the default off mode, behavior is unchanged.
describeResult() is null-safe. ComparisonResult.csvSummary() keeps the
one-sided prefix and counts "Error <field> mismatch" entries. No existing
suite behavior changes in this PR.

Message normalization / tolerance baselines are intentionally deferred —
we compare raw fields for now and will add tolerance later only if
observed shadow-run data shows it is needed. Exception-class comparison is
intentionally strict (exact match), not the legacy isAssignableFrom
subclass tolerance.

NO_CHANGELOG=true (test-only comparator tooling).

Co-authored-by: Isaac
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
@sreekanth-db sreekanth-db merged commit 4e0dc9c into comparator-v2 Jul 2, 2026
1 check failed
@sreekanth-db sreekanth-db deleted the comparator-error-engine branch July 2, 2026 15:09
sreekanth-db added a commit that referenced this pull request Jul 2, 2026
…(PR 2/7) (#1525)

## What

PR 2 of the error / negative-case comparison series (builds on #1524).
Two changes:

**1. Default `ERROR_COMPARISON_MODE` flipped `off` → `shadow`.** Shadow
runs deep error comparison and records DIFF rows but never fails CI
(only a re-thrown Throwable does), so the weekly run stays green while
error-shaped diffs become visible. An unset or unrecognized flag now
resolves to `shadow`.

**2. Existing suites now capture errors instead of propagating (which
aborted the case):**
- `StatementSelect` / `StatementOther` / `PreparedStatementTypes` — each
side's `executeQuery`/`execute` is captured via
`Captures.resultOrThrowable` and handed to
`ResultSetComparator.compare`, which dispatches on runtime type (both
ResultSets → cell diff; a Throwable involved → deep error comparison).
Previously a one-sided throw produced an `ERROR` row and aborted the
case.
- `StatementDml` / `VolumeOperations` — replaced ad-hoc `instanceof
Exception` checks (class + sometimes message only) with the gate-aware
`ErrorDiffs` bridge, gaining SQLState / vendor-code / message comparison
when the gate is on.

New `ErrorDiffs` helper honors the `ERROR_COMPARISON_MODE` gate so `off`
is a true kill switch for DML/Volume too: deep comparison when enabled,
minimal legacy class-only check when off — consistent with the
`ResultSetComparator` path. Added `Captures.resultOrThrowable` /
`closeIfResultSet`.

## Deliberate scope decisions

- **Exception-class comparison remains strict** (exact match). The
subclass-tolerance-vs-baseline decision is deferred to PR 7, to be
settled from real shadow output.
- **Message normalization** also deferred to PR 7.

## Behavior note

With the shadow default, error-provoking inputs in the retrofitted
positive suites will now surface as DIFF rows in the CSV (visibility
without failing CI) — the intended shadow behavior. The weekly report
will show more error-shaped diffs than before.

## Testing

`ErrorComparatorTest` — 17/17 passing, including new `ErrorDiffs` gate
coverage (off → legacy class-only, no SQLState/message diff; shadow →
deep comparison; off still flags a differing class). Reviewed with Isaac
Review (`/review --uncommitted`) across two rounds.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

This PR is assisted by Isaac Autopilot: [View the full session spec,
test plan, and
artifacts](http://go/isaac-autopilot/cli-viewer/1782989650201/sreekanth-vadigi_data).

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
## What

PR 3 of the error / negative-case comparison series (builds on #1524,
#1525). Adds the first batch of `NEGATIVE_*` suites — error-provoking
inputs the positive suites never run — comparing each endpoint's **error
behavior** (exception class, SQLState, vendor code, message) via the
engine from PR 1/2, under the `shadow` default. All five are
**read-only** and safe on the shared connections.

| Suite | Cases |
|---|---|
| `NEGATIVE_STATEMENT_SELECT` | missing table/column, `SELET` syntax
error, `1/0`, `CAST('x' AS INT)`, `executeQuery` on an INSERT (wrong
method) |
| `NEGATIVE_STATEMENT_OTHER` | SHOW/DESCRIBE/EXPLAIN/SET on bad targets,
`getMoreResults()` after consumption, `getUpdateCount()` before
`execute()` |
| `NEGATIVE_PARAM_BINDING` | out-of-range `setInt` index, too-few
params, type mismatch at execute, `setBigDecimal` over precision/scale |
| `NEGATIVE_PREPARED_METADATA` | `clearParameters` then unbound execute;
`getMetaData()` before execute on invalid SQL / missing table / missing
column |
| `NEGATIVE_TYPE_CONVERSION` | `getInt`/`getByte` overflow, non-numeric
`getInt`, `getObject(col, Integer.class)` on VARCHAR, `getInt` on ARRAY,
`getBigDecimal` on STRUCT |

Each provider captures result-or-throwable per side and folds through
the existing `ResultSetComparator` / `Captures.compareCall` path.
Registered in the `TestSuite` enum (auto-included under `DEFAULT_PARAMS`
via `allExcept`). README suite table updated.

## Note on `Endpoint.java`

This PR also **Spotless-formats `config/Endpoint.java`**, which was
committed to `comparator-v2` in a non-Spotless-clean state and reformats
on every build (a phantom diff for anyone running `mvn spotless:apply`).
Folding the fix in here stops it recurring. **No behavior change** in
that file — formatting only.

## Live validation

Ran against the peco serverless warehouse (Thrift-vs-SEA,
`ERROR_COMPARISON_MODE=shadow`) via a narrowed local runner: **27/27
cases executed, no harness errors.** Observed divergences are
**message-wording only** (same class / SQLState / server error code;
Thrift's Hive-wrapper prose vs SEA's direct text) — exactly what message
normalization in PR 7 will address — plus one genuine one-sided finding:
`getMoreResults()` returns `Boolean` on one side and throws a
`STATEMENT_CLOSED` exception on the other.

## Testing

Live shadow run (above); clean compile; Isaac Review (`/review
--uncommitted`) — 0 findings (all raised comments filtered as not-real,
confirming the read-only / shared-connection design).

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
…1527)

## What

PR 4 of the error / negative-case comparison series (builds on #1524,
#1525, #1526). Adds the error-provoking DDL, DML, and batch suites —
each compares the two endpoints' **error behavior** (class, SQLState,
vendor code, message) via the gate-aware `ErrorDiffs` bridge, under the
`shadow` default.

| Suite | Cases |
|---|---|
| `NEGATIVE_STATEMENT_DDL` | DROP/CREATE/ALTER on missing or duplicate
objects (± `IF EXISTS`), missing schema/catalog, malformed syntax |
| `NEGATIVE_STATEMENT_DML` | type-mismatch INSERT, NOT NULL violation,
DML on a missing table, `executeUpdate` on a SELECT (wrong method),
write overflow |
| `NEGATIVE_STATEMENT_BATCH` | `executeBatch` partial/full failure and a
ResultSet-producing command; when both sides throw
`BatchUpdateException`, also compares per-element update counts |

## Isolation

These suites mutate catalog objects, so each runs in **its own
namespace** under `comparator_ddl_tests` (seeded fresh, dropped in a
`finally`), keeping them safe on the shared connections.

## Included fix: NPE in the shared `ErrorDiffs.compare`

The neither-threw case is now guarded **centrally** (returns empty,
mirroring the deep path's `NOT_APPLICABLE`) before reaching
`legacyClassOnly`, which dereferences the null `error()` of a RETURNED
outcome. The new suites call `ErrorDiffs.compare` unconditionally, so an
`off`-mode both-succeed case (e.g. `DROP TABLE IF EXISTS`) would
otherwise NPE and abort. This touches the shared helper introduced in
#1525; added a regression test. (Prior callers were unaffected because
they guarded with `if threw`.)

## Live validation

Ran against the peco serverless warehouse (Thrift-vs-SEA,
`ERROR_COMPARISON_MODE=shadow`) **twice** — 21/21 cases, identical
results both runs. Critically, positive `STATEMENT_DDL` /
`STATEMENT_DML` run in the **same JVM after** the destructive negatives
and still **PASS** — confirming no shared-connection poisoning. Observed
divergences are the expected Thrift-Hive-wrapper-vs-SEA-direct-text kind
(same server error codes: `CAST_INVALID_INPUT`, `SCHEMA_NOT_FOUND`,
`TABLE_OR_VIEW_NOT_FOUND`, etc.), which message normalization in PR 7
will address.

## Testing

`ErrorComparatorTest` — 18/18 (added the neither-threw guard regression
test); two consistent live shadow runs; clean compile; Isaac Review
(`/review --uncommitted`) — 0 findings after the NPE fix.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
…(PR 5a/8) (#1528)

## What

Introduces the **fresh-connection mechanism** for negative suites whose
cases mutate or destroy connection/session state, plus the first two
such suites. Builds on #1524#1527.

> Splits the planned PR 5 into **5a** (this — mechanism + the two
lower-risk suites) and **5b** (later — `NEGATIVE_CONNECTION`,
`NEGATIVE_CANCEL_TIMEOUT`, `NEGATIVE_VOLUME`). The series is now 8 PRs.

### Mechanism
- `config/ConnectionFactory.openFresh(side)` — a dedicated, **uncached**
connection
- `ConnectionManager.openUncached(url)` — `DriverManager.getConnection`,
not cached
- `SuiteProvider`: backward-compatible **default 5-arg** `execute(conn1,
conn2, ConnectionFactory, tc, label)` delegating to the 4-arg form —
only isolation suites override it; the 12 existing providers are
untouched
- `JDBCDriverComparisonTest` builds a per-config factory bound to the
resolved LEFT/RIGHT URLs and calls the 5-arg form

### Suites
Each opens its own fresh connections and closes them in a `finally`, so
they never poison the shared connections:
- `NEGATIVE_CONNECTION_STATE` —
`setCatalog`/`setSchema`/`setClientInfo`/`USE` to a nonexistent target
- `NEGATIVE_TRANSACTION` — commit/rollback with autocommit on; DDL
inside a manual transaction (throwaway table dropped in a `finally`)

## Included fix: one-sided diff mis-filing

`ErrorDiffs.compare` returns a **flat** list, so callers that dumped it
into `dataDifferences` mis-filed one-sided diffs (which belong in
`metadataDifferences`) and dropped them from the CSV summary — surfaced
live as an empty summary on the transaction case. New
`ErrorDiffs.foldInto(result, ...)` preserves the metadata/data split;
migrated **all five** negative providers (the two new + DDL/DML/Batch
from #1527) to it. Kept `compare()` for the DML/Volume positive suites
with a javadoc warning. Added `foldInto` unit tests.

## Live validation

Ran against the peco serverless warehouse (Thrift-vs-SEA,
`ERROR_COMPARISON_MODE=shadow`). Positive suites run in the **same JVM
after** the state-mutating negatives and still **PASS** — no
shared-connection poisoning. Surfaced a real one-sided divergence: `DDL
inside a manual transaction` returns `false` on one side and throws
`42P07` on the other.

## Testing

`ErrorComparatorTest` — 20/20 (added `foldInto` coverage); two live
shadow runs; clean compile; Isaac Review (`/review --uncommitted`) — 0
findings.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
## What

Completes the isolation-heavy negative suites, on top of the
`ConnectionFactory` mechanism from #1528. Builds on #1524#1528.

### ConnectionFactory extension
`NEGATIVE_CONNECTION` needs to build **deliberately-broken**
connections, so the factory now also exposes `urlFor(side)`, `token()`,
and `open(url, token)` (an uncached open with an arbitrary URL/token).
Backed by `ConnectionManager.openUncached(url, token)` + `getToken()`.

### Suites
- `NEGATIVE_CONNECTION` — opening *is* the test: starts from a side's
healthy URL/token and corrupts one piece (bad token, unknown host,
malformed httpPath, bad warehouse id, bad `ConnCatalog`/`ConnSchema`),
then compares the connect failure. Closes any connection that
unexpectedly opens.
- `NEGATIVE_CANCEL_TIMEOUT` — `cancel()` mid slow query / after
completion / on an already-cancelled statement, and `setQueryTimeout(1)`
on a slow query. Own fresh connections; the slow query is a bounded
range cross-join (~5s, no server-side sleep) to keep load and flakiness
low.
- `NEGATIVE_VOLUME` — GET/PUT/REMOVE failures + op-after-close. Runs
under the `VOLUME_OPERATIONS` config (needs
`VolumeOperationAllowedLocalPaths`), so it is excluded from
`DEFAULT_PARAMS` like the positive VOLUME_OPERATIONS suite.

## Live validation

Ran against the peco serverless warehouse (Thrift-vs-SEA,
`ERROR_COMPARISON_MODE=shadow`). Positive suites run in the **same JVM
after** the isolation-heavy negatives and still **PASS** — no
shared-connection poisoning.

- `NEGATIVE_CONNECTION` surfaced real error-shape divergences (bad
token/host/warehouse).
- `NEGATIVE_CANCEL_TIMEOUT` surfaced two one-sided findings:
cancel-after-complete and double-cancel throw `STATEMENT_CLOSED` on one
side and return normally on the other.
- `NEGATIVE_VOLUME` is all-PASS — **confirmed meaningful (not vacuous)**
via a throwaway probe: every case genuinely throws
`DatabricksVolumeOperationException` / statement-closed **identically**
on both endpoints, so PASS is real agreement.

## Testing

`ErrorComparatorTest` — 20/20; live shadow runs (two, + volume probe);
clean compile; Isaac Review (`/review --uncommitted`) — 0 findings.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
…8) (#1530)

## What

The final batch of negative suites — row-limit rejection, ResultSet
misuse, and the Databricks async extension API. Builds on #1524#1529.

| Suite | Cases |
|---|---|
| `NEGATIVE_STATEMENT_SELECT_TRUNCATED` | invalid
`setMaxRows`/`setLargeMaxRows` values (negative, `MIN_VALUE`) —
set-then-execute captured together since the failure may originate at
either point |
| `NEGATIVE_RESULTSET` | `next()`/`getObject()` after the ResultSet or
Statement is closed; out-of-range column index |
| `NEGATIVE_ASYNC` | `getExecutionResult()` before any async execution;
`executeAsync()` on invalid SQL; `getExecutionResult()` after the
statement is closed |

### Notes on scope
- **`NEGATIVE_RESULTSET`**: the originally-planned CloudFetch
link-expiry / chunk-download failure is **intentionally omitted** — it
can't be provoked deterministically from a client test (links are valid
for the query's lifetime; expiry is time/infra dependent), and would
require server-side fault injection out of scope for this harness.
Documented in the provider javadoc.
- **`NEGATIVE_ASYNC`**: uses `stmt.unwrap(IDatabricksStatement.class)`.
The invalid-SQL case **polls `getExecutionResult`** so the async failure
is actually captured — `executeAsync` only submits; the syntax error
surfaces on result retrieval.

## Live validation

Ran against the peco serverless warehouse (Thrift-vs-SEA,
`ERROR_COMPARISON_MODE=shadow`). Positive `STATEMENT_SELECT` runs in the
same JVM and still **PASSes**.

Each case was **probe-verified to hit a real error path** (not vacuous
PASS):
- `setMaxRows(-1)` → `DatabricksValidationException: Invalid input for
maxRows`
- `getExecutionResult()` before exec → `DatabricksSQLException: No
execution available`
- next-after-close → `Operation not allowed - ResultSet is closed`
- `getObject(999)` → real 3-field error mismatch (DIFF)

The all-PASS cases are genuine agreement (both endpoints throw
identically), confirmed via a throwaway probe. Also **caught and fixed a
vacuous first cut** of the async invalid-SQL case that captured only the
successful submit rather than the eventual failure.

## Testing

`ErrorComparatorTest` — 20/20; live shadow runs + probe; clean compile;
Isaac Review (`/review --uncommitted`) — 0 findings.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
sreekanth-db added a commit that referenced this pull request Jul 3, 2026
…8) (#1532)

## What

Turns deep error comparison **on** for the weekly CI run, in **shadow**
mode: every Thrift-vs-SEA error divergence (class / SQLState / code /
message / one-sided) is recorded in the report/CSV, but the run stays
green. Final PR of the negative-cases series (builds on #1524#1530).

- `run-comparator.sh` and `runJdbcComparator.yml` default
`ERROR_COMPARISON_MODE` to `shadow` (previously the deep comparison only
ran when a caller set the flag).
- `ErrorPolicy` is simplified to two modes: `off` and `shadow`.
- README documents the two modes.

## Why no `authoritative` (fail-the-build) mode

An enforcement mode was prototyped and **deliberately removed** before
this PR. Reason: enforcement over the `DatabaseMetaData` matrix can't be
done cleanly at the diff-string level — `CombinationExecutor` collapses
one-sided throws *and* infra failures into the same `"Execution error:
…"` string, so a gate would either false-fail on an infra flake or
silently miss a real one-sided metadata divergence. Rather than ship a
gate with that known hole (dormant and unused, since we're not enforcing
yet), enforcement is left as a clean future change, driven by the
divergence data these shadow runs will accumulate.

## Behavior

Shadow produces the **same report/CSV as before** and never fails on a
diff — nothing else changes. Exception-class comparison stays strict
(exact match); message text is compared raw (no normalization).

## Testing

`ErrorComparatorTest` — 20/20; compiles clean; dead-code check clean
after removing the prototype gate; Isaac Review (`/review
--uncommitted`) — 0 findings.

NO_CHANGELOG=true (test-only comparator tooling, no driver behavior
change).

This pull request and its description were written by Isaac.

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant