Skip to content

DEV-1390: Flight SQL facade for SLayer (dbt-SL JDBC compatible)#129

Open
ZmeiGorynych wants to merge 1 commit into
mainfrom
egor/dev-1390-flight-sql-facade-for-slayer-dbt-semantic-layer-wire
Open

DEV-1390: Flight SQL facade for SLayer (dbt-SL JDBC compatible)#129
ZmeiGorynych wants to merge 1 commit into
mainfrom
egor/dev-1390-flight-sql-facade-for-slayer-dbt-semantic-layer-wire

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented May 14, 2026

Summary

  • Adds an Arrow Flight SQL endpoint on port 5144 that is wire-compatible with the upstream Apache flight-sql-jdbc-driver v18.3.0 — the same JAR the dbt Semantic Layer connectors (Power BI, Sigma, Looker, Tableau, Hex, DBeaver) use. New CLI: slayer flight-serve.
  • The full chain — translator (sqlglot → SlayerQuery), handlers, server, bearer-token auth, TLS — lives under slayer/flight/. The server is stateless: prepared-statement handles ARE the UTF-8 SQL bytes, so Close is a no-op.
  • Live integration coverage in both tests/integration/test_integration_flight.py (17 JayDeBeAPI tests, 1 xfail for the known JDBC-handshake gap) and test_integration_flight_pyarrow_client.py (21 pyarrow-flight tests, no JDK required). Demo-server fixture shared via tests/integration/conftest.py.
  • Docs: docs/interfaces/flight-sql.md (protocol reference) + docs/getting-started/flight-sql.md (per-tool connect recipes) + Flight SQL section in CLAUDE.md + README mention.

Notable design decisions

  • Any-wrapping: the Apache JDBC driver wraps every do_action body in google.protobuf.Any and requires the response to be Any-wrapped too; pyarrow-flight's Python client sends raw bytes. server._parse_action_body accepts both shapes; handlers.handle_create_prepared_statement always sends an Any-wrapped result. Hard-won lesson from the wire-capture trace.
  • Dotted form end-to-end: customers.regions.name in INFORMATION_SCHEMA.*, the BI-tool projection, WHERE, and the SLayer DSL — no __. rewrite step in the translator.
  • Catalog-declared wire types (Phase 1): the wire schema is derived from Column.type / ModelMeasure.type. A LIMIT 0 still runs for engine-side validation; a Phase 2 follow-up will tighten this to LIMIT-0-derived Arrow types.
  • JDBC token= auth is Phase 2: the Apache driver pre-handshakes the bearer token; SLayer's middleware does per-RPC header validation. test_auth_positive is xfail(strict=True) so a future handshake implementation auto-promotes to PASSED. The pyarrow Python client works today (per-call Authorization headers).
  • JVM --add-opens on Java 17+: the test fixture pre-starts JPype with --add-opens=java.base/java.nio=ALL-UNNAMED (plus java.lang and java.util) for Arrow's reflective memory access. Documented in CLAUDE.md and in the DBeaver section of the getting-started guide.

Test plan

  • poetry run ruff check slayer/ tests/ — clean
  • poetry run pytest tests/flight/ — 173 unit tests pass
  • poetry run pytest -m "not integration" — full unit suite (2395 passed)
  • poetry run pytest tests/integration/test_integration_flight.py -m integration — 17 passed + 1 xfailed (JDBC handshake)
  • poetry run pytest tests/integration/test_integration_flight_pyarrow_client.py -m integration — 21 passed
  • Smoke test end-to-end against the demo: slayer flight-serve --demo + the prepared-statement recipe in specs/DEV-1390-RESUME.md §5

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Arrow Flight SQL endpoint for BI tool connectivity (Power BI, Sigma, Looker, Tableau, DBeaver, Hex).
    • Added slayer flight-serve CLI subcommand to run the Flight SQL server with bearer-token authentication.
    • Implemented INFORMATION_SCHEMA introspection and prepared-statement support for JDBC compatibility.
  • Documentation

    • Added Flight SQL getting-started guide with per-tool connection recipes.
    • Added Flight SQL interface documentation covering server configuration and SQL semantics.
  • Chores

    • Added optional pyarrow dependency and dev dependencies for Flight SQL integration testing.

Review Change Stack

…tegration tests, docs

Phase 1 of the Arrow Flight SQL endpoint that is wire-compatible with the
upstream Apache flight-sql-jdbc-driver (the same JAR the dbt Semantic Layer
connectors use). Adds the slayer flight-serve CLI on port 5144, a full
translator + handler chain backed by SLayer's existing engine, JayDeBeAPI +
pyarrow-flight integration test suites (live wire-validated), and a
recording capture-corpus refresh.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

📝 Walkthrough

Walkthrough

Added a comprehensive Arrow Flight SQL endpoint implementation (DEV-1390 Phase 1) enabling JDBC and BI-tool connectivity to the Semantic Layer. The implementation includes the Flight SQL protocol schema, bearer-token authentication, catalog/metadata generation from semantic models, SQL-to-query translation, RPC handlers, server setup, and full end-to-end integration tests with both JDBC and gRPC clients.

Changes

Flight SQL Server Implementation

Layer / File(s) Summary
Flight SQL protocol schema and type system
slayer/flight/FlightSql.proto, slayer/flight/_flight_sql_pb2.py, slayer/flight/types.py
Protocol Buffers schema defining all Flight SQL message types (metadata queries, prepared statements, transactions, catalog introspection), plus a generated protobuf module and canonical type mappings between SLayer DataType, Apache Arrow, and JDBC representations.
Bearer-token authentication middleware and validation
slayer/flight/auth.py
Startup validators for bind addresses (loopback-by-default enforcement) and TLS keypair pairing, plus a gRPC middleware factory that enforces Authorization: Bearer <token> on authenticated endpoints and logs environment identifiers without affecting auth flow.
Flight catalog and INFORMATION_SCHEMA generation
slayer/flight/catalog.py, slayer/flight/info_schema.py
Pydantic models and builders that snapshot SlayerModel instances into Flight metadata, including metric/dimension expansion with BFS-based join path traversal, collision handling, and type inference; plus INFORMATION_SCHEMA table synthesis (METRICS, DIMENSIONS, TABLES, SCHEMATA, COLUMNS).
SQL-to-SlayerQuery translation pipeline
slayer/flight/translator.py, slayer/flight/probe_queries.py
SQL parser and translator converting incoming Flight SQL strings into tagged-union result types (ProbeResult, InfoSchemaResult, NoOpResult, QueryResult), with table resolution, projection validation, time-grain wrapping, WHERE/GROUP BY/ORDER BY/LIMIT handling, and a whitelisted probe-query dispatcher.
Flight SQL RPC handler implementation
slayer/flight/handlers.py
FlightHandlers container that decodes Flight SQL commands/tickets and dispatches catalog introspection (GetCatalogs, GetDbSchemas, GetTables, GetTableTypes) and SQL RPC handlers, including schema derivation via LIMIT 0 validation, row rewrites with column name mapping, and decimal coercion.
Flight SQL server and RPC dispatch
slayer/flight/server.py
FlightSqlServer extending pyarrow.flight.FlightServerBase with get_flight_info/do_get/do_action implementations that dispatch to handlers; includes a factory function validating configuration and attaching bearer-token middleware.
CLI subcommand registration and server startup
slayer/flight/cli.py, slayer/cli.py
Flight-serve subcommand wiring in the main argument parser, registering arguments for host/port/token/TLS/demo, resolving storage/engine/handlers, optionally preparing demo data, and starting the server.
Test fixtures and recording stubs
tests/flight/conftest.py, tests/flight/_capture_stub.py, tests/integration/conftest.py
Pytest fixtures for JDBC driver caching, JPype/JayDeBeAPI setup, gRPC client setup, and demo server startup; plus test-only RPC logging servers (CaptureFlightServer, RecordingFlightSqlServer) that record incoming requests to JSONL for fixture generation.
Core unit tests
tests/flight/test_*.py
Comprehensive test suite validating catalog construction (metrics, dimensions, joins, BFS depth), type mapping round-trips, SQL translation dispatch and error handling, Flight handlers catalog/SQL dispatch, INFORMATION_SCHEMA content, auth middleware, and probe query recognition.
JDBC and pyarrow.flight integration tests
tests/integration/test_integration_flight.py, tests/integration/test_integration_flight_pyarrow_client.py
End-to-end integration tests exercising the Flight SQL server over JDBC (JayDeBeAPI) and pyarrow.flight gRPC, validating catalog introspection, prepared-statement execution, semantic model queries, INFORMATION_SCHEMA, error paths, bearer-token auth, and concurrency.
Project configuration, documentation, and fixtures
pyproject.toml, slayer/flight/capture_dbt_jdbc.py, tests/flight/fixtures/*, docs/getting-started/flight-sql.md, docs/interfaces/flight-sql.md, CLAUDE.md, README.md, .gitignore, specs/DEV-1390-RESUME.md
Poetry dependency configuration (pyarrow, jaydebeapi, jpype1), CLI capture driver for recording JDBC RPC traces to fixtures, captured RPC samples (JSONL), fixture analysis documentation, BI tool connection guides, and Flight SQL interface specification plus design documentation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The changes are substantial and intricate: they span protocol definition, async-to-sync bridging, catalog generation with complex join semantics, SQL parsing and translation with strict GROUP BY/ORDER BY validation, request dispatch with multiple message types, authentication middleware, and dual-path integration testing (JDBC and gRPC). The density of semantic logic and multi-layer interactions across handlers, translation, and catalog generation requires careful reasoning about correctness and round-tripping behavior, though the patterns are consistent and well-structured.


Possibly related PRs

  • MotleyAI/slayer#9: Main PR's Flight SQL translator and catalog explicitly build and resolve multi-hop joined/dotted dimensions from SlayerModel.joins and cross-model measure semantics, which were introduced in this PR's dynamic joins & multi-stage query support.

Suggested reviewers

  • whimo
  • AivanF

Poem

🐰 A whisker of SQL through the Arrow flies fast,
JDBC tools now queryable at last!
Join paths dotted, metrics expanded wide,
Prepared statements carry SQL inside.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/dev-1390-flight-sql-facade-for-slayer-dbt-semantic-layer-wire
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch egor/dev-1390-flight-sql-facade-for-slayer-dbt-semantic-layer-wire

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
docs/getting-started/flight-sql.md (1)

47-51: 💤 Low value

Consider adding language specifiers to fenced code blocks.

The markdown linter flags these configuration blocks as missing language specifiers. While they're plaintext configuration examples (not strict code), you could add ```text to satisfy the linter and maintain consistency with markdown best practices.

📝 Optional: Add text language specifier

For example, line 47-51:

-```
+```text
 Host: <slayer-host>
 Port: 5144

Apply similarly to the other configuration blocks at lines 57, 68, 78, 87, and 97.

Also applies to: 57-60, 68-72, 78-82, 87-91, 97-100

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/getting-started/flight-sql.md` around lines 47 - 51, Update the fenced
configuration blocks by adding a language specifier (use text) to each opening
triple-backtick so the markdown linter stops flagging them; specifically update
the blocks containing the example lines starting with "Host: <slayer-host>" and
the other similar blocks that include "Port: 5144", "Service token: <slayer
--token value>", and the subsequent configuration examples (the blocks currently
at the locations shown in the review) to begin with ```text instead of just ```,
leaving the block contents unchanged.
tests/integration/conftest.py (1)

21-125: 💤 Low value

Consider extracting shared fixtures to a common module.

The fixtures jdbc_jar, _format_flight_jdbc_url, _ensure_jvm_started_for_arrow, and jaydebeapi_connect are duplicated between tests/flight/conftest.py and this file. While the duplication is documented as intentional (line 38-40), consider extracting these to a shared module like tests/flight/fixtures_common.py that both conftest files can import from. This would make maintenance easier and ensure consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/conftest.py` around lines 21 - 125, The fixtures jdbc_jar,
_format_flight_jdbc_url, _ensure_jvm_started_for_arrow, and jaydebeapi_connect
are duplicated; extract them into a shared module (e.g.,
tests/flight/fixtures_common.py) and import them from both
tests/flight/conftest.py and tests/integration/conftest.py; specifically move
the implementations of jdbc_jar, _format_flight_jdbc_url,
_ensure_jvm_started_for_arrow, and jaydebeapi_connect into the shared file,
export them as fixtures (or plain helper functions where appropriate), then
replace the local definitions in both conftest files with imports that expose
the same fixture names so tests continue to reference jdbc_jar, flight_jdbc_url
(aliasing _format_flight_jdbc_url if needed), and jaydebeapi_connect unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/flight/auth.py`:
- Around line 150-153: The no-auth branch currently returns
_BearerTokenMiddleware(...) unconditionally when self._expected is None; update
this to enforce the documented loopback-only rule by checking
_peer_is_loopback(info.peer) before accepting unauthenticated requests.
Concretely, in the method that constructs/returns the middleware (the branch
that tests self._expected is None), call _peer_is_loopback(info.peer) and only
return _BearerTokenMiddleware(environment_id=environment_id) if that check
passes; otherwise reject/raise or return a middleware that denies non-loopback
peers (consistent with existing error handling). Ensure you reference the same
info.peer, _peer_is_loopback, self._expected, and _BearerTokenMiddleware symbols
so the change matches surrounding logic.

In `@slayer/flight/info_schema.py`:
- Around line 70-72: The function match_info_schema is a public API with
multiple parameters but currently allows positional args; change its signature
to enforce keyword-only parameters by adding a leading * (e.g., def
match_info_schema(*, parsed: exp.Expression, catalog: FlightCatalog) ->
Optional[pa.Table]) and then update all call sites (notably in
slayer/flight/translator.py) to pass parsed=... and catalog=... accordingly so
callers use keyword arguments.

In `@tests/flight/capture_dbt_jdbc.py`:
- Around line 157-160: The test uses the wrong cross-model dimension syntax:
update the SQL passed to run_select inside the _try call (the lambda in the
"_try(\"cross-model dim SELECT\", lambda: run_select(...))" invocation) to use
dotted form for cross-model fields (customers.regions.name) instead of the
double-underscore form (customers__regions__name); keep the rest of the query
(revenue_sum) unchanged so the test aligns with the Flight SQL dotted
convention.

In `@tests/flight/test_translator.py`:
- Around line 71-72: Change positional calls to translate(...) to use explicit
keyword arguments: pass the SQL string as sql= and the catalog as catalog=
(e.g., replace translate("SELECT 1", _catalog()) with translate(sql="SELECT 1",
catalog=_catalog())). Update all occurrences in this test (the calls around the
shown diff and the additional calls noted at lines 77-79 and 95-96) to follow
the same pattern so every translate invocation with more than one parameter uses
named keywords.

In `@tests/integration/test_integration_flight_pyarrow_client.py`:
- Around line 44-47: The _client function currently accepts positional host and
port and an unused token; change its signature to require keyword-only
parameters (e.g., def _client(*, host: str, port: int) -> fl.FlightClient) and
remove the unused token parameter so auth is clearly handled per-RPC via
_bearer_options; update any callers to pass host=... and port=... (and similarly
make the related helper at the other occurrence keyword-only) so multi-argument
calls follow the repo rule and the function no longer implies implicit auth.

---

Nitpick comments:
In `@docs/getting-started/flight-sql.md`:
- Around line 47-51: Update the fenced configuration blocks by adding a language
specifier (use text) to each opening triple-backtick so the markdown linter
stops flagging them; specifically update the blocks containing the example lines
starting with "Host: <slayer-host>" and the other similar blocks that include
"Port: 5144", "Service token: <slayer --token value>", and the subsequent
configuration examples (the blocks currently at the locations shown in the
review) to begin with ```text instead of just ```, leaving the block contents
unchanged.

In `@tests/integration/conftest.py`:
- Around line 21-125: The fixtures jdbc_jar, _format_flight_jdbc_url,
_ensure_jvm_started_for_arrow, and jaydebeapi_connect are duplicated; extract
them into a shared module (e.g., tests/flight/fixtures_common.py) and import
them from both tests/flight/conftest.py and tests/integration/conftest.py;
specifically move the implementations of jdbc_jar, _format_flight_jdbc_url,
_ensure_jvm_started_for_arrow, and jaydebeapi_connect into the shared file,
export them as fixtures (or plain helper functions where appropriate), then
replace the local definitions in both conftest files with imports that expose
the same fixture names so tests continue to reference jdbc_jar, flight_jdbc_url
(aliasing _format_flight_jdbc_url if needed), and jaydebeapi_connect unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7f7f13fa-151b-4be3-9da9-0f0101568072

📥 Commits

Reviewing files that changed from the base of the PR and between 16c2315 and 5581927.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (36)
  • .gitignore
  • CLAUDE.md
  • README.md
  • docs/getting-started/flight-sql.md
  • docs/interfaces/flight-sql.md
  • pyproject.toml
  • slayer/cli.py
  • slayer/flight/FlightSql.proto
  • slayer/flight/__init__.py
  • slayer/flight/_capture_stub.py
  • slayer/flight/_flight_sql_pb2.py
  • slayer/flight/auth.py
  • slayer/flight/catalog.py
  • slayer/flight/cli.py
  • slayer/flight/handlers.py
  • slayer/flight/info_schema.py
  • slayer/flight/probe_queries.py
  • slayer/flight/server.py
  • slayer/flight/translator.py
  • slayer/flight/types.py
  • specs/DEV-1390-RESUME.md
  • tests/flight/__init__.py
  • tests/flight/capture_dbt_jdbc.py
  • tests/flight/conftest.py
  • tests/flight/fixtures/CAPTURE-FINDINGS.md
  • tests/flight/fixtures/capture-latest.jsonl
  • tests/flight/test_auth.py
  • tests/flight/test_catalog.py
  • tests/flight/test_handlers.py
  • tests/flight/test_info_schema.py
  • tests/flight/test_probe_queries.py
  • tests/flight/test_translator.py
  • tests/flight/test_types.py
  • tests/integration/conftest.py
  • tests/integration/test_integration_flight.py
  • tests/integration/test_integration_flight_pyarrow_client.py
👮 Files not reviewed due to content moderation or server errors (8)
  • tests/integration/test_integration_flight.py
  • tests/flight/test_catalog.py
  • slayer/flight/_flight_sql_pb2.py
  • slayer/flight/_capture_stub.py
  • slayer/flight/server.py
  • tests/flight/test_handlers.py
  • slayer/flight/translator.py
  • slayer/flight/FlightSql.proto

Comment thread slayer/flight/auth.py
Comment on lines +150 to +153
if self._expected is None:
# No-auth mode: loopback fallback. Server startup already rejects
# non-loopback without a token, but recheck here.
return _BearerTokenMiddleware(environment_id=environment_id)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Missing peer validation in no-auth mode.

The docstring (lines 114-118) states that when no token is configured, "requests from loopback peers are accepted unauthenticated; non-loopback peers are rejected" as a "belt-and-braces" defense. However, the code unconditionally returns the middleware without checking _peer_is_loopback(info.peer).

While validate_bind_address provides startup protection, the middleware should enforce the peer check to match the documented behavior and provide defense-in-depth.

🔒 Proposed fix to add peer validation
         if self._expected is None:
             # No-auth mode: loopback fallback. Server startup already rejects
             # non-loopback without a token, but recheck here.
+            if not _peer_is_loopback(info.peer):
+                raise fl.FlightUnauthenticatedError(
+                    "No token configured; only loopback peers accepted"
+                )
             return _BearerTokenMiddleware(environment_id=environment_id)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/flight/auth.py` around lines 150 - 153, The no-auth branch currently
returns _BearerTokenMiddleware(...) unconditionally when self._expected is None;
update this to enforce the documented loopback-only rule by checking
_peer_is_loopback(info.peer) before accepting unauthenticated requests.
Concretely, in the method that constructs/returns the middleware (the branch
that tests self._expected is None), call _peer_is_loopback(info.peer) and only
return _BearerTokenMiddleware(environment_id=environment_id) if that check
passes; otherwise reject/raise or return a middleware that denies non-loopback
peers (consistent with existing error handling). Ensure you reference the same
info.peer, _peer_is_loopback, self._expected, and _BearerTokenMiddleware symbols
so the change matches surrounding logic.

Comment on lines +70 to +72
def match_info_schema(
parsed: exp.Expression, catalog: FlightCatalog,
) -> Optional[pa.Table]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use keyword-only parameters for public API.

Per coding guidelines, functions with more than 1 parameter should use keyword arguments. Since match_info_schema is exported in __all__ and called from other modules, it should enforce keyword-only parameters for clarity.

♻️ Proposed fix
 def match_info_schema(
-    parsed: exp.Expression, catalog: FlightCatalog,
+    *, parsed: exp.Expression, catalog: FlightCatalog,
 ) -> Optional[pa.Table]:

And update call sites in slayer/flight/translator.py to use keyword arguments:

match_info_schema(parsed=..., catalog=...)

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/flight/info_schema.py` around lines 70 - 72, The function
match_info_schema is a public API with multiple parameters but currently allows
positional args; change its signature to enforce keyword-only parameters by
adding a leading * (e.g., def match_info_schema(*, parsed: exp.Expression,
catalog: FlightCatalog) -> Optional[pa.Table]) and then update all call sites
(notably in slayer/flight/translator.py) to pass parsed=... and catalog=...
accordingly so callers use keyword arguments.

Comment on lines +157 to +160
_try("cross-model dim SELECT",
lambda: run_select(
"SELECT customers__regions__name, revenue_sum FROM orders"
))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use dotted form for cross-model dimensions.

Line 159 uses customers__regions__name but the locked design decision (#12 in DEV-1390-RESUME.md) and coding guidelines specify that Flight SQL uses dotted form end-to-end: customers.regions.name. Update the query to align with the implementation.

📝 Proposed fix
     _try("cross-model dim SELECT",
          lambda: run_select(
-             "SELECT customers__regions__name, revenue_sum FROM orders"
+             "SELECT customers.regions.name, revenue_sum FROM orders"
          ))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
_try("cross-model dim SELECT",
lambda: run_select(
"SELECT customers__regions__name, revenue_sum FROM orders"
))
_try("cross-model dim SELECT",
lambda: run_select(
"SELECT customers.regions.name, revenue_sum FROM orders"
))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/flight/capture_dbt_jdbc.py` around lines 157 - 160, The test uses the
wrong cross-model dimension syntax: update the SQL passed to run_select inside
the _try call (the lambda in the "_try(\"cross-model dim SELECT\", lambda:
run_select(...))" invocation) to use dotted form for cross-model fields
(customers.regions.name) instead of the double-underscore form
(customers__regions__name); keep the rest of the query (revenue_sum) unchanged
so the test aligns with the Flight SQL dotted convention.

Comment on lines +71 to +72
result = translate("SELECT 1", _catalog())
assert isinstance(result, ProbeResult)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use keyword arguments for translate(...) calls with multiple parameters.

This file consistently calls translate positionally with 2 arguments; please switch these to explicit keywords throughout for guideline compliance and readability.

As per coding guidelines **/*.py: Use keyword arguments for functions with more than 1 parameter.

Also applies to: 77-79, 95-96

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/flight/test_translator.py` around lines 71 - 72, Change positional
calls to translate(...) to use explicit keyword arguments: pass the SQL string
as sql= and the catalog as catalog= (e.g., replace translate("SELECT 1",
_catalog()) with translate(sql="SELECT 1", catalog=_catalog())). Update all
occurrences in this test (the calls around the shown diff and the additional
calls noted at lines 77-79 and 95-96) to follow the same pattern so every
translate invocation with more than one parameter uses named keywords.

Comment on lines +44 to +47
def _client(host: str, port: int, *, token: str | None = None) -> fl.FlightClient:
"""Construct a pyarrow Flight client, optionally with a bearer token header."""
return fl.FlightClient(f"grpc://{host}:{port}")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make _client explicit and keyword-only for multi-arg calls.

token in _client is unused and makes auth behavior look implicit, while auth is actually per-RPC via _bearer_options. Also, multi-parameter calls should be keyword-based per repo rules.

♻️ Proposed fix
-def _client(host: str, port: int, *, token: str | None = None) -> fl.FlightClient:
+def _client(*, host: str, port: int) -> fl.FlightClient:
     """Construct a pyarrow Flight client, optionally with a bearer token header."""
     return fl.FlightClient(f"grpc://{host}:{port}")

-    client = _client(host, port)
+    client = _client(host=host, port=port)

As per coding guidelines **/*.py: Use keyword arguments for functions with more than 1 parameter.

Also applies to: 64-67

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/test_integration_flight_pyarrow_client.py` around lines 44
- 47, The _client function currently accepts positional host and port and an
unused token; change its signature to require keyword-only parameters (e.g., def
_client(*, host: str, port: int) -> fl.FlightClient) and remove the unused token
parameter so auth is clearly handled per-RPC via _bearer_options; update any
callers to pass host=... and port=... (and similarly make the related helper at
the other occurrence keyword-only) so multi-argument calls follow the repo rule
and the function no longer implies implicit auth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant