Commit 3618419

prmoore77 and claude committed
Feat: Add GizmoSQL engine adapter
Add support for GizmoSQL, a database server that uses DuckDB as its execution engine and exposes an Arrow Flight SQL interface for remote connections.

Key features:
- ADBC (Arrow Database Connectivity) with Flight SQL driver
- DuckDB SQL dialect for query generation
- Full transaction support via SQL statements (BEGIN/COMMIT/ROLLBACK)
- Lazy execution handling for DDL/DML operations
- Full catalog operations support
- Arrow-to-Pandas conversion for efficient data transfer
- TLS encryption with optional certificate verification skip
- DuckDB backend validation on connection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent d5ceeb2 commit 3618419
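
Note on the transaction item in the commit message: the adapter opens its connection in autocommit mode and, per the message, drives transactions through SQL statements. The sketch below illustrates that general pattern only. It is not the adapter's code: `sql_transaction` is a hypothetical helper, and the standard-library `sqlite3` module stands in for a Flight SQL connection so the example runs without a server.

```python
import sqlite3
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def sql_transaction(cursor: Any) -> Iterator[Any]:
    """Wrap a block in BEGIN/COMMIT, issuing ROLLBACK if the block raises.

    Hypothetical helper: it only assumes a DB-API style cursor.execute().
    """
    cursor.execute("BEGIN")
    try:
        yield cursor
    except Exception:
        cursor.execute("ROLLBACK")
        raise
    else:
        cursor.execute("COMMIT")


# Stand-in connection: isolation_level=None puts sqlite3 in autocommit mode,
# so transactions exist only where the SQL statements say so.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")

with sql_transaction(cur):
    cur.execute("INSERT INTO t VALUES (1)")  # committed

try:
    with sql_transaction(cur):
        cur.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("force rollback")  # this insert is rolled back
except RuntimeError:
    pass

row_count = cur.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Only the first insert survives, because the second block raised before reaching `COMMIT`.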

14 files changed: +1019 -0 lines changed

.circleci/wait-for-db.sh

Lines changed: 8 additions & 0 deletions
@@ -59,6 +59,14 @@ risingwave_ready() {
   probe_port 4566
 }
 
+gizmosql_ready() {
+  # GizmoSQL uses port 31337 for Flight SQL connections
+  # Also check that the server has fully started by looking for the startup message
+  probe_port 31337
+  # Give it a few more seconds for the server to initialize after port is available
+  sleep 3
+}
+
 echo "Waiting for $ENGINE to be ready..."
 
 READINESS_FUNC="${ENGINE}_ready"
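
Note on the readiness check: `gizmosql_ready` probes port 31337 and then sleeps for a fixed three seconds. The shell script does not look for a startup message, despite what its comment says. As a rough illustration of the port-probe idea, here is a minimal Python sketch that polls until the port accepts a TCP connection or a deadline passes. `wait_for_port` and its parameters are hypothetical and not part of the repository.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False


# Example: wait_for_port("localhost", 31337) before running the gizmosql tests.
```

An open port only shows that the listener is up, not that the server can answer queries, which is presumably why the script adds a short sleep afterwards.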

Makefile

Lines changed: 3 additions & 0 deletions
@@ -208,6 +208,9 @@ trino-test: engine-trino-up
 risingwave-test: engine-risingwave-up
 	pytest -n auto -m "risingwave" --reruns 3 --junitxml=test-results/junit-risingwave.xml
 
+gizmosql-test: engine-gizmosql-up
+	pytest -n auto -m "gizmosql" --reruns 3 --junitxml=test-results/junit-gizmosql.xml
+
 #################
 # Cloud Engines #
 #################

docs/integrations/engines/gizmosql.md

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
+# GizmoSQL
+
+This page provides information about how to use SQLMesh with the [GizmoSQL](https://github.com/gizmodata/gizmosql) database server.
+
+!!! info
+    The GizmoSQL engine adapter is a community contribution. Due to this, only limited community support is available.
+
+## Overview
+
+GizmoSQL is a database server that uses [DuckDB](./duckdb.md) as its execution engine and exposes an [Apache Arrow Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html) interface for remote connections. This allows you to connect to a GizmoSQL server from anywhere on your network while still benefiting from DuckDB's fast analytical query processing.
+
+The SQLMesh GizmoSQL adapter uses [ADBC (Arrow Database Connectivity)](https://arrow.apache.org/docs/format/ADBC.html) with the Flight SQL driver to communicate with GizmoSQL servers. Data is transferred using the efficient Apache Arrow columnar format.
+
+!!! note
+    This adapter only supports the DuckDB backend for GizmoSQL. Attempting to connect to a GizmoSQL server running a different backend will result in an error.
+
+## Local/Built-in Scheduler
+
+**Engine Adapter Type**: `gizmosql`
+
+### Installation
+
+```
+pip install "sqlmesh[gizmosql]"
+```
+
+This will install the required dependencies:
+
+- `adbc-driver-flightsql` - The ADBC driver for Arrow Flight SQL
+- `pyarrow` - Apache Arrow Python bindings
+
+## Connection options
+
+| Option | Description | Type | Required |
+|------------------------------------|-------------------------------------------------------------------------------|:-------:|:--------:|
+| `type` | Engine type name - must be `gizmosql` | string | Y |
+| `host` | The hostname of the GizmoSQL server | string | N |
+| `port` | The port number of the GizmoSQL server (default: `31337`) | int | N |
+| `username` | The username for authentication with the GizmoSQL server | string | Y |
+| `password` | The password for authentication with the GizmoSQL server | string | Y |
+| `use_encryption` | Whether to use TLS encryption for the connection (default: `true`) | bool | N |
+| `disable_certificate_verification`| Skip TLS certificate verification - useful for self-signed certs (default: `false`) | bool | N |
+| `database` | The default database/catalog to use | string | N |
+
+### Example configuration
+
+=== "YAML"
+
+    ```yaml linenums="1"
+    gateways:
+      gizmosql:
+        connection:
+          type: gizmosql
+          host: gizmosql.example.com
+          port: 31337
+          username: my_user
+          password: my_password
+          use_encryption: true
+          disable_certificate_verification: false
+    ```
+
+=== "Python"
+
+    ```python linenums="1"
+    from sqlmesh.core.config import (
+        Config,
+        GatewayConfig,
+        ModelDefaultsConfig,
+    )
+    from sqlmesh.core.config.connection import GizmoSQLConnectionConfig
+
+    config = Config(
+        model_defaults=ModelDefaultsConfig(dialect="duckdb"),
+        gateways={
+            "gizmosql": GatewayConfig(
+                connection=GizmoSQLConnectionConfig(
+                    host="gizmosql.example.com",
+                    port=31337,
+                    username="my_user",
+                    password="my_password",
+                    use_encryption=True,
+                    disable_certificate_verification=False,
+                ),
+            ),
+        },
+    )
+    ```
+
+## SQL Dialect
+
+GizmoSQL uses the DuckDB SQL dialect. When writing models for GizmoSQL, set your model dialect to `duckdb`:
+
+```yaml
+model_defaults:
+  dialect: duckdb
+```
+
+Or specify the dialect in individual model definitions:
+
+```sql
+MODEL (
+  name my_schema.my_model,
+  dialect duckdb
+);
+
+SELECT * FROM my_table;
+```
+
+## Docker Setup
+
+For local development and testing, you can run GizmoSQL using Docker:
+
+```bash
+docker run -d \
+  --name gizmosql \
+  -p 31337:31337 \
+  -e GIZMOSQL_USERNAME=gizmosql_user \
+  -e GIZMOSQL_PASSWORD=gizmosql_password \
+  -e TLS_ENABLED=1 \
+  gizmodata/gizmosql:latest
+```
+
+Then connect with:
+
+```yaml
+gateways:
+  gizmosql:
+    connection:
+      type: gizmosql
+      host: localhost
+      port: 31337
+      username: gizmosql_user
+      password: gizmosql_password
+      use_encryption: true
+      disable_certificate_verification: true  # For self-signed certs
+```
+
+## Related Integrations
+
+GizmoSQL has adapters available for other popular data tools:
+
+- [Ibis GizmoSQL](https://pypi.org/project/ibis-gizmosql/) - Ibis backend for GizmoSQL
+- [dbt-gizmosql](https://pypi.org/search/?q=dbt-gizmosql) - dbt adapter for GizmoSQL
+- [SQLFrame GizmoSQL](https://github.com/gizmodata/sqlframe) - SQLFrame (PySpark-like API) support for GizmoSQL
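
Note on the connection options: the table in this file lists `host` as optional without stating its default, while `GizmoSQLConnectionConfig` in `sqlmesh/core/config/connection.py` sets it to `localhost`. The sketch below restates the documented options and their defaults as a plain dataclass. `GizmoSQLOptions` is a hypothetical stand-in for illustration, not the SQLMesh class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GizmoSQLOptions:
    """Hypothetical mirror of the documented connection options."""

    # Required: the config class declares these without defaults.
    username: str
    password: str
    # Optional: defaults copied from GizmoSQLConnectionConfig in this commit.
    host: str = "localhost"
    port: int = 31337
    use_encryption: bool = True
    disable_certificate_verification: bool = False
    database: Optional[str] = None


# Only the credentials need to be supplied; everything else falls back to defaults.
options = GizmoSQLOptions(username="gizmosql_user", password="gizmosql_password")
```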

docs/integrations/overview.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ SQLMesh supports the following execution engines for running SQLMesh projects (e
 * [MySQL](./engines/mysql.md) (mysql)
 * [Postgres](./engines/postgres.md) (postgres)
 * [GCP Postgres](./engines/gcp-postgres.md) (gcppostgres)
+* [GizmoSQL](./engines/gizmosql.md) (gizmosql)
 * [Redshift](./engines/redshift.md) (redshift)
 * [Snowflake](./engines/snowflake.md) (snowflake)
 * [Spark](./engines/spark.md) (spark)

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -89,6 +89,7 @@ nav:
 - integrations/engines/mysql.md
 - integrations/engines/postgres.md
 - integrations/engines/gcp-postgres.md
+- integrations/engines/gizmosql.md
 - integrations/engines/redshift.md
 - integrations/engines/risingwave.md
 - integrations/engines/snowflake.md

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -144,6 +144,7 @@ lsp = [
     "lsprotocol",
 ]
 risingwave = ["psycopg2"]
+gizmosql = ["adbc-driver-flightsql", "pyarrow"]
 
 [project.scripts]
 sqlmesh = "sqlmesh.cli.main:cli"
@@ -271,6 +272,7 @@ markers = [
     "pyspark: test for PySpark that need to run separately from the other spark tests",
     "trino: test for Trino (all connectors)",
     "risingwave: test for Risingwave",
+    "gizmosql: test for GizmoSQL",
 
     # Other
     "set_default_connection",

sqlmesh/core/config/connection.py

Lines changed: 98 additions & 0 deletions
@@ -2326,6 +2326,104 @@ def init(cursor: t.Any) -> None:
         return init
 
 
+class GizmoSQLConnectionConfig(ConnectionConfig):
+    """
+    GizmoSQL connection configuration.
+
+    GizmoSQL is a database server that uses DuckDB as its execution engine and
+    exposes an Arrow Flight SQL interface for remote connections. This configuration
+    uses ADBC (Arrow Database Connectivity) with the Flight SQL driver.
+
+    Args:
+        host: The hostname of the GizmoSQL server.
+        port: The port of the GizmoSQL server (default: 31337).
+        username: The username for authentication.
+        password: The password for authentication.
+        use_encryption: Whether to use TLS encryption (default: True).
+        disable_certificate_verification: Whether to skip TLS certificate verification.
+            Useful for self-signed certificates in development (default: False).
+        database: The default database/catalog to use.
+        concurrent_tasks: The maximum number of concurrent tasks.
+        register_comments: Whether to register model comments.
+        pre_ping: Whether to pre-ping the connection.
+    """
+
+    host: str = "localhost"
+    port: int = 31337
+    username: str
+    password: str
+    use_encryption: bool = True
+    disable_certificate_verification: bool = False
+    database: t.Optional[str] = None
+
+    concurrent_tasks: int = 4
+    register_comments: bool = True
+    pre_ping: bool = False
+
+    type_: t.Literal["gizmosql"] = Field(alias="type", default="gizmosql")
+    DIALECT: t.ClassVar[t.Literal["duckdb"]] = "duckdb"
+    DISPLAY_NAME: t.ClassVar[t.Literal["GizmoSQL"]] = "GizmoSQL"
+    DISPLAY_ORDER: t.ClassVar[t.Literal[17]] = 17
+
+    _engine_import_validator = _get_engine_import_validator(
+        "adbc_driver_flightsql", "gizmosql", extra_name="gizmosql"
+    )
+
+    @property
+    def _connection_kwargs_keys(self) -> t.Set[str]:
+        # ADBC uses a different connection pattern, so we don't pass these directly
+        return set()
+
+    @property
+    def _engine_adapter(self) -> t.Type[EngineAdapter]:
+        return engine_adapter.GizmoSQLEngineAdapter
+
+    @property
+    def _connection_factory(self) -> t.Callable:
+        """
+        Create a connection factory for GizmoSQL using ADBC Flight SQL driver.
+
+        The connection is established using the Arrow Flight SQL protocol over gRPC.
+        """
+        import re
+        from adbc_driver_flightsql import dbapi as flightsql, DatabaseOptions
+
+        def connect() -> t.Any:
+            # Build the URI for the Flight SQL connection
+            protocol = "grpc+tls" if self.use_encryption else "grpc"
+            uri = f"{protocol}://{self.host}:{self.port}"
+
+            # ADBC database-level options (passed to the driver)
+            db_kwargs: t.Dict[str, str] = {
+                "username": self.username,
+                "password": self.password,
+            }
+
+            # Add TLS skip verify option using the proper DatabaseOptions enum
+            if self.use_encryption and self.disable_certificate_verification:
+                db_kwargs[DatabaseOptions.TLS_SKIP_VERIFY.value] = "true"
+
+            # Create the connection - uri is first positional arg, db_kwargs is for driver options
+            # Explicit autocommit=True since GizmoSQL doesn't support manual transaction commits
+            conn = flightsql.connect(uri, db_kwargs=db_kwargs, autocommit=True)
+
+            # Verify the backend is DuckDB - this adapter only supports the DuckDB backend
+            vendor_version = conn.adbc_get_info().get("vendor_version", "")
+            if not re.search(pattern=r"^duckdb ", string=vendor_version):
+                conn.close()
+                raise ConfigError(
+                    f"Unsupported GizmoSQL server backend: '{vendor_version}'. "
+                    "This adapter only supports the DuckDB backend for GizmoSQL."
+                )
+
+            return conn
+
+        return connect
+
+    def get_catalog(self) -> t.Optional[str]:
+        return self.database
+
+
 CONNECTION_CONFIG_TO_TYPE = {
     # Map all subclasses of ConnectionConfig to the value of their `type_` field.
     tpe.all_field_infos()["type_"].default: tpe
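
Note on the connection factory: the logic inside `connect()` can be exercised without a running server by pulling its two decisions into pure functions. The sketch below does that under stated assumptions. `build_flightsql_target` and `is_duckdb_backend` are hypothetical names, and the option key is written out as a literal that is assumed to equal `DatabaseOptions.TLS_SKIP_VERIFY.value`, so the example does not need `adbc_driver_flightsql` installed.

```python
import re
from typing import Dict, Tuple

# Assumption: this literal matches DatabaseOptions.TLS_SKIP_VERIFY.value in
# adbc_driver_flightsql. The commit itself reads the key from the enum.
TLS_SKIP_VERIFY_KEY = "adbc.flight.sql.client_option.tls_skip_verify"


def build_flightsql_target(
    host: str,
    port: int,
    username: str,
    password: str,
    use_encryption: bool = True,
    disable_certificate_verification: bool = False,
) -> Tuple[str, Dict[str, str]]:
    """Return the Flight SQL URI and driver options, mirroring connect() above."""
    protocol = "grpc+tls" if use_encryption else "grpc"
    uri = f"{protocol}://{host}:{port}"
    db_kwargs = {"username": username, "password": password}
    # Skipping verification only makes sense when TLS is in use at all.
    if use_encryption and disable_certificate_verification:
        db_kwargs[TLS_SKIP_VERIFY_KEY] = "true"
    return uri, db_kwargs


def is_duckdb_backend(vendor_version: str) -> bool:
    """Same check as the commit: the version string must start with 'duckdb '."""
    return re.search(r"^duckdb ", vendor_version) is not None


uri, db_kwargs = build_flightsql_target(
    "localhost", 31337, "gizmosql_user", "gizmosql_password",
    use_encryption=True, disable_certificate_verification=True,
)
```

With these inputs `uri` is `grpc+tls://localhost:31337`. An empty `vendor_version`, which the commit falls back to when the server reports none, fails the backend check.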

sqlmesh/core/engine_adapter/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,7 @@
 from sqlmesh.core.engine_adapter.athena import AthenaEngineAdapter
 from sqlmesh.core.engine_adapter.risingwave import RisingwaveEngineAdapter
 from sqlmesh.core.engine_adapter.fabric import FabricEngineAdapter
+from sqlmesh.core.engine_adapter.gizmosql import GizmoSQLEngineAdapter
 
 DIALECT_TO_ENGINE_ADAPTER = {
     "hive": SparkEngineAdapter,
@@ -37,6 +38,7 @@
     "athena": AthenaEngineAdapter,
     "risingwave": RisingwaveEngineAdapter,
     "fabric": FabricEngineAdapter,
+    "gizmosql": GizmoSQLEngineAdapter,
 }
 
 DIALECT_ALIASES = {
