Skip to content

pgwire extended-query protocol rejects DSL statements (SEARCH / GRAPH / UPSERT / MATCH) — forces simple-query + string interpolation #43

@hollanf

Description

@hollanf

Standard pgwire clients (postgres.js, JDBC, psycopg, Go pgx) use the extended-query protocol by default. Every NodeDB-specific DSL keyword (SEARCH, GRAPH …, UPSERT INTO, MATCH, OPTIONAL MATCH, …) fails to parse on this path, so application code is forced to either disable prepared statements globally or use raw simple-query plus manual string interpolation — losing parameter binding and inviting SQL injection.

Symptom seen by client

postgres.js tagged template:

await sql`SEARCH memories USING VECTOR(embedding, ${vec}, ${k})`
// → ERROR: Expected: an SQL statement, found: SEARCH

Documented workaround in a downstream app (real production code):

// `SEARCH memories USING VECTOR(...)` is a NodeDB DSL extension. It only
// parses via pgwire simple-query protocol (sql.unsafe), NOT the extended
// protocol that postgres.js tagged templates use — extended mode wraps the
// statement in PREPARE which NodeDB's parser rejects with
// "Expected: an SQL statement, found: SEARCH".
const raw = await sql.unsafe(
  `SEARCH memories USING VECTOR(embedding, ${vecStr}, ${k})`
)

The same connection has had to set prepare: false, fetch_types: false globally just to avoid the extended path:

postgres({ host, port, database: "nodedb", ..., fetch_types: false, prepare: false })

Root cause (in the nodedb tree)

Two execution paths exist in the pgwire handler and they don't share a dispatcher:

  1. Simple querynodedb/src/control/server/pgwire/handler/core.rs:202-254:

    impl SimpleQueryHandler for NodeDbPgHandler {
        async fn do_query<C>(...) -> PgWireResult<Vec<Response>> {
            ...
            let result = self.execute_sql(&identity, &addr, query).await;
            ...
        }
    }

    execute_sqlexecute_single_sql → calls the DSL/DDL dispatcher at sql_exec.rs:286-289:

    if let Some(result) = super::super::ddl::dispatch(&self.state, identity, sql_trimmed).await {
        return result;
    }

    The dispatcher routes SEARCH / GRAPH / UPSERT / MATCH via pgwire/ddl/router/dsl.rs:53-148:

    if upper.starts_with(\"SEARCH \") && upper.contains(\"USING VECTOR\") { ... }
    if upper.starts_with(\"SEARCH \") && upper.contains(\"USING FUSION\") { ... }
    if upper.starts_with(\"GRAPH INSERT EDGE \") { ... }
    if upper.starts_with(\"GRAPH TRAVERSE \") { ... }
    if upper.starts_with(\"GRAPH ALGO \") { ... }
    if upper.starts_with(\"MATCH \") || upper.starts_with(\"OPTIONAL MATCH \") { ... }
    if upper.starts_with(\"UPSERT INTO \") { ... }
  2. Extended querycore.rs:259-281:

    impl ExtendedQueryHandler for NodeDbPgHandler {
        ...
        async fn do_query<C>(...) -> PgWireResult<Response> {
            self.execute_prepared(client, portal, max_rows).await
        }
    }

    execute_preparedprepared/execute.rs:46-49:

    let mut results = self
        .execute_planned_sql_with_params(&identity, &stmt.sql, tenant_id, &addr, &params)
        .await?;

    execute_planned_sql_with_paramsexecute_planned_sql_inner (routing/mod.rs:75-100) — which goes straight to nodedb_sql::plan_sql and never invokes ddl::dispatch.

Already at the Parse phase, prepared/parser.rs:63-66 tries to plan with nodedb_sql::plan_sql(sql, &catalog) — which is the standard SQL planner that doesn't know SEARCH, GRAPH, etc. and emits the Expected: an SQL statement, found: SEARCH error. There is no fall-back to the DSL dispatcher at any point in the extended-query pipeline.

Why this matters

  1. Forces SQL injection vectors. Clients fall back to sql.unsafe(\SEARCH … '${userInput}'`). Application code has to roll its own escape function (e.g. moomoori's esc()`). Any miss is an injection.
  2. Forces global prepare: false for the whole connection pool, which removes parameter binding for every statement, not just DSL ones — losing the protocol-level type safety pgwire is supposed to provide.
  3. No protocol-aware DSL. Even an explicit PREPARE stmt AS SEARCH … from a SQL client fails the same way (the simple-query PREPARE handler at sql_exec.rs:246 just plans it).
  4. The Describe / Bind round trip also breaks — clients that introspect parameter types up-front (Java, Go) get an error before they can submit values.

Repro

import postgres from "postgres";
const sql = postgres({ host: \"localhost\", port: 6432, database: \"nodedb\" });
// (NB: no prepare:false here, so this uses the extended path)
await sql\`SEARCH memories USING VECTOR(embedding, ARRAY[0.1, 0.2, 0.3], 5)\`;
// ERROR: Expected: an SQL statement, found: SEARCH

Same shape with psycopg:

cur.execute(\"SEARCH memories USING VECTOR(embedding, %s, %s)\", (vec, k))
# ERROR: Expected: an SQL statement, found: SEARCH

All affected DSL prefixes (per pgwire/ddl/router/dsl.rs): SEARCH … USING VECTOR, SEARCH … USING FUSION, GRAPH INSERT EDGE, GRAPH DELETE EDGE, GRAPH LABEL, GRAPH UNLABEL, GRAPH TRAVERSE, GRAPH NEIGHBORS, GRAPH PATH, GRAPH ALGO, MATCH, OPTIONAL MATCH, UPSERT INTO.

Notes

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions