Add formatExplainJSON exporter for ClickHouse EXPLAIN AST json = 1 with vendored reference suite #26

Draft
Copilot wants to merge 2 commits into main from copilot/check-ast-format-synchronization

Copilot AI commented Jun 14, 2026

Contributor

Implements Phases 1–3 of the JSON-AST plan: vendor the EXPLAIN AST json = 1 fixture corpus from the upstream ClickHouse PR (peter-leonov-ch/ClickHouse#1), implement an exporter converting this library's AST to that v1 JSON contract, and add a reference test suite plus debug tooling.

Vendored reference corpus

  • tests/clickhouse-reference-ast-json/cases/ — the 41 SQL → JSON pairs from the upstream PR, pinned at commit ad33aa4ceba52f42afb17246cc2778d3733cfb7e. README records the source PR, SHA, format version, and snapshot caveats (fixtures at that SHA are bare AST nodes with native 64-bit numbers).
  • Regeneration tooling for when a build containing the PR is available: npm run generate:ast-json-fixtures, explain:ch --json, and a CLICKHOUSE_IMAGE override in docker-compose.yml. Until then the vendored copies are the source of truth.
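
The "native 64-bit numbers" caveat matters on the JavaScript side because `JSON.parse` reads every number as a double. A minimal sketch of a pre-check that scans raw fixture text for integer literals that would lose precision; the function name and the sample `value` key are hypothetical and not part of this PR:

```typescript
// Hypothetical helper: lists integer literals in raw JSON text that
// JSON.parse cannot represent exactly (outside the safe-integer range).
function findUnsafeIntegers(jsonText: string): string[] {
  // Blank out string literals first so digits inside strings are ignored.
  const withoutStrings = jsonText.replace(/"(?:[^"\\]|\\.)*"/g, '""');
  // Match whole JSON number tokens, then keep only the pure integers.
  const tokens = withoutStrings.match(/-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/g) ?? [];
  return tokens
    .filter((t) => /^-?\d+$/.test(t))
    .filter((t) => !Number.isSafeInteger(Number(t)));
}

// UInt64 max does not survive a round trip through a double.
console.log(JSON.parse('18446744073709551615')); // prints 18446744073709552000

// The sample document shape below is an assumption for illustration only.
const sample = '{"type":"Literal","value":18446744073709551615,"n":42}';
console.log(findUnsafeIntegers(sample)); // prints [ '18446744073709551615' ]
```

A check like this could run before fixtures are parsed, so that a silent rounding difference is not mistaken for an exporter bug.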

Exporter

  • src/explain-json.ts: formatExplainJSON(statements) (one {version: 1, ast} document per statement) and explainJSON(statement), both exported from src/index.ts. Pure formatting over AST data, reusing OP_TO_FUNCTION and helpers extracted from src/explain.ts. Covers the SELECT family (SELECT/UNION/INTERSECT/EXCEPT); other statement kinds throw.
import { parse, explainJSON } from '@clickhouse/parser';

explainJSON(parse('SELECT * FROM foo WHERE x = 1')[0]);
// { version: 1, ast: { type: 'SelectWithUnionQuery', selects: [{ type: 'SelectQuery',
//   select: [{ type: 'Asterisk' }], tables: {...},
//   where: { type: 'Function', name: 'equals', is_operator: true, arguments: [...] } }] } }
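
To illustrate the "pure formatting over AST data" idea, here is a rough sketch of how an operator node might be lowered to a `Function` node with `is_operator: true`. The input AST types, the `OP_TO_FUNCTION_SKETCH` table, and the JSON shapes for identifiers and literals are assumptions made for this example; the real `OP_TO_FUNCTION` and node layout in src/explain.ts may differ.

```typescript
// Hypothetical, simplified input AST (not this library's real node types).
interface BinaryOp { kind: 'BinaryOp'; op: string; left: Expr; right: Expr }
interface Identifier { kind: 'Identifier'; name: string }
interface Literal { kind: 'Literal'; value: number | string }
type Expr = BinaryOp | Identifier | Literal;

type JsonNode = { type: string; [key: string]: unknown };

// Hypothetical subset of an operator-to-function-name table.
const OP_TO_FUNCTION_SKETCH: Record<string, string> = {
  '=': 'equals',
  '+': 'plus',
  '<': 'less',
};

// Recursively converts an expression to a JSON node; no parsing happens here.
function exportExpr(e: Expr): JsonNode {
  switch (e.kind) {
    case 'Identifier':
      return { type: 'Identifier', name: e.name };
    case 'Literal':
      return { type: 'Literal', value: e.value };
    case 'BinaryOp': {
      const name = OP_TO_FUNCTION_SKETCH[e.op];
      if (name === undefined) throw new Error(`unsupported operator: ${e.op}`);
      return {
        type: 'Function',
        name,
        is_operator: true, // operator syntax, not an explicit call
        arguments: [exportExpr(e.left), exportExpr(e.right)],
      };
    }
  }
}

// Wraps the converted node in the versioned envelope.
function explainJSONSketch(e: Expr): { version: number; ast: JsonNode } {
  return { version: 1, ast: exportExpr(e) };
}
```

Unsupported operators throw rather than emitting a partial node, mirroring the stated behavior that unsupported statement kinds throw.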

Grammar/AST fidelity fixes surfaced by the contract

Per repo convention, data missing from the AST became grammar changes rather than parsing inside the formatter. These were silently discarded by the parser before (and dropped by format()):

  • NULLS FIRST/LAST on ORDER BY (nullsFirst) and bare WITH FILL (withFill)
  • RESPECT/IGNORE NULLS on functions (nullsAction)
  • OVER w named-window references (windowName) and the parent window name in OVER (w …) — previously the entire OVER w clause was lost
  • JOIN strictness (ANY/ALL/ASOF/SEMI/ANTI), GLOBAL locality, and comma-join vs CROSS JOIN
  • INTERSECT/EXCEPT DISTINCT (distinct on the intersect node)
  • isOperator marker on operator-syntax desugar sites (arr[1], tup.2, BETWEEN, -x, LIKE, ||, ?:, IS NULL, …) so explicit calls like arrayElement(arr, 1) are distinguishable from operator syntax

format.ts now prints the new semantic fields; isOperator is treated as surface-syntax metadata (stripped in round-trip/golden comparisons, like location). Self-goldens in tests/clickhouse-reference regenerated accordingly (~890 files).
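
A minimal sketch of what "stripped in round-trip/golden comparisons" could look like: a recursive pass that drops surface-syntax keys before two ASTs are compared. The helper name and the sample node shapes are hypothetical; only the `isOperator` and `location` field names come from the description above.

```typescript
// Keys treated as surface-syntax metadata for comparison purposes.
const SURFACE_KEYS = new Set(['isOperator', 'location']);

// Returns a deep copy of `value` with surface-syntax keys removed.
function stripSurfaceMetadata(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(stripSurfaceMetadata);
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      if (!SURFACE_KEYS.has(key)) out[key] = stripSurfaceMetadata(child);
    }
    return out;
  }
  return value;
}

// Hypothetical node shapes: `arr[1]` versus the explicit `arrayElement(arr, 1)`.
const operatorForm = {
  kind: 'FunctionCall',
  name: 'arrayElement',
  isOperator: true,
  args: [{ kind: 'Identifier', name: 'arr' }, { kind: 'Literal', value: 1 }],
};
const explicitForm = {
  kind: 'FunctionCall',
  name: 'arrayElement',
  args: [{ kind: 'Identifier', name: 'arr' }, { kind: 'Literal', value: 1 }],
};

// After stripping, the two forms compare equal.
const sameAfterStrip =
  JSON.stringify(stripSurfaceMetadata(operatorForm)) ===
  JSON.stringify(stripSurfaceMetadata(explicitForm));
console.log(sameAfterStrip); // prints true
```

The exporter still sees `isOperator` on the unstripped AST, so the distinction is available where the JSON contract needs it and ignored where only semantics are compared.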

Test suite and tooling

  • tests/clickhouse-reference-ast-json.test.ts — mirrors the existing reference suites: parse each fixture, run the exporter, deep-compare structurally. All 41 cases pass; the documented skip/known-divergence list is empty.
  • npm run diff:ast-json following the existing diff:* pattern, with key-order-insensitive diffs; diff-lib generalized to support alternate corpus layouts.
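
A key-order-insensitive structural comparison, as used by the suite and the diff tool, can be sketched as follows. This is an illustrative stand-in rather than the actual diff-lib code; the function names and the `$`-rooted path format are assumptions.

```typescript
function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Returns the paths at which `expected` and `actual` differ.
// Object key order is ignored; array order is significant.
function diffPaths(expected: unknown, actual: unknown, path = '$'): string[] {
  const out: string[] = [];
  if (Array.isArray(expected) && Array.isArray(actual)) {
    const n = Math.max(expected.length, actual.length);
    for (let i = 0; i < n; i++) {
      const childPath = `${path}[${i}]`;
      if (i >= expected.length || i >= actual.length) out.push(childPath);
      else out.push(...diffPaths(expected[i], actual[i], childPath));
    }
    return out;
  }
  if (isPlainObject(expected) && isPlainObject(actual)) {
    // Union of keys from both sides, sorted for stable output.
    const keys = Array.from(
      new Set(Object.keys(expected).concat(Object.keys(actual))),
    ).sort();
    for (const key of keys) {
      const childPath = `${path}.${key}`;
      if (key in expected && key in actual) {
        out.push(...diffPaths(expected[key], actual[key], childPath));
      } else {
        out.push(childPath); // key present on one side only
      }
    }
    return out;
  }
  // Primitives, null, or mismatched container kinds.
  return Object.is(expected, actual) ? [] : [path];
}

console.log(diffPaths({ a: 1, b: [1, 2] }, { b: [1, 2], a: 1 })); // prints []
console.log(diffPaths({ a: 1, b: [1, 2] }, { a: 1, b: [1, 3] })); // prints [ '$.b[1]' ]
```

Reporting paths instead of a boolean makes a failing fixture point directly at the diverging node.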

Known follow-ups

  • README section documenting the new export (the corpus README covers the contract pin meanwhile).
  • One no-control-regex lint nit in src/explain-json.ts (unescapeLabel).
  • Phase 0 (contract freeze) and upstream feedback (Phase 4 item 12) are upstream-side actions on the ClickHouse PR.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@peter-leonov-ch

peter-leonov-ch commented Jun 14, 2026

Collaborator

This is a result of the core agent having a little back and forth with this repo agent on the best strategy for AST sync. Most of the changes on the core agent made sense to me. Not sure (yet) about the changes to this repo.

This is the partner PR to my fork of core: peter-leonov-ch/ClickHouse#1

@peter-leonov-ch peter-leonov-ch requested a review from Copilot June 14, 2026 12:53

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

4 participants