fix(codegen): support double-quoted identifiers wherever gram.y uses IDENT #37
postgres/grammar.js defined quoted_identifier but nothing referenced it, so CREATE TABLE "Foo" (id int) and SELECT "a""b" produced ERROR nodes. In PostgreSQL's lexer a double-quoted identifier is an IDENT terminal, so every IDENT call site must accept both forms.

- Map IDENT/UIDENT in BASE_TOKEN_MAP to a new hidden _ident rule, choice($.identifier, $.quoted_identifier), emitted with the lexer rules. Hidden, so the CST keeps (ColId (identifier)) stable and quoted forms surface as (ColId (quoted_identifier))
- word stays on the bare identifier token as tree-sitter requires; the prec.left/prec.dynamic keyword-vs-identifier wrappers now wrap _ident, which is safe because quoted identifiers never lex as keywords
- Add corpus cases for quoted table/column names, quoted qualified names, and COLLATE with a quoted collation in index options

No new GLR conflicts. Validated against pglifecycle rust-rewrite: its quoted_identifiers_unquote and parses_index_options tests pass with this checkout patched in (the latter was fixed by #28).

Co-authored-by: Claude <noreply@anthropic.com>
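The token mapping this commit describes can be pictured with a small sketch. It is not the code in script/codegen.js: only the names BASE_TOKEN_MAP, _ident, identifier and quoted_identifier come from the commit message, while the Map shape and the helpers `translateSymbol` and `emitRule` are hypothetical, and the real generator also adds the keyword-vs-identifier precedence wrappers that are left out here.

```javascript
// Hypothetical, simplified stand-in for the token map in script/codegen.js.
// Both lexer terminals resolve to the same hidden rule reference.
const BASE_TOKEN_MAP = new Map([
  ['IDENT', '$._ident'],
  ['UIDENT', '$._ident'],
]);

// Source text of the hidden rule, emitted once alongside the lexer rules.
// The leading underscore is what makes tree-sitter hide it from the CST.
const IDENT_RULE = '_ident: $ => choice($.identifier, $.quoted_identifier),';

// Translate one gram.y symbol into a tree-sitter rule reference.
function translateSymbol(symbol) {
  return BASE_TOKEN_MAP.get(symbol) ?? `$.${symbol}`;
}

// Emit one production; each alternative is a list of gram.y symbols.
function emitRule(name, alternatives) {
  const alts = alternatives.map((alt) => {
    const parts = alt.map(translateSymbol);
    return parts.length === 1 ? parts[0] : `seq(${parts.join(', ')})`;
  });
  const body = alts.length === 1 ? alts[0] : `choice(${alts.join(', ')})`;
  return `${name}: $ => ${body},`;
}

// gram.y: ColId: IDENT | unreserved_keyword | col_name_keyword
console.log(IDENT_RULE);
console.log(emitRule('ColId', [['IDENT'], ['unreserved_keyword'], ['col_name_keyword']]));
```

Because the substitution happens at the terminal level, every production that names IDENT picks up the quoted form without per-rule edits, which is the point of fixing this in the generator.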
With the current tree-sitter CLI, `tree-sitter generate postgres/grammar.js` from the repo root writes output to ./src (gitignored) instead of postgres/src, silently leaving the committed parser stale. Run generate from inside postgres/ instead, matching the generate-plpgsql recipe.

Co-authored-by: Claude <noreply@anthropic.com>
Walkthrough
This PR extends the PostgreSQL tree-sitter grammar to accept quoted identifiers in 13 grammar productions where only unquoted identifiers were previously allowed. It introduces a hidden …
Summary
Quoted identifiers (`"My Table"`, `"a""b"`) now parse everywhere PostgreSQL accepts them. Also fixes the `generate-postgres` recipe, which silently wrote the regenerated parser to a gitignored directory.

Problem
`postgres/grammar.js` defined a `quoted_identifier` token but nothing referenced it: `ColId`, `type_function_name`, `ColLabel`, etc. only accepted `$.identifier`, so `CREATE TABLE "Foo" (id int);` and `SELECT "a""b";` produced ERROR nodes. In PostgreSQL's lexer a double-quoted identifier is an `IDENT` terminal, so every IDENT call site must accept both forms.

Separately, `just generate-postgres` ran `tree-sitter generate postgres/grammar.js` from the repo root, which with the current CLI writes output to `./src` (gitignored) instead of `postgres/src`; regeneration appeared to succeed while leaving the committed parser stale.

Solution
Per the codegen-first rule, the fix is in `script/codegen.js`: `BASE_TOKEN_MAP` now maps `IDENT`/`UIDENT` to a hidden `_ident` rule (`choice($.identifier, $.quoted_identifier)`) emitted with the lexer rules. Because the rule is hidden, the CST shape stays `(ColId (identifier))` for unquoted names, with quoted names surfacing as `(ColId (quoted_identifier))`. `word` stays on the bare identifier token as tree-sitter requires, and the `prec.dynamic` keyword-vs-identifier wrappers now wrap `_ident`, which is safe since quoted identifiers never lex as keywords. The recipe now runs `tree-sitter generate` from inside `postgres/`, matching `generate-plpgsql`.

Impact
New `(quoted_identifier)` leaves can appear anywhere an `(identifier)` leaf could; downstream consumers that match `(ColId (identifier))` exclusively will not match quoted names (they previously got ERROR nodes, so this is strictly additive). No new GLR conflicts.

Testing
- … `PG_SOURCE_DIR` on `REL_18_3`; diff confined to generated postgres files
- `cargo test` (Rust bindings) passes
- pglifecycle (`rust-rewrite`) patched to use this checkout, with the `#[ignore]`s removed on `quoted_identifiers_unquote` and `parses_index_options`: both pass (the latter was fixed by #28); pglifecycle then reverted
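For downstream consumers, the practical consequence of the new leaf type is that name extraction has to branch on the node type. The following is a minimal sketch of that branching, not code from this repository or from pglifecycle: `identifierName` and `unquoteIdentifier` are hypothetical helpers, and `node` is assumed to be any object with `type` and `text` properties, which is the shape the tree-sitter Node bindings expose. The unquoting follows PostgreSQL's rule that a doubled quote inside a quoted identifier stands for one literal quote; lower-casing with `toLowerCase` is an approximation of PostgreSQL's folding of unquoted names.

```javascript
// Strip the surrounding double quotes and collapse each doubled quote ("")
// into a single literal quote.
function unquoteIdentifier(text) {
  if (text.length < 2 || !text.startsWith('"') || !text.endsWith('"')) {
    throw new Error(`not a quoted identifier: ${text}`);
  }
  return text.slice(1, -1).replace(/""/g, '"');
}

// Resolve the name carried by the leaf under ColId (or any IDENT call site).
// `node` is a hypothetical { type, text } object.
function identifierName(node) {
  switch (node.type) {
    case 'identifier':
      // Unquoted names are case-insensitive and fold to lower case.
      return node.text.toLowerCase();
    case 'quoted_identifier':
      // Quoted names keep their case and may contain spaces or quotes.
      return unquoteIdentifier(node.text);
    default:
      throw new Error(`unexpected node type: ${node.type}`);
  }
}

console.log(identifierName({ type: 'identifier', text: 'Foo' }));
console.log(identifierName({ type: 'quoted_identifier', text: '"My Table"' }));
console.log(identifierName({ type: 'quoted_identifier', text: '"a""b"' }));
```

A consumer that only ever matched `(ColId (identifier))` keeps working for unquoted names; adding the second case is what lets it see the names that used to sit under ERROR nodes.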