
feat: add ReScript language support #58

Open
malo wants to merge 9 commits into colbymchenry:main from malo:feat/rescript-support

@malo commented Mar 12, 2026

Add ReScript (https://rescript-lang.org/) as a supported language with tree-sitter parsing for .res and .resi files.

The tree-sitter-rescript.wasm grammar (908KB) was built from https://github.com/rescript-lang/tree-sitter-rescript via Docker + emscripten. AST node mappings cover let_declaration, module_declaration, type_declaration, call_expression, pipe_expression, open_statement, and include_statement.

Fix symbol disambiguation in MCP tools: add file parameter to codegraph_node, codegraph_callers, codegraph_callees, and codegraph_impact so that ambiguous symbol names can be resolved by file path. When file is provided but no match is found, return an error instead of silently falling back to a different symbol.
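
A minimal sketch of the disambiguation rule described above, for illustration only. The `SymbolNode` shape, the `resolveSymbol` helper, and the path-matching rule are assumptions made for this example and are not taken from the codegraph source; the "first match" behaviour when no file is given is likewise assumed.

```typescript
// Hypothetical node shape; the real codegraph types may differ.
interface SymbolNode {
  name: string;
  filePath: string;
  kind: string;
}

type Resolution =
  | { ok: true; node: SymbolNode }
  | { ok: false; error: string };

// Resolve `name`, optionally narrowed by `file`. When `file` is given and
// nothing matches, return an error instead of falling back to another symbol.
function resolveSymbol(nodes: SymbolNode[], name: string, file?: string): Resolution {
  const byName = nodes.filter((n) => n.name === name);
  if (byName.length === 0) {
    return { ok: false, error: `Symbol "${name}" not found` };
  }
  if (file === undefined) {
    // Assumed default: without a file hint, take the first match.
    return { ok: true, node: byName[0] };
  }
  // Assumed matching rule: exact path, or a path ending in `/<file>`.
  const inFile = byName.filter(
    (n) => n.filePath === file || n.filePath.endsWith(`/${file}`),
  );
  if (inFile.length === 0) {
    return { ok: false, error: `Symbol "${name}" not found in file "${file}"` };
  }
  return { ok: true, node: inFile[0] };
}

// Usage: two symbols share the name `make`; the file hint picks one.
const nodes: SymbolNode[] = [
  { name: 'make', filePath: 'src/A.res', kind: 'function' },
  { name: 'make', filePath: 'src/B.res', kind: 'function' },
];
const hit = resolveSymbol(nodes, 'make', 'src/B.res');
const miss = resolveSymbol(nodes, 'make', 'src/C.res');
```

Here `hit` resolves to the symbol in `src/B.res`, while `miss` is an error result rather than a silent fallback to `src/A.res`.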

malo and others added 9 commits March 12, 2026 20:30
Add ReScript (https://rescript-lang.org/) as a supported language with
tree-sitter parsing for .res and .resi files.

The tree-sitter-rescript.wasm grammar (908KB) was built from
https://github.com/rescript-lang/tree-sitter-rescript via Docker +
emscripten. AST node mappings cover let_declaration, module_declaration,
type_declaration, call_expression, pipe_expression, open_statement, and
include_statement.

Fix symbol disambiguation in MCP tools: add file parameter to codegraph_node,
codegraph_callers, codegraph_callees, and codegraph_impact so that ambiguous
symbol names can be resolved by file path. When file is provided but no match
is found, return an error instead of silently falling back to a different symbol.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…E.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolves merge conflicts with upstream's refactored extraction architecture:
- Language extraction split into per-file modules in src/extraction/languages/
- ReScript extractor moved to src/extraction/languages/rescript.ts using
  the visitNode hook with ExtractorContext API
- MCP tools updated to use upstream's findAllSymbols approach (file param
  for disambiguation superseded by aggregation across all matching symbols)
- CLAUDE.md and README.md updated with both Svelte and ReScript

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The file parameter was added to codegraph_callers, codegraph_callees,
codegraph_impact, and codegraph_node schemas but the handlers never used
it. Conflict resolution replaced the implementation with upstream's
findAllSymbols aggregation approach. Remove the dead schema entries to
keep the diff purely ReScript-related.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…architecture

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@andreinknv
Contributor

Hi! Sorry to drop in cold — I've been working through the open PR backlog and noticed your PR is waiting on the same architectural conflict as a bunch of other language PRs. I've opened a refactor PR (#116) that turns "adding a language" from a 6-file mutation into a 1-file addition. Once that lands, this PR rebases cleanly. Sharing the rebase template inline so you don't have to read the maintainer guide to do it.

After #116 merges, the rebase is mechanical

Step 1. Discard your changes to the monolithic files (#116 makes them auto-derived from the registry):

git checkout main -- \
  src/types.ts \
  src/extraction/grammars.ts \
  src/extraction/languages/index.ts \
  CLAUDE.md

Step 2. Move your extractor's registration into one file, src/extraction/languages/<your-language>.ts:

import { yourExtractor } from './your-extractor-config';
import type { LanguageDef } from './types';

export const YOUR_DEF: LanguageDef = {
  name: 'yourlang',
  displayName: 'Your Language',
  extensions: ['.ext'],
  includeGlobs: ['**/*.ext'],
  // For a tree-sitter grammar:
  grammar: { wasmFile: 'tree-sitter-yourlang.wasm', vendored: true, extractor: yourExtractor },
  // OR for a custom regex/template extractor (Liquid, HCL pattern):
  // customExtractor: (filePath, source) => new YourExtractor(filePath, source).extract(),
};

Step 3. Add 2 lines to src/extraction/languages/registry.ts:

import { YOUR_DEF } from './yourlang';   // alphabetical
// ...
const ALL_DEFS: readonly LanguageDef[] = [
  // ... alphabetical
  YOUR_DEF,
  // ...
];

Step 4. Add 'yourlang' to the Language union in src/types.ts (1 line).

Step 5. Your test additions in __tests__/extraction.test.ts work as-is — the registry refactor doesn't touch tests.

That's it. If your existing <extractor>.ts file is in the right shape (taking filePath+source for custom extractors, or being a LanguageExtractor for grammar-backed), it requires zero changes — just register it via LanguageDef.
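
For this PR specifically, the template above might be filled in roughly as follows. This is a sketch under assumptions: `LanguageDef` and `LanguageExtractor` are simplified stand-ins declared locally so the snippet is self-contained, the `nodeTypes` field and the `languageForFile` helper are hypothetical, and the real definitions in #116 may differ. Only the names `rescriptExtractor` and `RESCRIPT_DEF`, the wasm file name, and the `.res`/`.resi` extensions come from this thread.

```typescript
// Simplified, assumed stand-ins for the types introduced by the refactor.
interface LanguageExtractor {
  // Hypothetical field: tree-sitter node types the extractor handles.
  nodeTypes: string[];
}

interface LanguageDef {
  name: string;
  displayName: string;
  extensions: string[];
  includeGlobs: string[];
  grammar?: { wasmFile: string; vendored: boolean; extractor: LanguageExtractor };
}

// Placeholder extractor listing the node types named in the PR description.
const rescriptExtractor: LanguageExtractor = {
  nodeTypes: [
    'let_declaration',
    'module_declaration',
    'type_declaration',
    'call_expression',
    'pipe_expression',
    'open_statement',
    'include_statement',
  ],
};

// Would be exported from src/extraction/languages/rescript.ts in the real layout.
const RESCRIPT_DEF: LanguageDef = {
  name: 'rescript',
  displayName: 'ReScript',
  extensions: ['.res', '.resi'],
  includeGlobs: ['**/*.res', '**/*.resi'],
  grammar: {
    wasmFile: 'tree-sitter-rescript.wasm',
    vendored: true,
    extractor: rescriptExtractor,
  },
};

// Stand-in for the registry array from Step 3.
const ALL_DEFS: readonly LanguageDef[] = [RESCRIPT_DEF];

// Hypothetical lookup showing how per-file data can be derived from the registry.
function languageForFile(filePath: string): string | undefined {
  const def = ALL_DEFS.find((d) => d.extensions.some((ext) => filePath.endsWith(ext)));
  return def?.name;
}
```

The point of the design is that everything else (extension maps, include globs, grammar loading) can be computed from `ALL_DEFS`, so a new language touches one definition file plus the registry list.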

Three of my own language PRs (#92, #94, #95) are already pre-rebased to this pattern as references if you want to see what a finished rebase looks like.

Happy to help with the rebase if you'd like — just let me know.

andreinknv added a commit to andreinknv/codegraph that referenced this pull request Apr 28, 2026
…istry

External PR (malo). Originally based on monolithic grammars.ts;
rebased to per-language registry pattern.
- src/extraction/languages/rescript.ts (rescriptExtractor + RESCRIPT_DEF)
- 'rescript' added to Language union
- Vendored tree-sitter-rescript.wasm
- Extensions: .res, .resi.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit to andreinknv/codegraph that referenced this pull request May 9, 2026
…y syntax errors

Friction colbymchenry#58: FTS5 reserved characters (hyphens, quotes, operators) were causing
"invalid FTS5 query syntax" crashes when users included them in intent-mode queries
(e.g., "fallback when no issue-tagged" → FTS5 interprets as "issue MINUS tagged").

Solution: Pre-sanitize user queries by replacing FTS5-reserved chars (-, ^, *, (, ), :, ")
with spaces, collapsing whitespace. Surface sanitization as a non-blocking footer notice
so users understand the transformation and know to use mode='exact' for FTS5 operators.

Implementation:
- Added sanitizeQueryForFts5() helper that returns both sanitized query and a flag
- Sanitization occurs at the entry point before any FTS5 execution
- Error on empty-after-sanitization to catch purely operator-only queries
- Footer notice shows original vs sanitized when modification occurred
- 10 unit tests cover: hyphens, parentheses, colons, asterisks, quotes, whitespace,
  mixed operators, clean queries, and integration with the CLI

Tests pass: 20/20 intent-search tests, full suite 2005/2040 (1 pre-existing timeout in unrelated test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
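
The sanitization step described in the commit message above could look roughly like this. It is a sketch, not the committed implementation: only the helper name `sanitizeQueryForFts5`, the list of replaced characters, the whitespace collapsing, and the empty-query error come from the message; the return shape, the `modified` field name, and the error text are assumptions.

```typescript
// Assumed return shape: the commit message only says "sanitized query and a flag".
interface SanitizedQuery {
  query: string;
  modified: boolean;
}

// Characters listed in the commit message: - ^ * ( ) : "
const FTS5_RESERVED = /[-^*():"]/g;

function sanitizeQueryForFts5(raw: string): SanitizedQuery {
  // Replace reserved characters with spaces, then collapse runs of whitespace.
  const query = raw.replace(FTS5_RESERVED, ' ').replace(/\s+/g, ' ').trim();
  if (query.length === 0) {
    // Catches empty input and queries made up purely of operator characters.
    throw new Error('Query is empty after removing FTS5 reserved characters');
  }
  // Assumption: the flag reports whether any reserved character was present.
  const modified = /[-^*():"]/.test(raw);
  return { query, modified };
}

const example = sanitizeQueryForFts5('fallback when no issue-tagged');
// example.query is 'fallback when no issue tagged' and example.modified is true
```

A caller can use `modified` to decide whether to show the footer notice with the original and sanitized queries.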

2 participants