2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -98,7 +98,7 @@ SQLite database with:

### Supported Languages

TypeScript, JavaScript, TSX, JSX, Svelte, Python, Go, Rust, Java, C, C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Liquid, Pascal
TypeScript, JavaScript, TSX, JSX, Svelte, Python, Go, Rust, Java, C, C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Liquid, Pascal, ReScript

### Node and Edge Types

3 changes: 2 additions & 1 deletion README.md
@@ -104,7 +104,7 @@ All tests used Claude Opus 4.6 (1M context) with Claude Code v2.1.91. Each test
| **Full-Text Search** | Find code by name instantly across your entire codebase, powered by FTS5 |
| **Impact Analysis** | Trace callers, callees, and the full impact radius of any symbol before making changes |
| **Always Fresh** | File watcher uses native OS events (FSEvents/inotify/ReadDirectoryChangesW) with debounced auto-sync — the graph stays current as you code, zero config |
| **19+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Svelte, Liquid, Pascal/Delphi |
| **20+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Svelte, Liquid, Pascal/Delphi, ReScript |
| **100% Local** | No data leaves your machine. No API keys. No external services. SQLite database only |

---
@@ -399,6 +399,7 @@ The `.codegraph/config.json` file controls indexing:
| Svelte | `.svelte` | Full support (script extraction, Svelte 5 runes, SvelteKit routes) |
| Liquid | `.liquid` | Full support |
| Pascal / Delphi | `.pas`, `.dpr`, `.dpk`, `.lpr` | Full support (classes, records, interfaces, enums, DFM/FMX form files) |
| ReScript | `.res`, `.resi` | Full support (modules, functors, functions, records, variants, pattern matching, JSX components) |

## Troubleshooting

185 changes: 185 additions & 0 deletions RESCRIPT-SUPPORT.md
@@ -0,0 +1,185 @@
# ReScript Support for CodeGraph

## Why ReScript?

[ReScript](https://rescript-lang.org/) is a robustly typed language that compiles to efficient JavaScript. It combines a powerful type system with a syntax familiar to JavaScript developers. ReScript is used in production applications, particularly in the React ecosystem via `rescript-react`, and such projects benefit from semantic code intelligence for navigating module hierarchies, understanding type relationships, and tracing call graphs through pipe chains.

ReScript's module system (influenced by OCaml) means that codebases are organized differently from class-based languages — modules are the primary unit of composition, and functors enable powerful abstraction patterns. CodeGraph's structural understanding helps developers navigate these patterns effectively.

## What Was Implemented

### ReScript Extraction (tree-sitter)

Full extraction support for `.res` and `.resi` files using a WASM build of the `tree-sitter-rescript` grammar:

| Feature | NodeKind | Details |
|---------|----------|---------|
| Functions | `function` | `let` bindings with function body (`let foo = (x) => ...`) |
| Variables | `variable` | `let` bindings with non-function body (`let x = expr`) |
| Externals (FFI) | `function` | `external` declarations with type annotation |
| Modules | `namespace` | `module` declarations (primary organizational unit) |
| Module Types | `interface` | `module type` declarations (signatures without definitions) |
| Type Aliases | `type_alias` | `type t = ...` declarations |
| Variant Types | `enum` | Types with `variant_declaration` body, including enum members |
| Record Types | `struct` | Types with `record_type` body, including field extraction |
| Record Fields | `field` | Individual fields within record types |
| Enum Members | `enum_member` | Individual variant constructors |
| Imports | `import` | `open Module` and `include Module` statements |
| Exceptions | `type_alias` | `exception` declarations |
| Function Calls | — | `calls` edges for `call_expression` nodes |
| Pipe Expressions | — | `calls` edges for `x->f(y)` pipe chains |
| Decorators | — | `@module`, `@schema`, etc. extracted as metadata |
| Signatures | — | Parameter lists and return types for functions |
| Containment | — | `contains` edges (module → function, type → field, etc.) |
| Functors | — | Functor bodies traversed for nested declarations |
| Module Aliases | — | `module X = OtherModule` creates `references` edge |
| Docstrings | — | Preceding doc comments captured |
| ERROR Recovery | — | Valid structures inside tree-sitter ERROR nodes are extracted |

### MCP Symbol Disambiguation

Added a `file` parameter to the `codegraph_node`, `codegraph_callers`, `codegraph_callees`, and `codegraph_impact` tools to disambiguate symbols that share a name across different files. When `file` is provided but no matching symbol is found, an error is returned instead of silently falling back to a different symbol.
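
The no-fallback rule can be pictured with a small sketch. The `SymbolNode` shape, the `resolveSymbol` and `matchesFile` helpers, and the suffix-based path match are all assumptions made for illustration. They are not CodeGraph's actual MCP handler code, and the real tools may report the failure as a tool error result rather than a thrown exception.

```typescript
// Hypothetical node shape for illustration; not CodeGraph's real type.
interface SymbolNode {
  name: string;
  kind: string;
  filePath: string;
}

// Assumed matching rule: exact path, or a path suffix on a "/" boundary.
function matchesFile(filePath: string, file: string): boolean {
  return filePath === file || filePath.endsWith(`/${file}`);
}

// Resolve a symbol by name. When `file` is given, only symbols in that file
// qualify, and a miss is an error rather than a fallback to another file.
function resolveSymbol(nodes: SymbolNode[], name: string, file?: string): SymbolNode {
  const match = nodes.find(
    (n) => n.name === name && (file === undefined || matchesFile(n.filePath, file)),
  );
  if (match === undefined) {
    const scope = file === undefined ? '' : ` in file "${file}"`;
    throw new Error(`Symbol "${name}" not found${scope}`);
  }
  return match;
}

const sampleNodes: SymbolNode[] = [
  { name: 'make', kind: 'function', filePath: 'src/EventLog.res' },
  { name: 'make', kind: 'function', filePath: 'src/Storage.res' },
];

// Picks the `make` defined in Storage.res, not the first `make` in the index.
const picked = resolveSymbol(sampleNodes, 'make', 'Storage.res');
console.log(picked.filePath);
```

Without the `file` argument this sketch returns the first symbol with a matching name, which is exactly the ambiguity the new parameter is meant to remove.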

## Architecture

The implementation follows CodeGraph's established patterns:

- **ReScript extraction** is implemented as a `LanguageExtractor` in `src/extraction/languages/rescript.ts`, following the same per-language file pattern as TypeScript, Go, Rust, etc.
- **`visitNode()` hook** handles ReScript's wrapper node pattern where declarations use intermediate binding nodes (`let_declaration` → `let_binding`, `module_declaration` → `module_binding`, `type_declaration` → `type_binding`)
- **Pipe expression handling** extracts the piped function from `pipe_expression` nodes and creates `calls` edges, enabling call graph traversal through `x->Array.map(f)->Array.filter(g)` chains
- **ERROR node recovery** walks children of tree-sitter ERROR nodes to extract valid structures (common in `tree-sitter-rescript` for certain syntax patterns)
- **`tree-sitter-rescript.wasm`** (908KB) ships in `src/extraction/wasm/` (not in the `tree-sitter-wasms` npm package), following the same pattern as Pascal
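
As a rough illustration of the ERROR node recovery idea, the sketch below walks a simplified tree and descends into `ERROR` nodes instead of skipping them. The `MiniNode` shape, the `collectDeclarations` helper, and the `source_file` and `comment` node names are placeholders chosen for this example. The real extractor works on tree-sitter nodes and may be structured quite differently.

```typescript
// Hypothetical, simplified tree shape; real tree-sitter nodes expose far more.
interface MiniNode {
  type: string;
  children: MiniNode[];
}

// Declaration node types named in this document.
const DECLARATION_TYPES = new Set<string>([
  'let_declaration',
  'module_declaration',
  'type_declaration',
  'external_declaration',
]);

// Collect top-level declarations. ERROR nodes are walked instead of skipped,
// so valid declarations nested inside them are still found.
function collectDeclarations(node: MiniNode, out: MiniNode[] = []): MiniNode[] {
  for (const child of node.children) {
    if (DECLARATION_TYPES.has(child.type)) {
      out.push(child);
    } else if (child.type === 'ERROR') {
      collectDeclarations(child, out);
    }
  }
  return out;
}

const leaf = (type: string): MiniNode => ({ type, children: [] });

const tree: MiniNode = {
  type: 'source_file',
  children: [
    leaf('let_declaration'),
    { type: 'ERROR', children: [leaf('type_declaration'), leaf('comment')] },
  ],
};

console.log(collectDeclarations(tree).map((n) => n.type));
```

The design point is that a parse error in one region does not hide the declarations the parser still managed to recognize inside it.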

### ReScript AST → CodeGraph Node Mapping

| CodeGraph Concept | ReScript AST Node Type | Notes |
|---|---|---|
| function | `let_declaration` (when body is `function`) | let bindings with function bodies |
| function | `external_declaration` | FFI bindings |
| namespace | `module_declaration` (with definition) | Primary organizational unit |
| interface | `module_declaration` (with `type` keyword, signature only) | Module types |
| type_alias | `type_declaration` | Generic type declarations |
| enum | `type_declaration` (with `variant_declaration` body) | Variant types |
| struct | `type_declaration` (with `record_type` body) | Record types |
| variable | `let_declaration` (when body is not `function`) | Value bindings |
| import | `open_statement` | `open Module` |
| import | `include_statement` | `include Module` |
| calls | `call_expression` | Direct function calls |
| calls | `pipe_expression` | `x->f(y)` pipe chains |
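
The declaration rows of this table can be read as a small classification function. The sketch below is only an illustration of the mapping: the `Decl` shape (a pre-digested view with `nodeType`, `bodyType`, and `isModuleType`) and the `classify` helper are hypothetical and do not reflect how `rescript.ts` actually inspects the tree.

```typescript
type NodeKind =
  | 'function'
  | 'variable'
  | 'namespace'
  | 'interface'
  | 'type_alias'
  | 'enum'
  | 'struct'
  | 'import';

// Hypothetical, pre-digested view of a declaration node.
interface Decl {
  nodeType: string; // outer AST node type, e.g. 'let_declaration'
  bodyType?: string; // type of the bound body, when there is one
  isModuleType?: boolean; // true for `module type` declarations
}

// Map a declaration to a CodeGraph node kind, following the table above.
function classify(decl: Decl): NodeKind | undefined {
  switch (decl.nodeType) {
    case 'let_declaration':
      return decl.bodyType === 'function' ? 'function' : 'variable';
    case 'external_declaration':
      return 'function';
    case 'module_declaration':
      return decl.isModuleType ? 'interface' : 'namespace';
    case 'type_declaration':
      if (decl.bodyType === 'variant_declaration') return 'enum';
      if (decl.bodyType === 'record_type') return 'struct';
      return 'type_alias';
    case 'open_statement':
    case 'include_statement':
      return 'import';
    default:
      return undefined; // not a declaration this sketch knows about
  }
}

console.log(classify({ nodeType: 'type_declaration', bodyType: 'record_type' }));
```

The `calls` rows are edges rather than nodes, so they are left out of this function.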

### Key Design Decisions

- **Modules → `namespace`**: ReScript has no classes; modules are the primary organizational unit, mapped to `namespace` NodeKind
- **`let` overloading**: `let_declaration` can be a function or a variable — determined by checking if the body is a `function` node
- **Functors**: `module Make = (Config: T) => { ... }` — functor bodies are traversed for nested declarations
- **Pipe chains**: `x->f(y)` creates a `calls` edge to `f`, enabling `codegraph_callers`/`codegraph_callees` to trace through pipe-heavy ReScript code
- **Decorators**: PPX attributes (`@module`, `@schema`, `@s.matches`) are extracted as decorator metadata on the associated node
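
To make the pipe-chain decision concrete, here is a minimal sketch of collecting call targets from a chain such as `items->Array.map(f)->Array.filter(g)`. The `Expr` union is a hypothetical, simplified expression tree invented for this example. It is not the shape of the `tree-sitter-rescript` AST, and `collectCallTargets` is not the extractor's real function.

```typescript
// Hypothetical simplified expression tree, not the tree-sitter-rescript AST.
type Expr =
  | { kind: 'identifier'; name: string }
  | { kind: 'call'; callee: string; args: Expr[] }
  | { kind: 'pipe'; left: Expr; right: Expr };

// Collect the names of called functions, in evaluation order of the chain.
function collectCallTargets(expr: Expr, out: string[] = []): string[] {
  switch (expr.kind) {
    case 'identifier':
      break; // a plain value, nothing is called
    case 'call':
      out.push(expr.callee);
      for (const arg of expr.args) collectCallTargets(arg, out);
      break;
    case 'pipe': {
      collectCallTargets(expr.left, out); // earlier pipe stages first
      const right = expr.right;
      if (right.kind === 'identifier') {
        out.push(right.name); // bare `x->f` still calls `f`
      } else {
        collectCallTargets(right, out);
      }
      break;
    }
  }
  return out;
}

const id = (name: string): Expr => ({ kind: 'identifier', name });

// Models: items->Array.map(f)->Array.filter(g)
const chain: Expr = {
  kind: 'pipe',
  left: {
    kind: 'pipe',
    left: id('items'),
    right: { kind: 'call', callee: 'Array.map', args: [id('f')] },
  },
  right: { kind: 'call', callee: 'Array.filter', args: [id('g')] },
};

console.log(collectCallTargets(chain));
```

Each collected name would become one `calls` reference, which is what lets caller and callee queries follow a pipe-heavy function body.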

## Grammar: tree-sitter-rescript.wasm

The WASM grammar was built from [`rescript-lang/tree-sitter-rescript`](https://github.com/rescript-lang/tree-sitter-rescript) via Docker + emscripten (908KB output).

### Rebuild Instructions

```bash
git clone https://github.com/rescript-lang/tree-sitter-rescript.git /tmp/tree-sitter-rescript
cd /tmp/tree-sitter-rescript

# Build WASM via Docker + emscripten (produces tree-sitter-rescript.wasm)
npx tree-sitter build --wasm

# Copy to CodeGraph
cp tree-sitter-rescript.wasm /path/to/codegraph/src/extraction/wasm/
```

The native dynamic library (for ast-grep, not CodeGraph) can be built with:

```bash
# macOS output name shown; on Linux, name the output rescript.so instead
gcc -shared -fPIC -O2 -I /tmp/tree-sitter-rescript/src \
  -o rescript.dylib \
  /tmp/tree-sitter-rescript/src/parser.c /tmp/tree-sitter-rescript/src/scanner.c
```

### Default Include/Exclude Patterns

**Included:** `**/*.res`, `**/*.resi`

**Excluded:** `**/.rescript/**`, `**/lib/bs/**`, `**/lib/ocaml/**` (ReScript compiler output directories)
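
The effect of these defaults can be checked with a small matcher. This is a sketch under stated assumptions: `globToRegExp` and `shouldIndex` are hypothetical helpers that understand only `**/`, `**`, and `*`, and CodeGraph's own pattern matching is not shown in this document and may behave differently on edge cases.

```typescript
// Defaults quoted from this document.
const include = ['**/*.res', '**/*.resi'];
const exclude = ['**/.rescript/**', '**/lib/bs/**', '**/lib/ocaml/**'];

// Minimal glob-to-RegExp translation covering only `**/`, `**`, and `*`.
function globToRegExp(glob: string): RegExp {
  let source = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith('**/', i)) {
      source += '(?:.*/)?'; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith('**', i)) {
      source += '.*'; // anything, including "/"
      i += 2;
    } else if (glob.charAt(i) === '*') {
      source += '[^/]*'; // anything within one path segment
      i += 1;
    } else {
      // Escape regex metacharacters in literal characters such as ".".
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// A file is indexed when it matches an include glob and no exclude glob.
function shouldIndex(filePath: string): boolean {
  const matches = (globs: string[]): boolean =>
    globs.some((g) => globToRegExp(g).test(filePath));
  return matches(include) && !matches(exclude);
}

console.log(shouldIndex('src/EventLog.res'));
console.log(shouldIndex('lib/bs/src/EventLog.res'));
```

The second path has a `.res` extension but sits under `lib/bs/`, so the exclude list wins and compiler output stays out of the index.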

## Files Changed

| File | Change |
|------|--------|
| `src/types.ts` | Added `'rescript'` to `Language` type, `.res`/`.resi` to `DEFAULT_CONFIG.include`, ReScript compiler output dirs to `exclude` |
| `src/extraction/grammars.ts` | WASM loader, extension mappings (`.res`, `.resi`), display name |
| `src/extraction/languages/rescript.ts` | `LanguageExtractor` with `visitNode()` hook, `extractImport()`, helper functions for let bindings, modules, types, externals, pipe calls, ERROR node recovery |
| `src/extraction/languages/index.ts` | Registers `rescriptExtractor` in the `EXTRACTORS` map |
| `src/extraction/wasm/tree-sitter-rescript.wasm` | Pre-built WASM grammar (908KB) |
| `__tests__/extraction.test.ts` | 12 new tests covering the core ReScript extraction features |
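
For orientation, the extension mapping added to `src/extraction/grammars.ts` amounts to a lookup by file extension. The sketch below is an assumption about how such a lookup could work: `detectLanguageSketch` and the trimmed `extensionMap` are hypothetical, and the project's real `detectLanguage` may handle paths differently.

```typescript
type Language = 'pascal' | 'rescript' | 'unknown';

// Small subset of the extension map shown in this PR.
const extensionMap: Record<string, Language> = {
  '.lpr': 'pascal',
  '.res': 'rescript',
  '.resi': 'rescript',
};

// Look up a language from the file name's final extension.
function detectLanguageSketch(filePath: string): Language {
  const fileName = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = fileName.lastIndexOf('.');
  const ext = dot >= 0 ? fileName.slice(dot) : '';
  return extensionMap[ext] ?? 'unknown';
}

console.log(detectLanguageSketch('src/EventLog.res'));
console.log(detectLanguageSketch('src/EventLog.resi'));
```

Both implementation files (`.res`) and interface files (`.resi`) resolve to the same language, so they share one grammar and one extractor.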

## Test Results

- **12 new tests**, all passing
- **0 regressions** — all pre-existing tests unchanged
- Tests cover: language detection, functions, variables, type declarations, variant types (enums), record types (structs), modules, module types (interfaces), `open` imports, external declarations, call expressions, and pipe expression calls
- **Real-world validation**: Tested against a ReScript codebase — 75 nodes, 71 edges, 28 call references from 4 core files

## Testing Instructions

### Prerequisites

- Node.js >= 18
- npm
- Git

### 1. Clone and build

```bash
git clone https://github.com/colbymchenry/codegraph.git
cd codegraph
npm install
npm run build
```

### 2. Link globally

```bash
npm link
```

Verify with:

```bash
codegraph --version
```

### 3. Index a ReScript project

```bash
cd /path/to/your/rescript-project
codegraph init -i
codegraph index
```

### 4. Query the code graph

```bash
codegraph status # Show index statistics
codegraph query "EventLog" # Search for a symbol
codegraph context "How does the event log work?" # Build AI context
```

### 5. Set up the MCP server (for Claude Code)

```bash
codegraph install
```

This configures the MCP server, tool permissions, auto-sync hooks, and CLAUDE.md in one step. After that, start Claude Code in the project — CodeGraph tools will be available immediately.

### 6. Clean up

```bash
npm unlink -g @colbymchenry/codegraph # Remove global link
rm -rf /path/to/your/rescript-project/.codegraph # Remove project index
```
128 changes: 128 additions & 0 deletions __tests__/extraction.test.ts
@@ -3079,3 +3079,131 @@ describe('Directory Exclusion', () => {
    expect(files.every((f) => !f.includes('vendor'))).toBe(true);
  });
});

describe('ReScript Extraction', () => {
  it('should detect ReScript files', () => {
    expect(detectLanguage('src/EventLog.res')).toBe('rescript');
    expect(detectLanguage('src/EventLog.resi')).toBe('rescript');
  });

  it('should extract function declarations (let bindings with function body)', () => {
    const code = `let make = (~name, ~opts=?) => {
let storage = Storage.make(~name, ~opts)
storage
}`;
    const result = extractFromSource('EventLog_Builder.res', code);
    const funcNode = result.nodes.find((n) => n.kind === 'function');
    expect(funcNode).toBeDefined();
    expect(funcNode?.name).toBe('make');
  });

  it('should extract variable declarations (non-function let bindings)', () => {
    const code = `let componentType = ComponentType.EventLog`;
    const result = extractFromSource('EventLog.res', code);
    const varNode = result.nodes.find((n) => n.kind === 'variable');
    expect(varNode).toBeDefined();
    expect(varNode?.name).toBe('componentType');
  });

  it('should extract type declarations', () => {
    const code = `type t
type component<'operations> = Component.t<t, outputs, 'operations>`;
    const result = extractFromSource('EventLog.res', code);
    const typeNodes = result.nodes.filter((n) => n.kind === 'type_alias');
    expect(typeNodes.length).toBeGreaterThanOrEqual(2);
    expect(typeNodes.map(n => n.name)).toContain('t');
    expect(typeNodes.map(n => n.name)).toContain('component');
  });

  it('should extract variant types as enums', () => {
    const code = `type event =
| ItemCreated({itemId: string, name: string})
| ItemUpdated({itemId: string, name: string})
| ItemDeleted({itemId: string})`;
    const result = extractFromSource('CatalogSpec.res', code);
    const enumNode = result.nodes.find((n) => n.kind === 'enum');
    expect(enumNode).toBeDefined();
    expect(enumNode?.name).toBe('event');
    const members = result.nodes.filter((n) => n.kind === 'enum_member');
    expect(members.map(m => m.name)).toContain('ItemCreated');
    expect(members.map(m => m.name)).toContain('ItemUpdated');
    expect(members.map(m => m.name)).toContain('ItemDeleted');
  });

  it('should extract record types as structs', () => {
    const code = `type operations = {
append: append<Id.t, event>,
replay: replay<Id.t, event>,
}`;
    const result = extractFromSource('EventLog.res', code);
    const structNode = result.nodes.find((n) => n.kind === 'struct');
    expect(structNode).toBeDefined();
    expect(structNode?.name).toBe('operations');
    const fields = result.nodes.filter((n) => n.kind === 'field');
    expect(fields.map(f => f.name)).toContain('append');
    expect(fields.map(f => f.name)).toContain('replay');
  });

  it('should extract module declarations as namespaces', () => {
    const code = `module Make = (Spec: EventLog.T, Storage: EventLog_Adapter.Storage) => {
module Spec = Spec
let construct = (self, name) => {
Storage.make(~name)
}
}`;
    const result = extractFromSource('EventLog_Builder.res', code);
    const moduleNode = result.nodes.find((n) => n.kind === 'namespace');
    expect(moduleNode).toBeDefined();
    expect(moduleNode?.name).toBe('Make');
  });

  it('should extract module type declarations as interfaces', () => {
    const code = `module type T = {
module Spec: ReventlessInfra.EventLog.T
type operations
let make: (~name: string) => component
}`;
    const result = extractFromSource('EventLog.res', code);
    const ifaceNode = result.nodes.find((n) => n.kind === 'interface');
    expect(ifaceNode).toBeDefined();
    expect(ifaceNode?.name).toBe('T');
  });

  it('should extract open statements as imports', () => {
    const code = `open ReventlessCore
open Belt`;
    const result = extractFromSource('test.res', code);
    const imports = result.nodes.filter((n) => n.kind === 'import');
    expect(imports.map(i => i.name)).toContain('ReventlessCore');
    expect(imports.map(i => i.name)).toContain('Belt');
  });

  it('should extract external declarations as functions', () => {
    const code = `@module("uuid") external v4: unit => string = "v4"`;
    const result = extractFromSource('UUID.res', code);
    const funcNode = result.nodes.find((n) => n.kind === 'function');
    expect(funcNode).toBeDefined();
    expect(funcNode?.name).toBe('v4');
  });

  it('should extract call expressions', () => {
    const code = `let make = (~name) => {
let storage = Storage.make(~name)
storage
}`;
    const result = extractFromSource('Builder.res', code);
    const calls = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls');
    expect(calls.length).toBeGreaterThan(0);
    expect(calls.some(c => c.referenceName.includes('Storage.make'))).toBe(true);
  });

  it('should extract pipe expression calls', () => {
    const code = `let process = (items) => {
items->Array.map(item => item.name)->Array.filter(name => name !== "")
}`;
    const result = extractFromSource('Utils.res', code);
    const calls = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls');
    expect(calls.some(c => c.referenceName.includes('Array.map'))).toBe(true);
    expect(calls.some(c => c.referenceName.includes('Array.filter'))).toBe(true);
  });
});
8 changes: 6 additions & 2 deletions src/extraction/grammars.ts
@@ -34,6 +34,7 @@ const WASM_GRAMMAR_FILES: Record<GrammarLanguage, string> = {
  kotlin: 'tree-sitter-kotlin.wasm',
  dart: 'tree-sitter-dart.wasm',
  pascal: 'tree-sitter-pascal.wasm',
  rescript: 'tree-sitter-rescript.wasm',
};

/**
@@ -74,6 +75,8 @@ export const EXTENSION_MAP: Record<string, Language> = {
  '.lpr': 'pascal',
  '.dfm': 'pascal',
  '.fmx': 'pascal',
  '.res': 'rescript',
  '.resi': 'rescript',
};

/**
@@ -121,8 +124,8 @@ export async function loadGrammarsForLanguages(languages: Language[]): Promise<v
  for (const lang of toLoad) {
    const wasmFile = WASM_GRAMMAR_FILES[lang];
    try {
      // Pascal ships its own WASM (not in tree-sitter-wasms)
      const wasmPath = lang === 'pascal'
      // Pascal and ReScript ship their own WASM (not in tree-sitter-wasms)
      const wasmPath = (lang === 'pascal' || lang === 'rescript')
        ? path.join(__dirname, 'wasm', wasmFile)
        : require.resolve(`tree-sitter-wasms/out/${wasmFile}`);
      const language = await WasmLanguage.load(wasmPath);
@@ -284,6 +287,7 @@ export function getLanguageDisplayName(language: Language): string {
    svelte: 'Svelte',
    liquid: 'Liquid',
    pascal: 'Pascal / Delphi',
    rescript: 'ReScript',
    unknown: 'Unknown',
  };
  return names[language] || language;
2 changes: 2 additions & 0 deletions src/extraction/languages/index.ts
@@ -22,6 +22,7 @@ import { swiftExtractor } from './swift';
import { kotlinExtractor } from './kotlin';
import { dartExtractor } from './dart';
import { pascalExtractor } from './pascal';
import { rescriptExtractor } from './rescript';

export const EXTRACTORS: Partial<Record<Language, LanguageExtractor>> = {
  typescript: typescriptExtractor,
@@ -41,4 +42,5 @@ export const EXTRACTORS: Partial<Record<Language, LanguageExtractor>> = {
  kotlin: kotlinExtractor,
  dart: dartExtractor,
  pascal: pascalExtractor,
  rescript: rescriptExtractor,
};