feat(parser): add OCaml (.ml / .mli) support #622

Open: Macho0x wants to merge 2 commits into justrach:release/0.2.5825 from Macho0x:feat/ocaml-support

Conversation

Macho0x commented Jun 16, 2026

Adds OCaml language support to the line-based parser.

  • Detect .ml / .mli files
  • Parse open / include / module / type / external / exception / val / let / and
  • Handle nested (* ... *) comments
  • Classify let / and as function when -> or fun / function is present
  • Add parser test

zig build test-parser: 70/70 passed.

@github-actions

👋 Thanks for the contribution! Quick heads-up: this repo lands changes on the current release/* branch, not main.

Please retarget this PR via Edit → base branch to the active release branch (currently release/0.2.5825).

(Automated hint — reply here if you need a hand.)

Macho0x changed the base branch from main to release/0.2.5825 on June 16, 2026 20:38

Add language detection, line-based outline parsing, nested (* ... *)
comment handling, and tests for OCaml.

- Detect .ml and .mli files as Language.ocaml
- Parse open/include/module/type/external/exception/val/let/and
- Classify let/and as function when -> or fun/function is present
- Handle nested OCaml block comments in the line loop
- Add parser test covering modules, types, let/rec, external, and imports

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d76167b80

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/explore.zig
Comment on lines +1940 to +1941
if (std.mem.startsWith(u8, trimmed[i..], "(*")) {
    ocaml_comment_depth += 1;

[P2] Skip OCaml comment markers inside strings

For valid OCaml such as let pattern = "(*", this scan treats the string contents as a real comment opener, leaves ocaml_comment_depth nonzero, and then skips all subsequent definitions until a later *). OCaml comment delimiters are ignored inside string/char literals, so files containing these characters in strings lose most of their outline; the comment scanner should ignore quoted literals before updating the depth.
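A minimal sketch of what a literal-aware scan could look like, for illustration only. The function name `scanOcamlLine` and its signature are hypothetical and not taken from src/explore.zig. It assumes strings do not span lines and recognizes only the simple three-character char literal form such as '"', so escaped char literals and multi-line strings are not handled.

```zig
const std = @import("std");

/// Hypothetical helper: returns the OCaml comment nesting depth after
/// scanning one line, given the depth carried in from the previous line.
fn scanOcamlLine(line: []const u8, depth_in: usize) usize {
    var depth = depth_in;
    var in_string = false;
    var i: usize = 0;
    while (i < line.len) {
        const c = line[i];
        if (in_string) {
            // Skip an escaped character such as \" so it cannot close the string.
            if (c == '\\' and i + 1 < line.len) {
                i += 2;
                continue;
            }
            if (c == '"') in_string = false;
            i += 1;
            continue;
        }
        if (c == '"') {
            in_string = true;
            i += 1;
            continue;
        }
        // Simple char literal 'x' outside comments (covers '"' and '(').
        // A type variable like 'a has no closing quote two bytes later.
        if (depth == 0 and c == '\'' and i + 2 < line.len and
            line[i + 1] != '\\' and line[i + 2] == '\'')
        {
            i += 3;
            continue;
        }
        if (std.mem.startsWith(u8, line[i..], "(*")) {
            depth += 1;
            i += 2;
            continue;
        }
        // Only close a comment that is actually open, to avoid underflow.
        if (depth > 0 and std.mem.startsWith(u8, line[i..], "*)")) {
            depth -= 1;
            i += 2;
            continue;
        }
        i += 1;
    }
    return depth;
}

test "comment opener inside a string is ignored" {
    try std.testing.expectEqual(@as(usize, 0), scanOcamlLine("let pattern = \"(*\"", 0));
}
```

The depth is threaded through as a parameter so the caller's line loop can keep skipping definitions while the returned value is nonzero.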

Comment thread src/explore.zig
Comment on lines +6614 to +6619
const kind: SymbolKind = if (std.mem.indexOf(u8, code, "->") != null) .function else blk: {
    if (std.mem.indexOfScalar(u8, code, '=')) |eq_pos| {
        const rhs = std.mem.trimStart(u8, code[eq_pos + 1 ..], " \t");
        if (startsWith(rhs, "fun ") or startsWith(rhs, "function")) break :blk .function;
    }
    break :blk .constant;

[P2] Classify curried OCaml lets as functions

In OCaml, the common function form is let make name age = ... or let rec fib n = ..., but this heuristic only marks bindings as functions when the line contains -> or the RHS starts with fun/function. As a result, most ordinary OCaml functions are indexed as constants, which makes the new language support report misleading symbol kinds for normal .ml files; after parsing the binding name, check whether another identifier/pattern precedes the = before falling back to .constant.
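The suggestion could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `classifyLet`, `stripPrefix`, and `nameEnd` are hypothetical names, and the local `SymbolKind` enum is a stand-in for the repository's type with only the two tags used here. The sketch treats anything other than a type annotation between the binding name and the first `=` as a parameter list, and it does not handle bindings whose name itself contains `=`, such as `let (=) a b`.

```zig
const std = @import("std");

// Stand-in for the repository's SymbolKind (assumption: only these tags matter here).
const SymbolKind = enum { function, constant };

fn stripPrefix(s: []const u8, prefix: []const u8) []const u8 {
    if (std.mem.startsWith(u8, s, prefix)) return s[prefix.len..];
    return s;
}

/// Index just past the binding name (or parenthesized pattern) at the start of `lhs`.
fn nameEnd(lhs: []const u8) usize {
    var i: usize = 0;
    if (lhs.len > 0 and lhs[0] == '(') {
        // Parenthesized name or pattern: (+), (a, b), ((a, b), c).
        var depth: usize = 0;
        while (i < lhs.len) : (i += 1) {
            if (lhs[i] == '(') depth += 1;
            if (lhs[i] == ')') {
                depth -= 1;
                if (depth == 0) return i + 1;
            }
        }
        return lhs.len;
    }
    while (i < lhs.len) : (i += 1) {
        if (lhs[i] == ' ' or lhs[i] == '\t' or lhs[i] == ':') return i;
    }
    return lhs.len;
}

fn classifyLet(code: []const u8) SymbolKind {
    var rest = std.mem.trim(u8, code, " \t");
    rest = stripPrefix(rest, "let ");
    rest = stripPrefix(rest, "and ");
    rest = std.mem.trim(u8, rest, " \t");
    rest = stripPrefix(rest, "rec ");
    rest = std.mem.trim(u8, rest, " \t");

    const eq_pos = std.mem.indexOfScalar(u8, rest, '=') orelse return .constant;
    const lhs = std.mem.trim(u8, rest[0..eq_pos], " \t");
    const after_name = std.mem.trim(u8, lhs[nameEnd(lhs)..], " \t");

    if (after_name.len > 0) {
        // Parameters before '=' mean a curried function: let make name age = ...
        if (after_name[0] != ':') return .function;
        // A type annotation containing an arrow also indicates a function.
        if (std.mem.indexOf(u8, after_name, "->") != null) return .function;
    }

    // Fall back to the existing right-hand-side check.
    const rhs = std.mem.trim(u8, rest[eq_pos + 1 ..], " \t");
    if (std.mem.startsWith(u8, rhs, "fun ") or std.mem.startsWith(u8, rhs, "function")) return .function;
    return .constant;
}

test "curried let is classified as a function" {
    try std.testing.expect(classifyLet("let make name age = { name; age }") == .function);
    try std.testing.expect(classifyLet("let x = 42") == .constant);
}
```

Checking the left-hand side first keeps plain value bindings and tuple patterns as constants while catching the common `let f x y = ...` form.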

Macho0x force-pushed the feat/ocaml-support branch from 4d76167 to d3d988a on June 16, 2026 20:39

Comment delimiters (* and *) inside quoted strings were incorrectly
treated as real comment openers, causing the scanner to skip subsequent
definitions until a later close. Now the scanner tracks string and char
literal state and ignores comment tokens inside quoted regions.