Running lezer-generator on a complex grammar results in an error "Goto table too large"

We have a real-world grammar for the Wolfram Language, which is fairly complex by nature. Running it through `lezer-generator` fails with the error `Goto table too large` (exit code 1). The grammar file is attached. (It relies on a custom tokenizer, but I don't think that's needed to understand or reproduce this issue.)

What I know so far:

* This used to work with lezer-generator v0.8.5 (after processing the grammar for ~90 seconds) but fails now with v0.13.1.
* If I remove handling of "Unicode operators" (commenting out `| unicodeOperation` in the definition of `expr`), the grammar is generated OK.

Those Unicode operations are rules of the form
```
Unicode_XXX { expr !precXXX UnicodeOp<unicodeXXX> expr }
```
with extra "wrappers"
```
UnicodeOp<token> { Op<token> }
Op<term> { term }
```
defined for the corresponding Unicode tokens (this helps with the "post-processing" of the generated parse tree). Perhaps that just adds too much complexity?

I just wanted to open this issue as discussed previously in https://github.com/lezer-parser/lezer-generator/pull/2#issuecomment-589612246. I don't really expect anyone to solve the whole problem for us, but wanted to ask what useful next steps would be.

Would it help to narrow down the specific version (> 0.8.5, <= 0.13.1) where this "broke"?

Should we try to simplify the grammar (e.g. trying to remove the nesting of `UnicodeOp` / `Op`), assuming this limit is expected?
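
For illustration, a flattened rule might look like the sketch below. It is untested and assumes the `UnicodeOp` / `Op` wrapper nodes can be dropped, with post-processing keyed on the `unicodeXXX` token itself; `XXX` is the same placeholder as in the rules above. Whether this actually shrinks the goto table enough is an open question.
```
// Hypothetical flattened form: the token appears directly in the rule,
// without the UnicodeOp<...> / Op<...> wrappers.
Unicode_XXX { expr !precXXX unicodeXXX expr }
```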

Or would it make sense to work towards supporting larger "goto tables"? I could try to open a pull request increasing the "pointer size" from 16 to 32 bits, if that makes sense.

Thanks!

[wl.txt](https://github.com/lezer-parser/lezer/files/5690927/wl.txt)
