This repository was archived by the owner on Apr 15, 2026. It is now read-only.

Running lezer-generator on a complex grammar results in an error "Goto table too large" #6

@poeschko

We have a real-world grammar for the Wolfram Language, which is fairly complex by nature. Running it through lezer-generator fails with the error "Goto table too large" (exit code 1). The grammar file is attached. (It relies on a custom tokenizer, but I don't think that's needed to understand or reproduce this issue.)

What I know so far:

  • This used to work with lezer-generator v0.8.5 (after processing the grammar for ~90 seconds) but fails now with v0.13.1.
  • If I remove handling of "Unicode operators" (commenting out | unicodeOperation in the definition of expr), the grammar is generated OK.

Those Unicode operations are rules of the form

Unicode_XXX { expr !precXXX UnicodeOp<unicodeXXX> expr }

with extra "wrappers"

UnicodeOp<token> { Op<token> }
Op<term> { term }

defined for the corresponding Unicode tokens (this helps with the "post-processing" of the generated parse tree). Perhaps that just adds too much complexity?

I just wanted to open this issue as discussed previously on lezer-parser/generator#2 (comment). I don't really expect anyone to solve the whole problem for us, but wanted to ask what would be useful next steps...

Would it help to narrow down the specific version (> 0.8.5, <= 0.13.1) where this "broke"?

Should we try to simplify the grammar (e.g. trying to remove the nesting of UnicodeOp / Op), assuming this limit is expected?

Or would it make sense to work towards supporting larger "goto tables"? I could try to open a pull request increasing the "pointer size" from 16 to 32 bit, if that makes sense.
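
To illustrate what I mean by the "pointer size", here is a minimal sketch of the kind of limit I assume is involved. This is not lezer-generator's actual code: `packTable16` and `packTable32` are hypothetical helpers, and the only thing taken from the real tool is the error message. The assumption is that table entries are stored in 16-bit slots, so any entry above 65535 cannot be represented.

```typescript
// Hypothetical helpers for illustration only; not lezer-generator's code.
// Assumption: table entries are stored in 16-bit slots.
const MAX_U16 = 0xffff; // 65535, the largest value a 16-bit slot can hold

function packTable16(entries: number[]): Uint16Array {
  for (const n of entries) {
    // Without this check, Uint16Array would silently wrap the value
    // modulo 2^16 (e.g. 65536 becomes 0), corrupting the table.
    if (!Number.isInteger(n) || n < 0 || n > MAX_U16) {
      throw new RangeError("Goto table too large");
    }
  }
  return Uint16Array.from(entries);
}

// The same entries in 32-bit slots: the upper bound rises to 2^32 - 1,
// at the cost of twice the storage per entry.
function packTable32(entries: number[]): Uint32Array {
  for (const n of entries) {
    if (!Number.isInteger(n) || n < 0 || n > 0xffffffff) {
      throw new RangeError("Table entry out of range");
    }
  }
  return Uint32Array.from(entries);
}
```

If the real limit works like this, widening the slots would presumably also change the serialized table format, so whatever reads the generated tables at parse time would need to change along with the generator.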

Thanks!

wl.txt
