Fix typedef/variable name conflict parsing error by Copilot · Pull Request #219 · goblint/cil

Copilot · 2026-03-12T07:20:23Z

Create test file test/small1/typedef_varname.c for the typedef/variable name conflict
Add test to test/testcil.pl
Fix init_declarator rule in src/frontc/cparser.mly to call add_identifier via declarator_init_start helper rule between declarator and EQ
Fix comment style in cparser.mly to use /* (* TEXT *) */ as per file conventions
Fix add_identifier timing: declarator_init_start registers each initialized declarator's name before its initializer is parsed; declarator_no_init registers NO_INIT declarator names immediately (guarded by is_typedef_decl flag); typedef names use add_type only in doDeclaration — ensuring correct C11 6.2.1.7 scoping without breaking typedef scope cleanup
Add test/small1/typedef_varname_noinit.c to test multiple initialized declarators: verifies that the second declarator's initializer can reference the first declared variable (which shadows a typedef), exercising declarator_init_start
Fix CI regression in test/scope2: declarator_no_init now skips add_identifier when is_typedef_decl=true, preventing pop_context from incorrectly removing the NAMED_TYPE lexicon binding for typedef names
Add test for NO_INIT middle declarator in typedef_varname.c: mytype a = 1, mytype, b = sizeof mytype — sizeof mytype (without parens) is only valid when mytype is a variable, proving correct C11 6.2.1.7 scoping for NO_INIT declarators in mixed declaration lists

Original prompt

This section details on the original issue you should resolve

<issue_title>typedef and variable name conflict parsing error</issue_title>
<issue_description>### Test case from goblint/bench#38 (comment)
#include <stddef.h>
#include <stdint.h>

typedef struct raxNode {
    uint32_t iskey:1;
    uint32_t isnull:1;
    uint32_t iscompr:1;
    uint32_t size:29;
    unsigned char data[];
} raxNode;

typedef struct rax {
    raxNode *head;
    uint64_t numele;
    uint64_t numnodes;
} rax;

void *malloc(size_t size);

int main() {
  rax *rax = malloc(sizeof(*rax)); // PARSING ERROR HERE
  return 0;
}
This is completely fine by GCC/Clang and rax inside the sizeof should refer to the local variable rax that has been declared and is being initialized.

Cause

The parser errors because that inner rax is a NAMED_TYPE token from the lexer instead of IDENT. This is due to the lexer hack being buggy: ours only adds a local variable to the lexer hack table after the entire declaration, but actually the initializer in the declaration may already refer to the local variable. Thus at this point, the lexer hack table considers rax to refer to the typedef instead.

Specification

It's extremely difficult to find specification for this, but I believe it might be 6.2.1.7 in N1570:

Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator.

The declaractor is the *rax part of the declaration in the above example, so rax needs to be added to the lexer hack table as identifier between the declarator and initializer, not after the whole thing.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes typedef and variable name conflict parsing error #114

🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

… register identifier before parsing initializer Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

src/frontc/cparser.mly

Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

sim642 · 2026-03-13T10:34:46Z

src/frontc/cparser.mly

+      in the lexer hack, so that in the initializer the name shadows any typedef
+      with the same name (C11 6.2.1.7: scope begins after completion of declarator). *) */
+declarator_init_start:
+    declarator EQ                       { let (n, _, _, _) = $1 in !Lexerhack.add_identifier n; $1 }


This may be the right place for it to happen but just adding another Lexerhack.add_identifier call is a bit suspicious.

Somewhere else there's already a call to call add_identifier on the same thing, just slightly too late.
The lexer hack management uses H.add (not H.replace) and H.remove to actually use the multiple-binding feature of OCaml hashtables (which exists precisely for this kind of namespacing use case).
But if there's an extra add_identifier here and the same one is added later by existing code, then there will only be one remove. This means that the identifier will probably remain bound outside of its intended scope somewhere.

Lexerhack has a push_context-pop_context mechanism to manage all this in a well-nested manner which this might now break.

@copilot address simmo's comment

Good catch. The double-add was the issue — fixed in d8822cc by changing doDeclaration to skip calling add_identifier for initialized declarators (those with init != NO_INIT), since declarator_init_start already registered them before the initializer was parsed. This ensures exactly one add/remove pair per variable in the lexer context, keeping the push_context/pop_context mechanism balanced.

This asymmetry with NO_INIT also seems dubious because multiple identifiers can be declared together and mixing NO_INIT with initializers can be arbitrary, e.g. int a = 1, b, c = 3;. This seems like b would be added at the end, while a and c are added at their =. And I guess later initializers could refer to earlier ones, so it might matter?

The lexer hack is much more intricate than Copilot likes to believe.

@copilot address simmo's comment.

Fixed in 160543c. Instead of the NO_INIT check in doDeclaration, both init_declarator branches now use dedicated helper rules that call add_identifier immediately after each declarator is parsed:

declarator_no_init — handles the uninitialized case, registers the name right after the declarator

declarator_init_start — handles the initialized case, registers the name between the declarator and =

doDeclaration no longer calls add_identifier at all. This gives correct C11 6.2.1.7 timing for every declarator in any combination, including int a = 1, b, c = 3; — with exactly one add/remove per variable.

@copilot, write a test that tests this specific feature.

Added test/small1/typedef_varname_noinit.c (registered as testrun/typedef_varname_noinit in testcil.pl) in d977be5.

The key test technique: sizeof T without parentheses is only valid C when T is a variable (expression operand). When T is still a type name, sizeof T is a parse error (types require sizeof(T)). So:

typedef int T; { T T, U = sizeof T; } // fails to parse if declarator_no_init is absent

typedef int mytype; { mytype a = 1, mytype, b = sizeof mytype; } // NO_INIT in the middle, referenced later

Both test functions compile and run only if the NO_INIT declarator is registered as an identifier immediately after its declarator.

@copilot fix the CI failures

Fixed the CI failures and updated the test in 363114d.

Root cause of CI failure: The previous declarator_no_init helper rule called add_identifier unconditionally for all declarations — including typedef declarations. For typedef ... Z inside a block, this added "Z" to the context list. When pop_context removed it on block exit, H.remove "Z" removed the most-recently-added binding (the NAMED_TYPE from add_type), leaving the stale IDENT binding. So Z globz; at file scope then failed to parse (test/scope2).

Fix: Reverted declarator_no_init and restored add_identifier for NO_INIT names inside doDeclaration (which already guards against typedef declarations via the isTypedef branch). declarator_init_start is retained for the initialized case — this is the core fix for the original issue.

Updated test (typedef_varname_noinit.c): Now tests multiple initialized declarators, specifically that the second initializer can reference the first declared variable after it has shadowed a typedef:

typedef int T; { T T = 1, U = T + 1; } /* T in "T + 1" must be the variable, not the typedef */

Without the fix, T + 1 would see T as NAMED_TYPE and fail to parse (since NAMED_TYPE is not a valid primary expression).

…initialized declarators Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

… its declarator via helper rules Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

…t_start; update test Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

michael-schwarz · 2026-03-18T10:35:52Z

test/small1/typedef_varname.c

+int main() {
+  rax *rax = malloc(sizeof(*rax)); /* variable rax shadows typedef rax in initializer */
+  return rax == 0;
+}


@copilot add a further test where there is no init in the middle, i.e., akin to

typedef int mytype; mytype a = 1, mytype, b = sizeof mytype;

Added in 5960536. The test uses sizeof mytype without parentheses — this is only valid C when mytype is a variable (an expression operand); types require sizeof(type). So the test fails to parse if the fix is absent:

typedef int mytype; /* ... */ { mytype a = 1, mytype, b = sizeof mytype; mytype = 5; if (a != 1) E(2); if (mytype != 5) E(3); if (b != sizeof(int)) E(4); }

To make this work correctly without breaking test/scope2 (the previous regression), declarator_no_init is re-added but now guarded by an is_typedef_decl flag. The flag is set when TYPEDEF is seen and reset in doDeclaration, so typedef declarators don't get add_identifier called — only non-typedef NO_INIT declarators do.

… typedef guard Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

michael-schwarz · 2026-03-18T13:06:10Z

@sim642: This looks reasonable now!

Initial plan

3494d12

Copilot AI assigned Copilot and michael-schwarz Mar 12, 2026

Copilot started work on behalf of michael-schwarz March 12, 2026 07:20 View session

Fix typedef/variable name conflict: add declarator_init_start rule to…

98004a6

… register identifier before parsing initializer Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix typedef and variable name conflict parsing error~~ Fix typedef/variable name conflict parsing error Mar 12, 2026

Copilot finished work on behalf of michael-schwarz March 12, 2026 07:33

michael-schwarz requested changes Mar 13, 2026

View reviewed changes

src/frontc/cparser.mly Outdated Show resolved Hide resolved

Copilot started work on behalf of michael-schwarz March 13, 2026 08:49 View session

Fix comment style in cparser.mly to use /* (* TEXT *) */

f9a7b12

Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot AI requested a review from michael-schwarz March 13, 2026 08:51

Copilot finished work on behalf of michael-schwarz March 13, 2026 08:51

michael-schwarz approved these changes Mar 13, 2026

View reviewed changes

michael-schwarz marked this pull request as ready for review March 13, 2026 09:00

michael-schwarz requested a review from sim642 March 13, 2026 09:00

sim642 reviewed Mar 13, 2026

View reviewed changes

Copilot started work on behalf of michael-schwarz March 13, 2026 10:36 View session

Fix double add_identifier: skip re-registration in doDeclaration for …

d8822cc

…initialized declarators Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot AI requested a review from michael-schwarz March 13, 2026 10:42

Copilot finished work on behalf of michael-schwarz March 13, 2026 10:42

Copilot started work on behalf of michael-schwarz March 13, 2026 11:59 View session

Fix add_identifier timing: register each declarator immediately after…

160543c

… its declarator via helper rules Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot finished work on behalf of michael-schwarz March 13, 2026 12:03

Copilot started work on behalf of michael-schwarz March 13, 2026 15:03 View session

Add test for NO_INIT declarator typedef shadowing in mixed declarations

d977be5

Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot finished work on behalf of michael-schwarz March 13, 2026 15:14

Copilot started work on behalf of michael-schwarz March 13, 2026 15:35 View session

Fix scope2 regression: revert declarator_no_init, keep declarator_ini…

363114d

…t_start; update test Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot finished work on behalf of michael-schwarz March 13, 2026 16:02

michael-schwarz requested a review from sim642 March 13, 2026 16:38

michael-schwarz requested changes Mar 18, 2026

View reviewed changes

Copilot started work on behalf of michael-schwarz March 18, 2026 10:36 View session

Add NO_INIT middle-declarator test and re-add declarator_no_init with…

5960536

… typedef guard Co-authored-by: michael-schwarz <13812333+michael-schwarz@users.noreply.github.com>

Copilot AI requested a review from michael-schwarz March 18, 2026 10:52

Copilot finished work on behalf of michael-schwarz March 18, 2026 10:52

michael-schwarz approved these changes Mar 18, 2026

View reviewed changes

Conversation

Copilot AI commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Cause

Specification

Comments on the Issue (you are @copilot in this section)

Uh oh!

Uh oh!

sim642 Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

sim642 Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

michael-schwarz commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Mar 12, 2026 •

edited

Loading