Skip to content

typechecker + debug improvements#167

Merged
pointbazaar merged 7 commits intomasterfrom
csv-parser
May 1, 2025
Merged

typechecker + debug improvements#167
pointbazaar merged 7 commits intomasterfrom
csv-parser

Conversation

@pointbazaar
Copy link
Owner

@pointbazaar pointbazaar commented Apr 26, 2025

Summary by CodeRabbit

  • New Features

    • Added calloc and memcpy functions with example and test programs.
    • Introduced enhanced type checking for variables and member accesses with improved error messages.
    • Improved LVST printing to support file output and architecture-dependent sizing.
    • Added utility functions for struct and type handling.
    • Added keyword delimiter checks in lexer to prevent partial keyword matches.
  • Bug Fixes

    • Enhanced error handling in type inference and checking to avoid null dereferences and improve diagnostics.
  • Documentation

    • Added new test cases covering type checking errors and allocator functions.
  • Refactor

    • Restructured type checking logic for variables and member accesses.
    • Updated struct symbol table interfaces to support type management and member queries.
    • Improved resource cleanup flow in x86 code generation.
  • Style

    • Standardized struct initialization in tests.
  • Chores

    • Updated .gitignore to exclude .lvst files.
    • Updated CircleCI config for full system upgrade and added libxml2 dependency.

Ability to dump LVST into file.

e.g. LVST for function1 can be dumped into function1.lvst

when applying '-debug' flag.

Typechecker can also dump LVST in case there is any error in
typechecking. LVST is not dumped for all functions even with '-debug'
since that would just dump too many files when combined with '-stdlib'
flag.
Cleaning a leftover debug statement. The typechecker should always
allocate and store a correct error and not use stderr directly.
The checks did not take into account struct member access.
That involves looking up types recursively to check for existence of
struct members and such.

Add those checks to avoid any problems later on.
e.g.
struct X x1 = {0};

instead of explicitly initializing each member.
@coderabbitai
Copy link

coderabbitai bot commented Apr 26, 2025

Walkthrough

This update introduces significant changes across the compiler's type checking, symbol table management, and testing infrastructure. New functions and files are added for robust type checking of variables and struct member accesses, with improved error handling and recursion for nested member checks. The struct symbol table interface is expanded to support member type queries and existence checks, and its construction now registers type information. LVST (live variable state table) printing is refactored to allow output to files or streams, and type size calculations now support an "anytype" with architecture-dependent sizing. The lexer receives stricter keyword matching, and several new tests and example programs are included to verify memory allocation, copying, and struct/array interactions. Numerous function signatures are updated for consistency and improved parameterization.

Changes

File(s) Change Summary
.gitignore Added .lvst files to ignore list.
compiler/main/CMakeLists.txt Added typechecker/tc_simplevar.c to build sources.
compiler/main/avr_code_gen/cg_avr_single_function.c, tables/lvst/lvst.c, tables/lvst/lvst.h Refactored LVST printing to accept output streams; added filename-based LVST output function.
compiler/main/typechecker/_tc.h, compiler/main/typechecker/tc_simplevar.c, compiler/main/typechecker/tc_var.c Added/declared new type checking functions for simple variables and struct member access; refactored member access checking to recursive, type-driven approach.
compiler/main/typechecker/tc_ifstmt.c, compiler/main/typechecker/tc_methodcall.c Improved type error reporting for if-statement conditions and method call arguments, including robust null checks and error messages.
compiler/main/typechecker/tc_method.c, compiler/main/x86_code_gen/cg_x86_single_function.c Enhanced debug output: on typecheck failure, LVST is written to files named after the function/method.
compiler/main/typechecker/util/tc_utils.c, compiler/main/typechecker/util/tc_utils.h Added function to extract underlying struct name from a type.
compiler/main/typeinference/infer_in_context.c, compiler/main/typeinference/typeinfer_var.c Added null checks after type inference to prevent dereferencing null pointers.
compiler/main/typeinference/typeinfer.c Added assertion to ensure non-null types in index unwrapping.
compiler/main/util/fill_tables.c, tables/stst/stst.c, tables/stst/stst.h Changed struct symbol table fill and line constructor to operate on broader symbol table context; added member type query and existence functions; extended struct table line with a type field.
compiler/main/x86_code_gen/cg_x86.c Refactored error handling and cleanup logic for output file and instruction buffer.
compiler/test/testcases.c, compiler/test/typechecker/test_typechecker.c, compiler/test/typechecker/test_typechecker.h, compiler/test/typechecker/test-src/index_not_found.dg Added new typechecker test for index-not-found errors.
examples/stdlib/base/allocator/calloc_test/calloc_test.dg, examples/stdlib/base/allocator/calloc_test/calloc_test.exitcode, examples/stdlib/base/allocator/calloc_test/calloc_test.stdout Added example and expected outputs for testing calloc functionality.
examples/stdlib/base/allocator/memcpy_test/memcpy_test.dg, examples/stdlib/base/allocator/memcpy_test/memcpy_test.exitcode, examples/stdlib/base/allocator/memcpy_test/memcpy_test.stdout Added example and expected outputs for testing memcpy functionality.
examples/struct/struct_and_array/struct_and_array.dg, examples/struct/struct_and_array/struct_and_array.exitcode Added example demonstrating struct containing dynamic array of structs.
lexer/src/lexer.c Added delimiter-aware keyword matching to prevent partial matches; introduced helper functions for delimiter and keyword recognition.
parser/main/astnodes/types/Type.c, parser/main/astnodes/types/Type.h Added function for constructing a Type from a SimpleType node; updated header includes and declarations.
stdlib/base/allocator.dg Added calloc and memcpy implementations to allocator module.
tables/test/test.c Standardized zero-initialization for all local struct variables in tests.

Sequence Diagram(s)

sequenceDiagram
    participant Parser
    participant TypeChecker
    participant SymbolTable
    participant LVST
    participant FileSystem

    Parser->>TypeChecker: Parse variable/member access
    TypeChecker->>SymbolTable: Lookup variable/member (recursive for structs)
    SymbolTable-->>TypeChecker: Return type or error
    TypeChecker->>TypeChecker: Check indices/types (recursive for members)
    TypeChecker-->>Parser: Success or error

    Note over TypeChecker,LVST: On typecheck failure (with debug enabled)
    TypeChecker->>LVST: Print LVST to .lvst file
    LVST->>FileSystem: Write LVST contents
Loading
sequenceDiagram
    participant UserCode
    participant Allocator
    participant Memcpy
    participant Output

    UserCode->>Allocator: Call calloc(n)
    Allocator->>Allocator: Allocate memory and zero it
    Allocator-->>UserCode: Return pointer

    UserCode->>Memcpy: Call memcpy(dest, src, len)
    Memcpy->>Memcpy: Copy bytes from src to dest
    Memcpy-->>UserCode: Return status

    UserCode->>Output: Print test results
Loading
sequenceDiagram
    participant Lexer
    participant SourceBuffer
    participant TokenList

    Lexer->>SourceBuffer: Read next token candidate
    Lexer->>Lexer: Check for keyword with delimiter
    alt Delimiter found after keyword
        Lexer->>TokenList: Add keyword token
    else
        Lexer->>TokenList: Continue as identifier
    end
Loading

Possibly related PRs

  • string processing #164: Changes .gitignore to ignore .lvst files and modifies LVST printing functions to accept output streams and print to files, similar in approach to this PR but for LVST instead of RAT.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a81c09b and 674b103.

📒 Files selected for processing (1)
  • .circleci/config.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .circleci/config.yml

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@netlify
Copy link

netlify bot commented Apr 26, 2025

Deploy Preview for espl1000 ready!

Name Link
🔨 Latest commit 674b103
🔍 Latest deploy log https://app.netlify.com/sites/espl1000/deploys/68139d31dfbc1100084bc4b2
😎 Deploy Preview https://deploy-preview-167--espl1000.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🔭 Outside diff range comments (1)
tables/stst/stst.c (1)

171-185: ⚠️ Potential issue

Memory leak: line->type never freed

stst_free() releases name, _namespace, and line itself but omits the newly introduced deep-copied Type (and the SimpleType it owns). Add:

-       if (line != prev) {
+       if (line != prev) {
            prev = line;
+           free_type(line->type); /* or the equivalent destructor */
            free(line->name);
            free(line->_namespace);
            free(line);
        }

(make sure a matching free_type helper exists).

🧹 Nitpick comments (19)
compiler/main/x86_code_gen/cg_x86_single_function.c (1)

66-77: Improved debug output with function-specific LVST dump files.

This change replaces direct console output with function-specific LVST dump files, providing better debug information organization. The implementation correctly:

  1. Generates a filename based on the function name
  2. Handles memory allocation failures
  3. Frees allocated memory after use

Consider adding error handling in case lvst_print_filename fails, similar to how the RAT file printing is handled later in the file (lines 129-134).

 		lvst_print_filename(st->lvst, lvst_filename);
+		printf("LVST for function '%s' dumped to %s\n", current_function_name, lvst_filename);
examples/stdlib/base/allocator/calloc_test/calloc_test.dg (1)

10-14: Consider adding explicit verification for zero initialization.

While the test implicitly verifies zero initialization by printing the value, consider adding an explicit assertion to programmatically verify the initial value is zero.

 	println_uint64(p[10]);
+
+	// Verify that calloc initializes memory to zero
+	if p[10] != 0 {
+		println_string("FAIL: calloc did not zero initialize memory");
+		free(p);
+		return 1;
+	}
 
 	p[10] = 831212;
examples/stdlib/base/allocator/memcpy_test/memcpy_test.dg (3)

4-5: Memory allocation could have error handling improvements

While there is error checking for null pointers on lines 7-12, consider freeing already allocated memory when the second allocation fails.

 	[uint8] p1 = calloc(4);
 	[uint8] p2 = calloc(4);

 	if p1 == 0 {
 		return 1;
 	}
 	if p2 == 0 {
+		free(p1);
 		return 1;
 	}

18-18: Memcpy function return value is unused

The memcpy function returns a value (0) but this return value is not checked in the test. This is acceptable for testing purposes but worth noting.


2-27: Missing memory cleanup

The program doesn't free the allocated memory before exiting. While this won't cause problems in this specific test case (as the OS will reclaim memory after program exit), it's good practice to free allocated memory.

 	println_uint64(p2[3]); // 0
 
+	free(p1);
+	free(p2);
+
 	return 0;
 }
stdlib/base/allocator.dg (2)

36-41: Buffer zeroing could be optimized

The current implementation zeroes each byte individually in a loop. For large allocations, this could be inefficient.

Consider using the newly implemented memcpy function with a zero-initialized buffer for larger allocations, or implementing a more efficient bulk-zeroing approach for large memory blocks.


132-132: Consider standardizing return value

The function returns 0, which implies success, but the comment for free() on line 46 suggests that 0 indicates success for return values. Consider standardizing return values across functions or adding a comment to clarify the return value meaning.

+// @returns 0 on success
 fn memcpy([#] dest, [#] src, uint64 len) -> uint64 {
examples/struct/struct_and_array/struct_and_array.dg (2)

28-28: Potential array size calculation issue

The allocation for c.lines uses c.capacity * 8 which assumes pointers are 8 bytes. For portability, consider making this calculation clearer.

-	c.lines = calloc(c.capacity * 8);
+	c.lines = calloc(c.capacity * sizeof(*Line)); // Ideally using sizeof

18-43: Memory leaks in example

The program allocates memory for the container, the lines array, and a Line object, but never frees this memory before exiting.

While this is just an example and the OS will reclaim memory after program exit, consider adding proper cleanup to demonstrate best practices:

 	*Line line2 = c.lines[c.count];
 
+	// Free allocated memory
+	free(line);
+	free(c.lines);
+	free(c);
+
 	return line2.b;
 }
compiler/main/typechecker/util/tc_utils.h (1)

8-9: Good addition of utility function for struct name extraction

This new function will help with type checking of struct member accesses by providing a consistent way to extract the underlying struct name from a type.

Consider adding a brief comment explaining the function's purpose and return value semantics:

+// Recursively extracts the underlying struct name from a type.
+// Returns NULL if the type is not a struct or pointer to a struct.
 char* tc_get_underlying_struct_name(struct Type* t);
parser/main/astnodes/types/Type.c (1)

60-70: Good addition of SimpleType constructor

The new makeType_4 function provides a convenient way to create a Type from a SimpleType, following the same pattern as the other makeType_* functions. This makes the API more complete and consistent.

One minor note: Unlike other makeType_* functions, this one doesn't explicitly initialize type_param, array_type, or pointer_type to NULL, or is_anytype to false. While these fields are likely implicitly initialized to zero/NULL by make(Type), for consistency with other constructors, you might want to explicitly initialize them.

compiler/main/typechecker/util/tc_utils.c (1)

17-34: Consider supporting array‐of‐struct types

tc_get_underlying_struct_name bails out when the input is neither a basic_type nor a pointer_type.
A common pattern is an array of structs (t->array_type != NULL). If a user writes
myStructArr[i].field, the current helper will return NULL, causing the subsequent assert(struct_name) in callers such as tc_var_in_struct to abort the compiler although the access is perfectly valid.

@@
-       if (!t->basic_type && !t->pointer_type) {
+       if (!t->basic_type && !t->pointer_type && !t->array_type) {
                return NULL;
        }
@@
-       if (t->pointer_type) {
+       if (t->pointer_type) {
                return tc_get_underlying_struct_name(t->pointer_type->element_type);
        }
+
+       if (t->array_type) {
+               return tc_get_underlying_struct_name(t->array_type->element_type);
+       }

This keeps the helper generic and prevents spurious crashes when structs are nested inside arrays.

compiler/main/typechecker/tc_var.c (1)

79-82: Null-check line after lvst_get

Although tc_simplevar() should normally guarantee existence, being defensive costs little:

-       struct LVSTLine* line = lvst_get(tcctx->st->lvst, sv->name);
-       return tc_var_in_struct(line->type, member_access, tcctx);
+       struct LVSTLine* line = lvst_get(tcctx->st->lvst, sv->name);
+       if (!line) { return false; }
+       return tc_var_in_struct(line->type, member_access, tcctx);
tables/stst/stst.c (4)

55-66: Gracefully handle allocation failures during stst_fill

stst_line_ctor can return NULL (OOM, bad input).
If that happens, stst_add dereferences a null pointer. Add an early-out with an error message:

-       struct STSTLine* line = stst_line_ctor(st, mystruct, ns->name);
-       stst_add(stst, line);
+       struct STSTLine* line = stst_line_ctor(st, mystruct, ns->name);
+       if (!line) {
+               fprintf(stderr, "[STST] failed to construct line for struct '%s'\n",
+                       mystruct->type->struct_type->type_name);
+               continue;
+       }
+       stst_add(stst, line);

105-115: Eliminate unsafe casts in stst_member_type

stst_get_member expects char*, forcing a cast from const char* here.
Propagating const correctness up the call chain is safer and removes the cast:

-struct StructMember* stst_get_member(struct STST* stst, char* struct_name, char* member_name);
+struct StructMember* stst_get_member(struct STST* stst, const char* struct_name, const char* member_name);

and drop the casts in both stst_member_type and stst_has_member.


116-137: Iteration variable should use size_t

line->decl->count_members is unsigned; iterating with int j risks signed/unsigned mismatches and UB on large structs:

-for (int j = 0; j < line->decl->count_members; j++) {
+for (size_t j = 0; j < line->decl->count_members; j++) {

192-208: Risk of incompatible pointer types in copy_simple_type call

copy_simple_type() expects struct SimpleType*, but s->type is already a SimpleType* (not Type*). This is fine only if the field’s static type indeed is SimpleType*. Please double-check the definition of StructDecl. If it ever changes to a Type*, this call will silently compile with a mismatched type on non-pedantic builds.

Consider adding an explicit cast or a comment to clarify intent and silence compiler warnings:

/* s->type is SimpleType* */
struct SimpleType* copy = copy_simple_type(s->type);
compiler/main/typechecker/tc_simplevar.c (2)

139-141: Minor wording nit – missing apostrophe

-	error(tcctx, "cannot use indices for something thats not a local var/arg", TC_ERR_CANNOT_INDEX_INTO);
+	error(tcctx, "cannot use indices for something that's not a local var/arg", TC_ERR_CANNOT_INDEX_INTO);

62-74: Prefer const char* for struct names to clarify ownership

tc_get_underlying_struct_name() returns an internal pointer that must not be freed.
Using const char * for struct_name prevents accidental mutation/frees.

-	const char* struct_name = tc_get_underlying_struct_name(containing_type);
+	const char* struct_name = tc_get_underlying_struct_name(containing_type);

(Only type qualifier changed.)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36f87cb and c62598b.

📒 Files selected for processing (38)
  • .gitignore (1 hunks)
  • compiler/main/CMakeLists.txt (1 hunks)
  • compiler/main/avr_code_gen/cg_avr_single_function.c (1 hunks)
  • compiler/main/typechecker/_tc.h (1 hunks)
  • compiler/main/typechecker/tc_ifstmt.c (1 hunks)
  • compiler/main/typechecker/tc_method.c (2 hunks)
  • compiler/main/typechecker/tc_methodcall.c (1 hunks)
  • compiler/main/typechecker/tc_simplevar.c (1 hunks)
  • compiler/main/typechecker/tc_var.c (2 hunks)
  • compiler/main/typechecker/util/tc_utils.c (1 hunks)
  • compiler/main/typechecker/util/tc_utils.h (1 hunks)
  • compiler/main/typeinference/infer_in_context.c (2 hunks)
  • compiler/main/typeinference/typeinfer.c (2 hunks)
  • compiler/main/typeinference/typeinfer_var.c (1 hunks)
  • compiler/main/util/fill_tables.c (1 hunks)
  • compiler/main/x86_code_gen/cg_x86.c (2 hunks)
  • compiler/main/x86_code_gen/cg_x86_single_function.c (1 hunks)
  • compiler/test/testcases.c (1 hunks)
  • compiler/test/typechecker/test-src/index_not_found.dg (1 hunks)
  • compiler/test/typechecker/test_typechecker.c (1 hunks)
  • compiler/test/typechecker/test_typechecker.h (1 hunks)
  • examples/stdlib/base/allocator/calloc_test/calloc_test.dg (1 hunks)
  • examples/stdlib/base/allocator/calloc_test/calloc_test.exitcode (1 hunks)
  • examples/stdlib/base/allocator/calloc_test/calloc_test.stdout (1 hunks)
  • examples/stdlib/base/allocator/memcpy_test/memcpy_test.dg (1 hunks)
  • examples/stdlib/base/allocator/memcpy_test/memcpy_test.exitcode (1 hunks)
  • examples/stdlib/base/allocator/memcpy_test/memcpy_test.stdout (1 hunks)
  • examples/struct/struct_and_array/struct_and_array.dg (1 hunks)
  • examples/struct/struct_and_array/struct_and_array.exitcode (1 hunks)
  • lexer/src/lexer.c (4 hunks)
  • parser/main/astnodes/types/Type.c (1 hunks)
  • parser/main/astnodes/types/Type.h (1 hunks)
  • stdlib/base/allocator.dg (2 hunks)
  • tables/lvst/lvst.c (5 hunks)
  • tables/lvst/lvst.h (1 hunks)
  • tables/stst/stst.c (4 hunks)
  • tables/stst/stst.h (2 hunks)
  • tables/test/test.c (10 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (14)
compiler/main/avr_code_gen/cg_avr_single_function.c (2)
tables/lvst/lvst.h (1)
  • lvst_print (54-54)
tables/lvst/lvst.c (1)
  • lvst_print (211-228)
compiler/main/typechecker/tc_ifstmt.c (1)
ast/util/str_ast.c (1)
  • str_type (288-304)
compiler/main/util/fill_tables.c (4)
tables/stst/stst.h (1)
  • stst_fill (32-32)
tables/stst/stst.c (1)
  • stst_fill (55-67)
compiler/main/util/ctx.h (1)
  • ctx_tables (17-17)
compiler/main/util/ctx.c (1)
  • ctx_tables (66-68)
compiler/test/typechecker/test_typechecker.c (2)
compiler/test/typechecker/test_typechecker.h (1)
  • test_typecheck_index_not_found (16-16)
compiler/test/typechecker/test_typechecker_util.c (1)
  • typecheck_file (17-55)
compiler/test/testcases.c (2)
compiler/test/typechecker/test_typechecker.h (1)
  • test_typecheck_index_not_found (16-16)
compiler/test/typechecker/test_typechecker.c (1)
  • test_typecheck_index_not_found (204-216)
compiler/main/typechecker/tc_methodcall.c (2)
compiler/main/typeinference/typeinfer_expr.c (1)
  • infer_type_expr (34-92)
ast/util/str_ast.c (1)
  • str_expr (479-517)
compiler/main/typeinference/infer_in_context.c (2)
compiler/main/typeinference/typeinfer.c (1)
  • unwrap_indices (15-31)
compiler/main/typeinference/typeinfer.h (1)
  • unwrap_indices (45-45)
parser/main/astnodes/types/Type.c (3)
parser/main/astnodes/types/Type.h (1)
  • makeType_4 (13-13)
parser/main/astnodes/types/BasicType.h (1)
  • makeBasicTypeSimple (10-10)
parser/main/astnodes/types/BasicType.c (1)
  • makeBasicTypeSimple (17-28)
compiler/main/typechecker/util/tc_utils.c (1)
compiler/main/typechecker/util/tc_utils.h (1)
  • tc_get_underlying_struct_name (8-8)
compiler/main/x86_code_gen/cg_x86_single_function.c (5)
compiler/main/cli/flags/flags.c (1)
  • flags_debug (418-420)
compiler/main/cli/flags/flags.h (1)
  • flags_debug (23-23)
compiler/main/util/ctx.c (1)
  • ctx_flags (62-64)
tables/lvst/lvst.h (1)
  • lvst_print_filename (56-56)
tables/lvst/lvst.c (1)
  • lvst_print_filename (198-209)
lexer/src/lexer.c (2)
lexer/src/driver.h (1)
  • out_nostr (8-8)
lexer/src/driver.c (1)
  • out_nostr (17-19)
tables/stst/stst.c (6)
tables/stst/stst.h (6)
  • stst_fill (32-32)
  • stst_line_ctor (30-30)
  • stst_member_type (44-44)
  • stst_get_member (40-40)
  • stst_has_member (47-47)
  • stst_get (37-37)
ast/util/copy_ast.c (1)
  • copy_simple_type (222-233)
parser/main/astnodes/types/Type.c (1)
  • makeType_4 (60-70)
parser/main/astnodes/types/Type.h (1)
  • makeType_4 (13-13)
tables/symtable/symtable.c (1)
  • st_register_inferred_type (59-72)
tables/symtable/symtable.h (1)
  • st_register_inferred_type (45-45)
compiler/main/typechecker/tc_simplevar.c (5)
compiler/main/typechecker/_tc.h (3)
  • tc_expr (36-36)
  • tc_simplevar_in_struct (44-44)
  • tc_simplevar (43-43)
compiler/main/typechecker/util/tc_utils.c (3)
  • is_integer_type (49-54)
  • tc_get_underlying_struct_name (11-37)
  • max_indices_allowed (75-82)
compiler/main/typechecker/util/tc_utils.h (3)
  • is_integer_type (12-12)
  • tc_get_underlying_struct_name (8-8)
  • max_indices_allowed (20-20)
tables/stst/stst.h (2)
  • stst_member_type (44-44)
  • stst_has_member (47-47)
tables/stst/stst.c (2)
  • stst_member_type (105-114)
  • stst_has_member (116-137)
tables/lvst/lvst.c (2)
tables/lvst/lvst.h (3)
  • lvst_print (54-54)
  • lvst_at (42-42)
  • lvst_print_filename (56-56)
ast/util/str_ast.c (1)
  • str_type (288-304)
🔇 Additional comments (54)
examples/struct/struct_and_array/struct_and_array.exitcode (1)

1-2: Approve addition of the exit code file.

The file correctly lists the expected exit code 73 for the struct_and_array example and ends with a single newline, adhering to the project’s convention for .exitcode files.

.gitignore (1)

20-21: Adding LVST dump files to .gitignore is appropriate.

The addition of *.lvst pattern to ignore live variable state table dump files is consistent with the existing pattern for register allocation table dumps (.rat). This ensures debug artifacts aren't accidentally committed to version control.

compiler/main/avr_code_gen/cg_avr_single_function.c (1)

32-32: Updated LVST printing to use explicit output stream parameter.

This change aligns with the modified lvst_print function signature that now requires an explicit FILE* parameter. Using stderr is appropriate for debug output.

tables/lvst/lvst.h (2)

54-54: Function signature updated to support flexible output destinations.

The lvst_print function now accepts a FILE* parameter, allowing output to be directed to any file stream instead of a fixed destination. This improves flexibility for debugging purposes.


56-56: New function added for file-based LVST output.

The lvst_print_filename function provides a convenient way to print LVST data to a named file. This complements the updated lvst_print function and supports the new debug output approach in the compiler.

compiler/main/x86_code_gen/cg_x86.c (2)

79-79: Good error handling improvement.

The change to jump to the exit_ibu label on failure provides a cleaner error path that ensures proper resource cleanup, avoiding potential memory leaks.


101-101: Well-placed cleanup label.

The exit_ibu label placement is appropriate, serving as a single exit point for instruction buffer cleanup. This consolidation simplifies the resource management flow.

compiler/main/typechecker/tc_method.c (1)

1-1: Necessary addition for asprintf.

The _GNU_SOURCE definition enables the GNU extension features needed for asprintf used later in the function.

tables/lvst/lvst.c (7)

137-137: Improved error reporting direction.

Excellent change to direct error output explicitly to stderr instead of the default stdout. This follows best practices for separating normal output from error messages.

Also applies to: 144-144, 183-183


198-209: Well-implemented file output function with proper error handling.

The new lvst_print_filename function provides a convenient way to output LVST data to a file, with appropriate error handling for file opening failures.


211-211: Good API enhancement for flexible output.

Modifying the lvst_print function to accept a FILE* parameter enables more flexible output redirection, supporting the debug enhancements in this PR.


217-226: Consistent change to use fprintf.

The change from printf to fprintf with the provided file pointer ensures consistent output redirection throughout the function.


321-326: Good architecture-dependent sizing for anytype.

The new lvst_sizeof_anytype function correctly handles architecture differences, returning 8 bytes for x86 and 2 bytes for other architectures.


347-349: Complete type size handling for anytype.

The addition of anytype handling in lvst_sizeof_type ensures comprehensive type size calculations for all possible type variants.


351-351: Good defensive programming with assertion.

Adding an assertion to check that the calculated size is positive prevents potential issues with invalid type sizes, enhancing code robustness.

lexer/src/lexer.c (4)

15-17: Well-designed helper function for delimiter detection.

The is_delim function cleanly encapsulates the logic for identifying delimiters. This abstraction improves code readability and will make future modifications to delimiter rules easier to implement.


35-51: Good enhancement to prevent partial keyword matches.

The h2out_keyword function implements a more robust keyword detection mechanism by ensuring keywords are followed by delimiters. This prevents scenarios where identifiers that start with keywords (like "invoice" containing "in") would be incorrectly tokenized.


53-54: Consistent macro design for keyword handling.

The H2OUT_KEYWORD macro follows the established pattern of other token handling macros in the file, making it easy to understand and use.


67-67: Good refactoring to use the new delimiter function.

Using is_delim() here improves consistency and maintainability by centralizing the delimiter logic.

examples/stdlib/base/allocator/memcpy_test/memcpy_test.exitcode (1)

1-1: LGTM - Exit code indicates successful test execution.

The exit code of 0 is appropriate for indicating successful test completion.

examples/stdlib/base/allocator/calloc_test/calloc_test.stdout (1)

1-2: LGTM - Output correctly shows zeroed memory and then updated value.

The output shows:

  1. The initial value at index 10 is 0, confirming that calloc properly zeroes memory
  2. After assignment, the value correctly changes to 831212

This validates the expected behavior of the calloc function.

examples/stdlib/base/allocator/calloc_test/calloc_test.exitcode (1)

1-1: LGTM - Exit code indicates successful test execution.

The exit code of 0 is appropriate for indicating successful test completion.

examples/stdlib/base/allocator/memcpy_test/memcpy_test.stdout (1)

1-4: LGTM - Output validates correct memcpy behavior.

The output shows:

  1. The three bytes (1, 2, 3) were successfully copied from source to destination
  2. The fourth byte remains 0, confirming that memcpy only copies the specified number of bytes

This appropriately validates the expected behavior of the memcpy function.

examples/stdlib/base/allocator/calloc_test/calloc_test.dg (2)

4-8: LGTM - Good error handling for allocation failure.

The code properly checks if the allocation was successful before proceeding, which is a good practice for preventing null pointer dereferences.


16-18: LGTM - Proper memory management and return value.

The code correctly frees allocated memory before returning, preventing memory leaks. The return value of 0 appropriately indicates successful test completion.

examples/stdlib/base/allocator/memcpy_test/memcpy_test.dg (2)

1-27: Overall implementation looks good, with minor observations

The test program correctly validates the functionality of the memcpy function by:

  1. Allocating two memory blocks using calloc
  2. Initializing values in the first block
  3. Copying values from first to second block
  4. Verifying the copied values

24-24: Good test coverage includes boundary validation

Good practice to check that the byte beyond the copy length remains unmodified (still 0 from calloc).

stdlib/base/allocator.dg (1)

28-44: Calloc implementation looks correct

The implementation follows the expected pattern for calloc:

  1. Allocates memory using malloc
  2. Zeroes out the allocated memory
  3. Returns the allocated pointer or 0 on failure
examples/struct/struct_and_array/struct_and_array.dg (1)

4-16: Good struct definitions with clear data model

The Container and Line structs are well-defined and model a common pattern for dynamic arrays of structures.

compiler/test/typechecker/test_typechecker.h (1)

16-16: Good addition of a new test case

The new test function declaration appropriately follows the existing naming convention pattern and will help validate the typechecker's behavior when an index variable is not found.

tables/test/test.c (4)

30-33: Good practice: Using zero initialization for structs

Zero initialization ensures all struct fields have known values, preventing undefined behavior. This improves code quality without changing test behavior.


68-75: Consistent zero initialization applied throughout

Continuing the pattern of properly zero-initializing all structs. This is a good practice that makes tests more reliable by ensuring fields start with known values.


102-118: Zero initialization applied to struct declarations

Properly initializing all struct variables with {0} across this test function. Good practice that prevents potential undefined behavior.


128-131: Consistent initialization pattern applied throughout remaining tests

The zero initialization pattern has been consistently applied to all struct declarations throughout the test file. This improves code robustness and predictability.

Also applies to: 180-183, 217-220, 251-254, 281-284, 319-322, 354-357

compiler/main/typeinference/typeinfer.c (2)

3-3: Added required header for assert

Good addition of the assert.h header to support the new assertion.


17-18: Defensive programming: Added null pointer check

Adding this assertion is a good defensive programming practice that prevents potential null pointer dereferences later in the function and documents the expected precondition.

compiler/main/util/fill_tables.c (1)

29-29: Function signature adjustment aligns with broader changes.

The call to stst_fill was updated to pass the entire symbol table context instead of just the struct symbol table. This change aligns with the updated function signature in tables/stst/stst.c where stst_fill now accepts a pointer to the broader symbol table (struct ST* st) rather than just the struct symbol table.

compiler/main/CMakeLists.txt (1)

116-116: Addition of new type checker module.

The addition of tc_simplevar.c to the build configuration properly integrates the new type checking logic for simple variables into the compiler. This file implements detailed validation for variables, including index validation and struct member access checks.

compiler/main/typeinference/typeinfer_var.c (1)

16-18: Null check prevents potential segmentation fault.

This defensive programming check prevents the function from attempting to dereference a null pointer if infer_type_simplevar returns NULL. The early return ensures safer type inference when member types are missing or invalid.

compiler/test/testcases.c (1)

288-288: Added test case for index variable validation.

This new test function checks that the compiler correctly reports an error when an index variable is not found during type checking. This is a good addition to ensure proper error handling for undefined index variables.

compiler/main/typechecker/tc_ifstmt.c (3)

32-32: Good addition of string representation for type

Getting a string representation of the type for error reporting is a good enhancement to provide more context in error messages.


35-35: Improved error message with proper type information

The error message now includes both the expression and its actual type, which makes it much more informative and user-friendly compared to the previous implementation.


38-38: Good memory management - freeing allocated string

Properly freeing the type string after use prevents memory leaks. This is especially important since str_type() returns a dynamically allocated string.

compiler/test/typechecker/test_typechecker.c (1)

204-216: Well-structured test for index variable not found

This test follows the established pattern in the file and verifies the typechecker's ability to detect undeclared index variables. The test case checks both that an error is produced and that it's specifically of type TC_ERR_VAR_NOT_FOUND.

compiler/test/typechecker/test-src/index_not_found.dg (1)

1-6: Good test case for undeclared index variable

This test case effectively tests the scenario where an index variable is not declared before being used. Line 5 attempts to access an array with an undeclared variable index, which should trigger the expected TC_ERR_VAR_NOT_FOUND error that's being tested in the new test function.

compiler/main/typechecker/_tc.h (1)

44-44: Function declaration looks good and enhances struct member type checking.

The addition of tc_simplevar_in_struct function is a logical extension to the typechecker interface, enabling verification of simple variables within a struct context. This complements the existing tc_simplevar function and helps build more robust struct member access validation.

parser/main/astnodes/types/Type.h (2)

5-6: Good move to include ast_declare.h instead of forward declarations.

This change improves code organization by properly including the header with necessary declarations rather than relying on forward declarations.


13-13: The new makeType_4 function fits well with existing pattern.

The addition of makeType_4 follows the consistent naming pattern of existing type construction functions and extends the type system to handle SimpleType nodes, which is needed for the enhanced typechecking capabilities being added.

compiler/main/typeinference/infer_in_context.c (3)

53-55: Good defensive programming with null check.

Adding this null check prevents potential null pointer dereferences if the member type cannot be found in the symbol table. This improves robustness and error handling in the type inference system.


57-57: Improved handling of array indices.

Using unwrap_indices to handle multiple array indices in a member variable is a good approach. This ensures proper type unwrapping based on the number of indices in the simple variable.


66-69: Another good null check for robust error handling.

This additional null check ensures that if infer_in_context_once fails to infer a type, the function will return early instead of attempting to use the null result, preventing potential crashes.

compiler/main/typechecker/tc_methodcall.c (2)

121-121: Good context setting for error reporting.

Setting the current line number in the typechecking context ensures that any error messages will reference the correct line where the method call occurs, improving the quality of diagnostic output.


125-133: Excellent error handling improvement.

This change significantly improves error handling by:

  1. Checking if type inference failed (returning NULL)
  2. Generating a helpful error message with the specific expression snippet
  3. Returning early to prevent null pointer dereferences later in the function

This is a much better approach than the previous code which would likely crash on type inference failure.

compiler/main/typechecker/tc_var.c (1)

50-60:

Details

❌ Incorrect review comment

TODO indicates incomplete type handling for indexed members

The comment says “apply all indices and then get that type”.
Currently structType in ma is the original containing_type; if member->simple_var has array indices, they are ignored when inferring type_after_indices, which can lead to wrong recursion results or crashes.

Please resolve the TODO by feeding the already-checked result of tc_simplevar_in_struct() (or a dedicated helper) into infer_in_context_once.


🏁 Script executed:

#!/bin/bash
# Locate the file tc_var.c and display lines 1-200
file_path=$(fd --full-path "tc_var.c")
echo "File: $file_path"
sed -n '1,200p' "$file_path"

Length of output: 2149


#!/bin/bash
# Search for the definition of MemberAccess
rg -n "struct MemberAccess" -n .

# Search for infer_in_context_once usage and definition
rg -n "infer_in_context_once" -n .

# Search for tc_simplevar_in_struct definition
rg -n "tc_simplevar_in_struct" -n .

🏁 Script executed:

#!/bin/bash
# Locate the definition of MemberAccess
rg -n "struct MemberAccess" -n .

# Locate the implementation of infer_in_context_once
rg -n "infer_in_context_once" -n .

# Locate the implementation of tc_simplevar_in_struct
rg -n "tc_simplevar_in_struct" -n .

Length of output: 1943


#!/bin/bash
echo "===== infer_in_context_once (lines 1-200) ====="
sed -n '1,200p' compiler/main/typeinference/infer_in_context.c
echo "===== tc_simplevar_in_struct (lines 1-200) ====="
sed -n '1,200p' compiler/main/typechecker/tc_simplevar.c

🏁 Script executed:

#!/bin/bash
# Show MemberAccess definition
echo "===== typeinfer.h (MemberAccess) ====="
sed -n '1,200p' compiler/main/typeinference/typeinfer.h

# Show infer_in_context_once implementation
echo "===== infer_in_context_once ====="
sed -n '1,200p' compiler/main/typeinference/infer_in_context.c

# Show typeinfer_var.c context for MemberAccess usage
echo "===== typeinfer_var.c ====="
sed -n '1,200p' compiler/main/typeinference/typeinfer_var.c

Length of output: 4529


Remove stale TODO: indices handling is already implemented
The helper infer_in_context_once unpacks all array indices via unwrap_indices(…, member->simple_var->count_indices), so the //TODO: apply all indices and then get that type and the unused //struct STSTLine* line … are now obsolete. Cleaning up this dead code will improve readability.

• File: compiler/main/typechecker/tc_var.c
• Around lines 57–60

Original snippet:

//struct STSTLine* line = stst_get(stst, struct_name);
//TODO: apply all indices and then get that type

struct MemberAccess ma = {
    .structType = containing_type,
    .member = member,
};

struct Type* type_after_indices = infer_in_context_once(tcctx->st, &ma);

Suggested change:

- //struct STSTLine* line = stst_get(stst, struct_name);
- //TODO: apply all indices and then get that type
  struct MemberAccess ma = {
      .structType = containing_type,
      .member = member,
  };

  struct Type* type_after_indices = infer_in_context_once(tcctx->st, &ma);

Likely an incorrect or invalid review comment.

Fixes for issues found in code review.
@pointbazaar pointbazaar force-pushed the csv-parser branch 2 times, most recently from b1069b5 to a81c09b Compare May 1, 2025 15:51
@pointbazaar pointbazaar merged commit b020e81 into master May 1, 2025
15 checks passed
@pointbazaar pointbazaar deleted the csv-parser branch May 1, 2025 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant