-
Notifications
You must be signed in to change notification settings - Fork 17
Issue 848 replace simple regression tests #903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: SharedDevelopment
Are you sure you want to change the base?
Issue 848 replace simple regression tests #903
Conversation
…8-replace-simple-regression-tests
…ices for regression testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR replaces the simple diff command used in regression testing with a custom C++ matrix comparison tool that provides more intelligent comparison of XML matrix files with floating-point tolerance.
Changes:
- Implemented a new C++ tool (
compare_matrices.cpp) that parses XML matrix files and compares them with configurable floating-point tolerance - Updated GitHub Actions workflow to compile and use the new comparison tool instead of
diff - Applied formatting improvements to the workflow YAML file
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| Testing/RegressionTesting/compare_matrices.cpp | Complete rewrite of matrix comparison tool with XML parsing, metadata validation, and floating-point tolerance for value comparisons |
| .github/workflows/tests.yml | Added build step for compare_matrices and replaced all diff commands with the new tool; applied formatting fixes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } else { | ||
| // Check individual values | ||
| for (size_t i = 0; i < good.values.size(); ++i) { | ||
| if ((std::abs(good.values[i] - test.values[i])) > EPSILON) { |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comparison uses absolute tolerance for all values. For very large or very small values, an absolute tolerance may not be appropriate. Consider using a relative tolerance comparison: std::abs(good.values[i] - test.values[i]) / std::max(std::abs(good.values[i]), std::abs(test.values[i])) > relative_epsilon for non-zero values, falling back to absolute tolerance for values near zero.
| // Check metadata | ||
| if (good.type != test.type) { | ||
| std::cout << "Matrix '" << good.name << "': type mismatch (good: " << good.type | ||
| << ", test: " << test.type << ")\n"; | ||
| has_mismatch = true; | ||
| } | ||
| if (good.rows != test.rows) { | ||
| std::cout << "Matrix '" << good.name << "': rows mismatch (good: " << good.rows | ||
| << ", test: " << test.rows << ")\n"; | ||
| has_mismatch = true; | ||
| } | ||
| if (good.columns != test.columns) { | ||
| std::cout << "Matrix '" << good.name << "': columns mismatch (good: " << good.columns | ||
| << ", test: " << test.columns << ")\n"; | ||
| has_mismatch = true; | ||
| } | ||
| if (std::abs(good.multiplier - test.multiplier) > EPSILON) { | ||
| std::cout << "Matrix '" << good.name << "': multiplier mismatch (good: " << good.multiplier | ||
| << ", test: " << test.multiplier << ")\n"; | ||
| has_mismatch = true; | ||
| } | ||
|
|
||
| // Check data size | ||
| if (good.values.size() != test.values.size()) { | ||
| std::cout << "Matrix '" << good.name | ||
| << "': value count mismatch (good: " << good.values.size() | ||
| << ", test: " << test.values.size() << ")\n"; | ||
| has_mismatch = true; | ||
| } else { | ||
| // Check individual values | ||
| for (size_t i = 0; i < good.values.size(); ++i) { | ||
| if ((std::abs(good.values[i] - test.values[i])) > EPSILON) { | ||
| std::cout << "Matrix '" << good.name << "': value mismatch at index " << i | ||
| << " (good: " << good.values[i] << ", test: " << test.values[i] << ")\n"; | ||
| has_mismatch = true; | ||
| break; // Only report first mismatch per matrix | ||
| } | ||
| } | ||
| } |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When metadata mismatches occur (type, rows, columns, multiplier), the function still continues to compare values even though the matrices are fundamentally incompatible. If the dimensions don't match, comparing values could lead to out-of-bounds access or misleading error messages. Consider returning early from the function after detecting metadata mismatches that make value comparison meaningless.
| run: | | ||
| cd ../Testing/RegressionTesting | ||
| g++ -std=c++17 -O3 -Wall -Wextra compare_matrices.cpp -o compare_matrices | ||
| cd ../../build | ||
| # simulator runs | ||
| - id: tt | ||
| name: run test-tiny |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hardcoded path navigation cd ../Testing/RegressionTesting and back cd ../../build assumes a specific directory structure. If the workflow's working directory changes or the repository structure is modified, this will break. Consider using absolute paths with $GITHUB_WORKSPACE or compiling from the build directory with appropriate path specifications.
| run: | | |
| cd ../Testing/RegressionTesting | |
| g++ -std=c++17 -O3 -Wall -Wextra compare_matrices.cpp -o compare_matrices | |
| cd ../../build | |
| # simulator runs | |
| - id: tt | |
| name: run test-tiny | |
| working-directory: ${{ github.workspace }}/Testing/RegressionTesting | |
| run: g++ -std=c++17 -O3 -Wall -Wextra compare_matrices.cpp -o compare_matrices | |
| # simulator runs | |
| - id: tt | |
| name: run test-tiny | |
| - id: tt | |
| name: run test-tiny |
|
|
||
| while (std::getline(file, line)) { | ||
| trim_inplace(line); | ||
| if (line.find("<Matrix") == 0) { |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parsing assumes XML elements are on separate lines and uses simple string matching (line.find("<Matrix") == 0). This approach is fragile and won't handle:
- XML with attributes in different orders
- Extra whitespace before the opening tag
- Multi-line opening tags
- Comments or other valid XML constructs
Consider using a proper XML parser library or at minimum making the parsing more robust to handle common XML variations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to do this, as we control the XML output formatting from the simulator (also, we don't want to add dependencies to this testing program). But, it would be good to include this information as a comment in the code, so others can know its limitations and our reasoning.
| size_t end = input.find_last_not_of(" \t\r\n"); | ||
| if (start > 0) | ||
| input.erase(0, start); | ||
| input.erase(end - start + 1); |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calculation for the erase operation is incorrect. After erasing the leading characters, the indices change. The correct calculation should be: input.erase(end + 1) instead of input.erase(end - start + 1).
For example, if start=2, end=5, and we erase the first 2 characters, the new end position becomes 3, but end - start + 1 would be 4, causing incorrect trimming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check this.
.github/workflows/tests.yml
Outdated
| if: always() && steps.build.conclusion == 'success' | ||
| run: ./cgraphitti -c ../Testing/RegressionTesting/configfiles/test-small-911.xml | ||
| - name: verify test-small-911 | ||
| if: always() && steps.tt.conclusion == 'success' |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The verification step for test-small-911 checks steps.tt.conclusion (test-tiny's status) instead of steps.ts911.conclusion. This means the verification will run based on the wrong test's success status, potentially skipping verification when test-small-911 fails or running when it shouldn't.
| if: always() && steps.tt.conclusion == 'success' | |
| if: always() && steps.ts911.conclusion == 'success' |
| } else if (field == "columns") { | ||
| std::from_chars(value.data(), value.data() + value.size(), current.columns); | ||
| } else if (field == "multiplier") { | ||
| std::from_chars(value.data(), value.data() + value.size(), current.multiplier); |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using std::from_chars for parsing doubles can fail silently in certain edge cases. When parsing the multiplier field, if the conversion fails (ec != std::errc()), the value remains at its default of 1.0 without any error reporting. This could mask issues where the XML contains invalid multiplier values. Consider adding error handling or validation for critical fields like multiplier.
| std::from_chars(value.data(), value.data() + value.size(), current.multiplier); | |
| double parsed_multiplier = current.multiplier; | |
| const char *begin = value.data(); | |
| const char *end = value.data() + value.size(); | |
| auto result = std::from_chars(begin, end, parsed_multiplier); | |
| if (result.ec == std::errc() && result.ptr == end) { | |
| current.multiplier = parsed_multiplier; | |
| } else { | |
| std::cerr << "Warning: invalid multiplier value \"" << std::string(value) | |
| << "\"; using default multiplier " << current.multiplier << '\n'; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a sensible suggestion to consider.
| //Anonymous namespace for static/re-usable methods | ||
| namespace { | ||
| // Tolerance for floating-point comparisons | ||
| constexpr double EPSILON = 1e-9; |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tolerance EPSILON (1e-9) is very strict for floating-point comparisons. Depending on the computational methods used in the simulator and the precision of XML serialization/deserialization, this might cause false positives where matrices are functionally equivalent but differ slightly due to floating-point rounding. Consider whether a slightly larger tolerance or relative tolerance would be more appropriate for this use case.
| constexpr double EPSILON = 1e-9; | |
| constexpr double EPSILON = 1e-6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comments for this review, including the copilot suggestions they reference (apparently, though I wrote these as responses to copilot comments, they don't link to those, so you'll need to scroll to the points in the code to see the copilot suggestions).
|
|
||
| while (std::getline(file, line)) { | ||
| trim_inplace(line); | ||
| if (line.find("<Matrix") == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to do this, as we control the XML output formatting from the simulator (also, we don't want to add dependencies to this testing program). But, it would be good to include this information as a comment in the code, so others can know its limitations and our reasoning.
| } else if (field == "columns") { | ||
| std::from_chars(value.data(), value.data() + value.size(), current.columns); | ||
| } else if (field == "multiplier") { | ||
| std::from_chars(value.data(), value.data() + value.size(), current.multiplier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a sensible suggestion to consider.
| size_t end = input.find_last_not_of(" \t\r\n"); | ||
| if (start > 0) | ||
| input.erase(0, start); | ||
| input.erase(end - start + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check this.
|
Do subsequent tests run if an earlier test fails? They used to, but it seems like this action has that fixed. Please confer with @stardriftfx on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Helper for floating-point comparison with relative tolerance | ||
| bool approximately_equal(double a, double b) | ||
| { | ||
| double diff = std::abs(a - b); | ||
| // Use absolute tolerance for values near zero | ||
| if (std::abs(a) < EPSILON && std::abs(b) < EPSILON) { | ||
| return diff <= EPSILON; | ||
| } | ||
| // Use relative tolerance for larger values | ||
| double max_val = std::max(std::abs(a), std::abs(b)); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an edge case in the approximately_equal function when one value is near zero (but above EPSILON threshold) and the other is exactly zero. For example, comparing 0 with 2e-6 will fail because abs(b) = 2e-6 is not less than EPSILON (1e-6), triggering the relative tolerance check where diff/max_val = 1, which exceeds RELATIVE_EPSILON. Consider using an OR condition: return true if EITHER absolute difference is within EPSILON OR relative difference is within RELATIVE_EPSILON, rather than the current if-else approach.
| // Helper for floating-point comparison with relative tolerance | |
| bool approximately_equal(double a, double b) | |
| { | |
| double diff = std::abs(a - b); | |
| // Use absolute tolerance for values near zero | |
| if (std::abs(a) < EPSILON && std::abs(b) < EPSILON) { | |
| return diff <= EPSILON; | |
| } | |
| // Use relative tolerance for larger values | |
| double max_val = std::max(std::abs(a), std::abs(b)); | |
| // Helper for floating-point comparison with absolute and relative tolerance | |
| bool approximately_equal(double a, double b) | |
| { | |
| double diff = std::abs(a - b); | |
| // First, allow a small absolute difference regardless of magnitude | |
| if (diff <= EPSILON) { | |
| return true; | |
| } | |
| // Otherwise, fall back to a relative difference check for larger values | |
| double max_val = std::max(std::abs(a), std::abs(b)); | |
| if (max_val == 0.0) { | |
| // Both values are effectively zero (but diff > EPSILON would be unexpected) | |
| return true; | |
| } |
| trim_inplace(line); | ||
| if (line.find("<Matrix") == 0) { | ||
| current = MatrixData {}; | ||
| MatrixData::parse_all_fields(line, current); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of parse_all_fields (which indicates whether parsing succeeded) is not checked. If parsing fails, the code continues silently with potentially incomplete or incorrect MatrixData. Consider checking the return value and either logging a warning with the matrix name or handling the error appropriately.
| MatrixData::parse_all_fields(line, current); | |
| bool parsed_ok = MatrixData::parse_all_fields(line, current); | |
| if (!parsed_ok) { | |
| std::cerr << "Warning: failed to parse Matrix attributes in file '" << filename | |
| << "' for line: " << line << "\n"; | |
| } |
| @@ -1,53 +1,232 @@ | |||
| #include <algorithm> | |||
| #include <charconv> | |||
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
|
|
||
| // Parse matrices from XML file | ||
| // Uses simple string matching and assumes XML elements are on seperate lines. |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "seperate" should be "separate"
| // Uses simple string matching and assumes XML elements are on seperate lines. | |
| // Uses simple string matching and assumes XML elements are on separate lines. |
| // Check if parsing succeeded | ||
| if (good.empty() && test.empty()) { | ||
| std::cerr << "Error: Both files are empty or failed to parse.\n"; | ||
| return 1; |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for empty files may incorrectly fail if one file is empty but valid (e.g., a legitimate empty test output) while the other is not. Consider checking each file independently and providing more specific error messages. The current logic uses && which only returns error when BOTH are empty, but a single empty file might also indicate a problem that should be caught.
| // Check if parsing succeeded | |
| if (good.empty() && test.empty()) { | |
| std::cerr << "Error: Both files are empty or failed to parse.\n"; | |
| return 1; | |
| // Check if parsing succeeded for each file | |
| if (good.empty() && test.empty()) { | |
| std::cerr << "Error: Both files are empty or failed to parse.\n"; | |
| return 1; | |
| } else if (good.empty()) { | |
| std::cerr << "Error: Reference file '" << argv[1] | |
| << "' is empty or failed to parse.\n"; | |
| return 1; | |
| } else if (test.empty()) { | |
| std::cerr << "Error: Test file '" << argv[2] | |
| << "' is empty or failed to parse.\n"; | |
| return 1; |
| } else { | ||
| ++ptr; | ||
| } |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When parsing a value fails (line 163), the code silently skips one character and continues. This could lead to incorrect parsing if the data contains unexpected characters or formats. Consider logging a warning with the problematic line content and position, or keeping track of parsing failures to report at the end. Silent failures during value parsing can lead to incorrect comparison results without clear indication of what went wrong.
stiber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kyle, please respond to and/or resolve all comments in "Conversation" before requesting my review. Otherwise, I have no way of knowing if any of the matters raised have been addressed. Do the same for all copilot comments; many are at least worth considering.
Closes #848
Description
This PR adds a C++ file that compares two matrices instead of using the previous method of calling diff.
It also changes the tests.yml file to use the new compare_matrices testing method
Checklist (Mandatory for new features)
Testing (Mandatory for all changes)
test-medium-connected.xmlPassedtest-large-long.xmlPassed