Fix statically linked library functions being filtered out#49
Fix statically linked library functions being filtered out#49mattgodbolt wants to merge 3 commits into
Conversation
## Summary - Fixes issue where statically linked library functions like `__divdc3` were incorrectly filtered out even when directly called by user code - Implements a smart two-pass solution that keeps referenced functions while still filtering out unused library code ## Changes - **New method**: `collectReferencedFunctions()` scans assembly to identify function calls from user code - **Enhanced filtering**: Functions are now kept if they're either user functions OR referenced by user functions - **Comprehensive test case**: Added `ce-bug-1648.asm` test demonstrating `__divdc3` usage and proper filtering ## Root Cause The `binaryIgnoreFunction` regex filtered out all functions starting with `__`, including `__divdc3`, even when they were statically linked and called from user code. This created a poor user experience where important library function implementations were hidden. ## Solution Two-pass approach: 1. **Pass 1**: Scan all assembly lines to collect functions referenced from user code 2. **Pass 2**: Apply enhanced filtering logic that keeps both user functions and referenced library functions ## Impact - **Positive**: Users can now see implementations of library functions they're actually calling - **Minimal**: Unreferenced library functions are still filtered out to avoid output bloat - **Backward compatible**: No change in behavior for existing use cases Based on compiler-explorer/compiler-explorer#7776 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Doesn't look bad to me; keeps in the spirit of the existing code (Which I don't know that well). It's not quite how I'd do it myself but it seems to work (and again keeps in the existing style). |
There was a problem hiding this comment.
Pull Request Overview
This PR implements a two-pass parsing strategy to ensure statically linked library functions called by user code aren’t filtered out.
- Adds
collectReferencedFunctions()to scan for user-referenced symbols before filtering - Updates
shouldIgnoreFunction()to preserve any functions found in the first pass - Introduces
ce-bug-1648.asmto validate that__divdc3is retained while other__*symbols remain filtered
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/objdump/parser.hpp | Added referenced_functions member and declared collectReferencedFunctions() |
| src/objdump/parser.cpp | Implemented two-pass logic: collect references, reset stream, override filtering |
| resources/ce-bug-1648.asm | New assembly test case exercising retention of __divdc3 |
Comments suppressed due to low confidence (1)
CLAUDE.md:1
- [nitpick] The file name
CLAUDE.mdis ambiguous; consider renaming it to something more descriptive likeCODE_GUIDELINES.mdorCONTRIBUTING.mdto clarify its purpose.
# CLAUDE.md
| std::string current_function; | ||
| bool in_user_function = false; | ||
|
|
||
| while (std::getline(in, line)) |
There was a problem hiding this comment.
I'm not a big fan of going over the istream twice, that might negate any benefit we get from using this parser. I would prefer adding this to the eol() function add the found functions to a list that can filtered with at the end before exporting to json.
There was a problem hiding this comment.
Ah but that might produce issues if the dump is larger than 5000 lines...
There was a problem hiding this comment.
Maybe this is the only viable solution... but I don't like it
There was a problem hiding this comment.
I thought about this when I did the TS version. But we need to do two passes, as having found something is "important" changes how we do all the removal later on. I don't think there's a way to do it in a single pass? At least in the typescript, the filtering happens as we go; so we can't do it at the end. Is the C++ code substantially different?
There was a problem hiding this comment.
There's a rationale in the TS issue. We can also choose not to fix this issue :)
|
Technically the example I think in this case it's better to be on the safe side and prevent any performance regressions that might be caused by iterating over the dump twice. So basically:
|
|
If the test cases pass and the original example cited in the bug work, then we should definitely simplify this here and in the main code. If not (the original example still fails) then the test is broken and I made it too simple Either way, glad to have more eyes on this. |
…d functions - assuming the labels and usages are ordered favorably
|
trying to add a test case but struggling... |
Summary
__divdc3were incorrectly filtered out even when directly called by user codeChanges
collectReferencedFunctions()scans assembly to identify function calls from user codece-bug-1648.asmtest demonstrating__divdc3usage and proper filteringRoot Cause
The
binaryIgnoreFunctionregex filtered out all functions starting with__, including__divdc3, even when they were statically linked and called from user code. This created a poor user experience where important library function implementations were hidden.Solution
Two-pass approach:
Test plan
ce-bug-1648.asmdemonstrates the fix working correctly__divdc3(the referenced function) is kept, other unreferenced__*functions remain filtered__divdc3function implementation now visible with static linkingImpact
Based on compiler-explorer/compiler-explorer#7776
🤖 Generated with Claude Code