[HIPIFY][feature] Local header hipification via Clang PPCallbacks (experimental, opt-in) - Part 1#2402
Conversation
|
Can we say that the main purpose of this PR is to allow hipifying header files individually as main translation units? |
Yes, correct. The main purpose is to allow hipifying local header files as main translation units. We are doing it by injecting their preceding includes from the parent source file using Clang's This feature is enabled via |
How do we guarantee finding the parent source(s) for a particular header file so we can hipify it as the main translation unit? |
We don't need to discover/guess the parent source file. The user provides it explicitly while hipifying. The main source file passed to hipify-clang is always the parent. For example: |
That is what worries me most.
Technically, the main problem is that we are trying to treat a file, a header file, which is not self-contained, as a source file for hipification (actually, compilation). I want us to think about other approaches here.
|
5e6249f to
353ebef
Compare
As per our discussions/suggestions, I've reworked this PR to implement suggestion 2. I will raise 2 separate PRs as a follow up for:
Updated PR title and description to reflect the changes. |
353ebef to
d10566f
Compare
| cl::opt<bool> OptLocalHeadersRecursive("local-headers-recursive", | ||
| cl::desc("Enable hipification of quoted local headers recursively"), | ||
| cl::opt<bool> SkipLocalHeaders("skip-local-headers", | ||
| cl::desc("Skip implicit hipification of local (quoted) headers included in the main source file"), |
There was a problem hiding this comment.
What if the "local" headers are included with <>?
There was a problem hiding this comment.
I believe this is out of scope. Angled-bracket includes files are resolved using -I search paths. They represent system headers or external library dependencies.
But I do think there is edge case like, if a project uses #include <mylib/utils.h> via -I project/src flag, that would require resolving each angled-bracket against every -I path. That is a meaningful work and requires a separate PR of its own.
There was a problem hiding this comment.
#2505: Angle-bracket local header hipification (#include <>)
| bool collectPrecedingIncludes(const std::string &mainSourceAbspath, | ||
| const std::string &targetHeaderAbspath, | ||
| std::vector<std::string> &outIncludes) { | ||
| std::string mainSourceContent; | ||
| if (!readFile(mainSourceAbspath, mainSourceContent)) { | ||
| errs() << sHipify << sError << "Cannot read source files: " | ||
| << mainSourceAbspath << "\n"; | ||
| return false; | ||
| } | ||
|
|
||
| std::string targetFileName = std::string(sys::path::filename(targetHeaderAbspath)); | ||
| std::istringstream iss(mainSourceContent); | ||
| std::string line; | ||
| std::smatch m; | ||
|
|
||
| while (std::getline(iss, line)) { | ||
| if (std::regex_match(line, m, LocalIncludeRe)) { | ||
| std::string quotedName = m[1].str(); | ||
| std::string quotedFileName = std::string(sys::path::filename(quotedName)); | ||
| if (quotedFileName == targetFileName) | ||
| break; | ||
| std::string absPath; | ||
| if (resolveLocalIncludeInternal(mainSourceAbspath, quotedName, absPath)) | ||
| outIncludes.push_back(absPath); | ||
| } | ||
|
|
||
| if (std::regex_search(line, m, SystemIncludeRe)) { | ||
| outIncludes.push_back(m[1].str()); | ||
| } | ||
| } | ||
|
|
||
| return true; | ||
| } | ||
| } |
There was a problem hiding this comment.
Looks like the collectPrecedingIncludes only scans the immediate parentPath.
What about deeply nested headers?
There was a problem hiding this comment.
yes, correct.
collectPrecedingIncludes scans only one level (the immediate parent). This is intentional for this PR, it was designed for (main.cu → A.h → B.h) where B.h depends on explicitly included before it in A.h and collectPrecedingIncludes(A.h, B.h) finds exactly what B.h needs.
I considered collecting full ancestor chain recursively (for nested headers) but that results in over-injection. When two of those injected headers define same symbol, we get redefinition errors.
I would like to take this (deeply nested headers) as a follow-up PR. We track the full ancestors list and modify from {hdr, parentPath} to {hdr, vector<parentChain>}. And only injects that appear in ancestors(parents) up to the nearest file that explicitly includes the current target. This avoids over-injection issue.
There was a problem hiding this comment.
#2507: Transitive preceding includes for deeply nested recursive headers
d10566f to
7d568b6
Compare
|
Follow up work
|
8d1113a to
8a1adf5
Compare
|
The added tests cover basic recursion and depth-1 injection of “includes”. However, the test that verifies the transitive context is preserved during recursive traversal is still needed. Please add a test that includes a non-self-contained header with a recursion depth of 2 or more. For example, the |
|
I honestly believe we need to preserve |
8a1adf5 to
e198b45
Compare
I've adapted to use
|
Thanks for raising this and it is a fair concern. Based on our earlier discussions, we collectively moved toward implicit-by-default to lower the barrier for users. The current revision already has multiple layers in place for end users like
And I am thinking to add below
With these measures together I believe the implicit behavior is transparent. That said, I am happy to address any specific scenario where this causes harm directly rather than revert the design. What do you think? @emankov |
I still believe that experimental features should stay experimental for at least one release. Meanwhile, it might be thoroughly tested. Next step: inform end users in the documentation that, with the next release, this particular feature will be enabled by default, and for fallback functionality, here are the options to set explicitly. |
|
[IMPORTANT] After reviewing this regex-based solution for header tree calculation, I decided it is not a good solution. Using regex in hipify-perl is a forced decision, but using a regex-based approach with a C++ compiler under the hood can't be treated as a good solution. [Reasons]
[Recommendations]
|
…jection via ArgumentsAdjusters
…and restore opt-in flags
e198b45 to
7d31bff
Compare
Agreed, both
|
Thanks for this feedback. I've removed the entire regex-based include scanning system and replaced with Clang's The new PPCallbacks approach uses We are directly addressing below concerns:
|
Problem Statement
hipify-clangcurrently only transforms the main source file passed as input. Local (quoted) headers included in the main file remain same as CUDA code. These headers often cannot be hipified individually because they are not self-contained headers which requires types, operators, and definitions from preceding includes in the parent source.Example: dxtc.cu includes
CudaMath.h, which usesfloat3operators (+=,*,-) defined inhelper_math.h. HipifyingCudaMath.halone fails because these operators are unavailable. Similar cases:Solution
Two new opt-in flags enable local header hipification:
The feature is experimental and disabled by default. It will be promoted to default behavior in later release cycles with
--skip-local-headersas the opt-out fallback.Supersedes #2275
Summary of changes in this PR
Clang PPCallbacks instead of regex
Include discovery is using Clang's
PPCallbacks::InclusionDirectivevia preprocessing pre-pass, not regex. This correctly handles conditional compilation (#ifdef), macro-computed includes (#include MACRO), compiler search paths (-I,-isystem), and comments.Fixed Compilation Database fallback
Header files which have no entry in
compile_commands.jsonfalls back to aFixedCompilationDatabaserooted at the main source file's directory so that relative paths resolve correctly.Cycle and duplicate detection
A
processedset prevents re-hipification, and aqueuedset prevents duplicate work-queue entries during recursive traversal.ToDo work (follow-up PRs):
--skip-local-headers)#include <mylib/utils.h>via-I) — [HIPIFY][#2402] hipify project-local headers included with angle brackets (<>) #2505