Fix invalid LLVM IR by updating abitype and buildVal#1822
Fix invalid LLVM IR by updating abitype and buildVal#1822zhouguangyuan0718 wants to merge 1 commit intoxgo-dev:mainfrom
Conversation
zhouguangyuan0718
commented
Apr 19, 2026
- use NamedStruct in abitype
- use aggregateValue instead of ConstArray in buildVal
- do not create alloca in non-entry block
There was a problem hiding this comment.
Code Review
This pull request introduces a createEntryAlloca method to ensure that alloca instructions are consistently placed in the function's entry block, which facilitates better LLVM optimizations like SROA and mem2reg. The changes refactor the SSA builder to use this method and update test expectations. Feedback was provided to refine the createEntryAlloca implementation for better consistency, efficiency, and safety, specifically suggesting a unified insertion logic and proper handling of empty entry blocks.
| func (b Builder) createEntryAlloca(typ llvm.Type) llvm.Value { | ||
| if b.blk == nil || b.blk.Index() == 0 { | ||
| return llvm.CreateAlloca(b.impl, typ) | ||
| } | ||
| blk := b.impl.GetInsertBlock() | ||
| entry := b.Func.impl.EntryBasicBlock() | ||
| if blk == entry { | ||
| return llvm.CreateAlloca(b.impl, typ) | ||
| } | ||
|
|
||
| entryB := b.Prog.ctx.NewBuilder() | ||
| entryB.SetInsertPointBefore(entry.FirstInstruction()) | ||
| ret := llvm.CreateAlloca(entryB, typ) | ||
| entryB.Dispose() | ||
| return ret | ||
| } |
There was a problem hiding this comment.
The implementation of createEntryAlloca has several issues that affect both compiler efficiency and the quality of the generated IR:
- Inconsistency: The current logic appends the
allocato the entry block if the builder is already positioned there (lines 162-163 and 167-168), but prepends it to the start of the entry block if called from a different block (lines 171-173). For optimal LLVM optimization (such as SROA and mem2reg), it is a best practice to place allallocainstructions at the very beginning of the entry block. - Redundancy: The checks
b.blk.Index() == 0andblk == entryare redundant as they both attempt to identify if the current insertion point is within the entry block. - Efficiency: Creating and disposing of a new
llvm.Builderon every call (lines 171-174) adds unnecessary overhead to the compiler's execution. While a dedicated builder could be cached, at the very least, the logic should be unified. - Safety:
entry.FirstInstruction()might return a nil value if the block is empty (though unlikely in a valid function body), which could causeSetInsertPointBeforeto fail. It is safer to check for this or useSetInsertPointAtEndas a fallback.
Consider a unified approach that always ensures the alloca is placed at the start of the entry block for consistency.
func (b Builder) createEntryAlloca(typ llvm.Type) llvm.Value {
entry := b.Func.impl.EntryBasicBlock()
entryB := b.Prog.ctx.NewBuilder()
defer entryB.Dispose()
if first := entry.FirstInstruction(); !first.IsNil() {
entryB.SetInsertPointBefore(first)
} else {
entryB.SetInsertPointAtEnd(entry)
}
return llvm.CreateAlloca(entryB, typ)
}b4cbd7b to
5423dc7
Compare
1. use NamedStruct in abitype 2. use aggregateValue instead of ConstArray in buildVal 3. do not create alloca in non-entry block Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>
5423dc7 to
c42b08e
Compare
There was a problem hiding this comment.
Review Summary
Solid PR that fixes three classes of invalid LLVM IR. The changes are well-scoped and the new tests directly verify both the presence of correct patterns and absence of incorrect ones. A few observations below, mostly around consistency and robustness — nothing blocking.
Changes reviewed:
ConstNamedStructin abitype (correct fix for named struct type mismatch)aggregateValuereplacingConstArrayin buildVal (correct —ConstArrayrequires constant elements)- Entry-block alloca hoisting via
createEntryAlloca(standard LLVM best practice) - Test fixes using
ConstPtrToInt(valid IR constants)
| @@ -158,6 +158,18 @@ func (b Builder) AllocaT(t Type) (ret Expr) { | |||
| return | |||
There was a problem hiding this comment.
Consistency question: AllocaT here still uses llvm.CreateAlloca(b.impl, t.ll) directly. It's called from di.go:698 which could potentially be invoked from non-entry blocks during debug info emission. Similarly, Alloca (line 144) and ArrayAlloca (line 286) use raw CreateAlloca/CreateArrayAlloca.
Is this an intentional scoping decision (only fixing the call sites that actually triggered invalid IR), or should these be updated too? If intentional, a brief comment explaining why they're exempt would help future maintainers.
| func (b Builder) createEntryAlloca(typ llvm.Type) llvm.Value { | ||
| entry := b.Func.impl.EntryBasicBlock() | ||
| entryB := b.Prog.ctx.NewBuilder() | ||
| defer entryB.Dispose() |
There was a problem hiding this comment.
Consider adding a doc comment explaining the LLVM constraint this method addresses, e.g.:
| func (b Builder) createEntryAlloca(typ llvm.Type) llvm.Value { | |
| entry := b.Func.impl.EntryBasicBlock() | |
| entryB := b.Prog.ctx.NewBuilder() | |
| defer entryB.Dispose() | |
| // createEntryAlloca places an alloca in the function's entry block. | |
| // LLVM requires allocas to be in the entry block for reliable stack | |
| // frame layout and mem2reg promotion to SSA registers. | |
| func (b Builder) createEntryAlloca(typ llvm.Type) llvm.Value { |
Also, a brief inline comment explaining why both the SSA-level check (b.blk.Index() == 0) and the LLVM-level check (blk == entry) are needed would help — they can diverge when SetBlockEx is called with setBlk=false.
| } | ||
|
|
||
| /* TODO(xsw): | ||
| // AllocaU allocates uninitialized space for n*sizeof(elem) bytes. |
There was a problem hiding this comment.
Minor robustness note: if this is ever called before any instruction is emitted in the entry block, entry.FirstInstruction() could return a nil value. In practice the guard at line 162 (b.blk == nil || b.blk.Index() == 0) short-circuits the common case, and existing code in internal/cabi/cabi.go uses the same pattern, so this is consistent — just worth noting the invariant.
Also: creating and disposing a new builder per call is fine given the bounded number of allocas per function, but if this ever becomes a hot path, caching a secondary builder on the builder struct would avoid repeated CGo roundtrips.
| telem := b.Prog.rawType(t.Elem()) | ||
| elem := b.buildVal(telem, val, lvl-1) | ||
| return Expr{llvm.ConstArray(typ.ll, []llvm.Value{elem.impl}), typ} | ||
| return Expr{aggregateValue(b.impl, typ.ll, elem.impl), typ} |
There was a problem hiding this comment.
Good fix. ConstArray requires constant elements, which isn't guaranteed in the type-assertion path. Using aggregateValue (insertvalue chain from undef) is the correct approach for runtime values.
Nit: buildVal always wraps a single element per recursion level (field 0 for structs, element 0 for arrays). A brief comment explaining this invariant would help readers understand why only one element is passed to aggregateValue.