Skip to content

build: use LLVM memcpyopt pass to replace cabi load/store optimization#1612

Merged
xushiwei merged 2 commits intoxgo-dev:mainfrom
luoliwoshang:feature/llvm-passes
Feb 6, 2026
Merged

build: use LLVM memcpyopt pass to replace cabi load/store optimization#1612
xushiwei merged 2 commits intoxgo-dev:mainfrom
luoliwoshang:feature/llvm-passes

Conversation

@luoliwoshang
Copy link
Copy Markdown
Contributor

@luoliwoshang luoliwoshang commented Feb 5, 2026

Fixed #1604 #1606 #1608

Summary

Use LLVM's built-in memcpyopt pass to replace the manual load/store optimization in cabi. This pass handles the same optimization more safely and robustly.

Changes

  • ssa/target.go: Add targetInfo() to return both TargetData and TargetMachine
  • ssa/package.go:
    • Store TargetMachine in Program struct
    • Add TargetMachine() and DataLayout() methods
  • internal/build/build.go:
    • Set data layout and target triple before running LLVM passes
    • Run memcpyopt pass after IR generation

Example

For returning large structs, this pass achieves the same optimization as the cabi code, converting load/store patterns to memcpy:

define void @command-line-arguments.returnStruct(ptr sret(%command-line-arguments.T) %0) {
_llgo_0:
  %1 = alloca %command-line-arguments.T, align 8
  call void @llvm.memset(ptr %1, i8 0, i64 24, i1 false)
  %2 = getelementptr inbounds %command-line-arguments.T, ptr %1, i32 0, i32 0
  %3 = getelementptr inbounds %command-line-arguments.T, ptr %1, i32 0, i32 1
  store i64 1, ptr %2, align 8
  store i64 2, ptr %3, align 8
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %0, ptr align 8 %1, i64 24, i1 false)
  ret void
}

Why use memcpyopt?

  1. Safer: LLVM's pass handles alias analysis and memory overlap detection automatically
  2. More robust: Handles more edge cases than manual implementation
  3. Maintained: Benefits from LLVM's ongoing improvements

Background

LLVM's memcpyopt pass (Memory Copy Optimization) is enabled at -O1 and above. It recognizes patterns like:

%val = load { i64, i64, i64 }, ptr %src
store { i64, i64, i64 } %val, ptr %dst

And converts them to:

call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 24, i1 false)

Fixes #1605

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @luoliwoshang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates the LLVM memcpyopt pass into the build pipeline to enhance code generation efficiency by converting specific load/store sequences into optimized memory copy operations. To support this, the SSA program structure was extended to provide necessary LLVM target information. Crucially, a potential data corruption issue related to memcpy optimization in the C ABI for structure returns was identified and addressed by reverting to a safer store operation, accompanied by new regression tests.

Highlights

  • LLVM memcpyopt Pass Integration: The memcpyopt pass is now run after IR generation to optimize consecutive load/store operations into memcpy/memmove calls, improving code efficiency.
  • SSA Program Enhancements: The ssa/package.go and ssa/target.go files were updated to properly manage and expose llvm.TargetMachine and llvm.TargetData, which are crucial for configuring LLVM passes.
  • Module Configuration for LLVM Passes: LLVM modules are now configured with the correct data layout and target triple upon creation, a prerequisite for running optimization passes.
  • CABI sret Optimization Correction: A previous memcpy optimization in the C ABI transformer for structure returns (sret) was reverted to a store operation to prevent a regression (issue slice return unexpect value when struct field modified after read #1608) where the returned value could be incorrect if the source memory was modified.
  • Regression Test for CABI Fix: A new test case (_demo/go/return-1605/main.go and internal/cabi/cabi_test.go) was added to validate the fix for the sret memcpy issue.
  • Runtime Package Exclusion: The memcpyopt pass is currently skipped for runtime packages due to observed segmentation faults, indicating an area for future investigation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • _demo/go/return-1605/main.go
  • internal/build/build.go
    • Imported the gollvm package.
    • Implemented the execution of the memcpyopt LLVM pass after IR generation.
    • Added a conditional check to skip applying the memcpyopt pass to runtime packages, as they currently cause crashes.
  • internal/cabi/cabi.go
    • Modified the transformFuncBody function to explicitly use b.CreateStore instead of p.callMemcpy for sret (structure return) cases. This change prevents a bug where the source address content might be modified between the load and return, leading to incorrect values.
  • internal/cabi/cabi_test.go
  • ssa/package.go
    • Added tm llvm.TargetMachine as a field to the aProgram struct to store the LLVM target machine.
    • Updated NewProgram to initialize the tm field using the new targetDataAndMachine method.
    • Introduced TargetMachine() and DataLayout() methods to the Program interface, providing access to the stored TargetMachine and its data layout string.
    • Modified NewPackage to set the DataLayout and Target triple on newly created LLVM modules, which is essential for LLVM optimization passes.
  • ssa/target.go
    • Refactored the targetData() method into targetDataAndMachine().
    • The targetDataAndMachine() method now returns both llvm.TargetData and llvm.TargetMachine, making the TargetMachine available for use in the Program struct.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the memcpyopt LLVM pass to optimize load/store operations, which is a great performance improvement. The changes to enable this pass in ssa/package.go and ssa/target.go are well-structured. The bug fix in internal/cabi/cabi.go and the accompanying regression tests are also solid.

My review includes a couple of suggestions to improve maintainability and clarity, mainly regarding a hardcoded package path and an inconsistency in issue numbers in a test file. Overall, this is a good step forward, and I'm looking forward to seeing the WIP parts resolved.

@@ -0,0 +1,35 @@
package main

// Regression test for https://github.com/goplus/llgo/issues/1608
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There seems to be an inconsistency in the issue number. The directory is named return-1605 and the PR description states this fixes #1605, but the comment here refers to issue #1608. The test case appears to be a regression test for the scenario described in issue #1608, which is also referenced in internal/cabi/cabi.go and internal/cabi/cabi_test.go. To improve clarity and avoid future confusion, please consider aligning the directory name with the issue it tests, for example by renaming it to return-1608.

Comment thread internal/build/build.go
@luoliwoshang luoliwoshang force-pushed the feature/llvm-passes branch 3 times, most recently from c2d0a28 to 75cc7d4 Compare February 5, 2026 11:58
Comment thread ssa/package.go Outdated
@@ -119,7 +119,7 @@ type aProgram struct {

target *Target
td llvm.TargetData
// tm llvm.TargetMachine
tm llvm.TargetMachine
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resource leak concern: The TargetMachine (tm) is stored but never disposed. LLVM's TargetMachine holds native resources that aren't managed by Go's garbage collector. Consider adding a Dispose() method to aProgram that calls tm.Dispose() (and td.Dispose(), ctx.Dispose()) and ensure callers invoke it when the Program is no longer needed.

This could lead to memory pressure in long-running compilation sessions or when compiling many packages.

@@ -0,0 +1,35 @@
package main

// Regression test for https://github.com/goplus/llgo/issues/1608
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistency: The directory is named return-1605 but the comment references issue #1608. Issue #1608 describes the slice return value bug being tested here.

Consider renaming the directory to return-1608 for consistency with the code comment and the actual issue being fixed.

Comment thread internal/build/build.go
defer pbo.Dispose()
if err := ret.Module().RunPasses("memcpyopt", ctx.prog.TargetMachine(), pbo); err != nil {
return fmt.Errorf("run LLVM passes failed for %v: %v", pkgPath, err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using existing helper: There's an isRuntimePkg() helper function at line 977 that uses env.LLGoRuntimePkg constant. Using it would improve consistency:

Suggested change
}
if !isRuntimePkg(pkgPath) {

Comment thread internal/build/build.go
return fmt.Errorf("run LLVM passes failed for %v: %v", pkgPath, err)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance consideration: PassBuilderOptions is created and disposed for every package compilation. For projects with many packages, consider reusing a single instance at the context level to reduce allocation overhead.

Also, the comment mentions runtime packages cause SIGSEGV - it would be helpful to document this more specifically or link to an issue tracking the root cause investigation.

@fennoai
Copy link
Copy Markdown
Contributor

fennoai Bot commented Feb 5, 2026

Code Review Summary

This PR correctly fixes issue #1608 by removing the unsafe memcpy optimization in CABI transformation and delegating memory optimization to LLVM's memcpyopt pass. The approach is sound - LLVM's pass has proper alias analysis to determine when memcpy is safe.

Key observations:

  • The fix properly addresses the root cause where source memory could be modified between load and return
  • Good test coverage with both a unit test and a demo regression test
  • The memcpyopt pass integration adds a new dependency on TargetMachine

Minor issues to address:

  1. Directory/filename mismatch: return-1605 vs issue slice return unexpect value when struct field modified after read #1608
  2. Hardcoded runtime path check - consider using existing isRuntimePkg() helper
  3. TargetMachine resource lifecycle management (no disposal)

Overall the fix is well-designed. The runtime package crash warrants a follow-up investigation issue.

@luoliwoshang luoliwoshang force-pushed the feature/llvm-passes branch 2 times, most recently from fb8dd9b to a62459f Compare February 5, 2026 12:46
@luoliwoshang luoliwoshang changed the title [WIP] ssa: add memcpyopt pass to optimize load/store to memcpy build: use LLVM memcpyopt pass to replace cabi load/store optimization Feb 5, 2026
@luoliwoshang luoliwoshang force-pushed the feature/llvm-passes branch 3 times, most recently from 3b6f9fe to 9eac45e Compare February 5, 2026 13:29
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.10%. Comparing base (ba56141) to head (791c2fd).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1612      +/-   ##
==========================================
+ Coverage   91.05%   91.10%   +0.04%     
==========================================
  Files          45       45              
  Lines       11933    11921      -12     
==========================================
- Hits        10866    10861       -5     
+ Misses        892      886       -6     
+ Partials      175      174       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@luoliwoshang
Copy link
Copy Markdown
Contributor Author

@xgopilot merge main branch

@fennoai
Copy link
Copy Markdown
Contributor

fennoai Bot commented Feb 5, 2026

Merged main branch ✅

Successfully merged the latest main branch into the feature branch.

Changes:

  • Fetched latest changes from origin/main
  • Resolved merge conflict in ssa/target.go (kept CtxRegister changes and targetInfo function from feature branch)
  • Pushed merged changes to fork

Action Required: Please merge this PR in your fork to update the feature branch:
👉 luoliwoshang#196

- Add targetInfo() to return TargetData and TargetMachine
- Add TargetMachine() and DataLayout() methods to Program
- Set data layout and target triple before running LLVM passes
- Run memcpyopt pass after IR generation to convert load/store to memcpy

This replaces the manual optimization in cabi with LLVM's built-in
memcpyopt pass, which handles more cases and is more robust.

Fixes xgo-dev#1605
@luoliwoshang luoliwoshang requested a review from visualfc February 5, 2026 16:07
@luoliwoshang luoliwoshang force-pushed the feature/llvm-passes branch 3 times, most recently from 8b39300 to 8da359b Compare February 6, 2026 05:00
Comment thread ssa/package.go
return p.td
}

func (p Program) TargetMachine() llvm.TargetMachine {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xgopilot also need a expect test for this Method!create pr for me!

Copy link
Copy Markdown
Contributor

@fennoai fennoai Bot Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test coverage for TargetMachine() and DataLayout()

Enhanced TestTargetMachineAndDataLayout in ssa/ssa_test.go to verify:

  • TargetMachine() returns a valid (non-nil) target machine
  • TargetData() returns a valid (non-nil) target data

PR Created: luoliwoshang#199

Please merge this PR to include the test coverage in your feature branch.

@luoliwoshang luoliwoshang requested a review from xushiwei February 6, 2026 15:04
@xushiwei xushiwei merged commit 6fc281b into xgo-dev:main Feb 6, 2026
37 checks passed
visualfc pushed a commit to visualfc/llgo that referenced this pull request Feb 7, 2026
build: use LLVM memcpyopt pass to replace cabi load/store optimization
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: TypeSwitch initialization statement missing when running under llgo when use gogen

3 participants