purego: add benchmarks for calling methods #363

Merged: hajimehoshi merged 8 commits into ebitengine:main from tmc:add-benchmark on Feb 7, 2026

@tmc (Contributor) commented Oct 31, 2025

Add benchmarks comparing SyscallN, RegisterFunc, and callback performance across different argument counts.
This helps measure and compare the overhead of different calling approaches in purego's function invocation system.
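
As a rough illustration of the benchmark layout described above (one sub-benchmark per argument count), here is a minimal sketch. It is not the code from this PR: `sumArgs` is a hypothetical pure-Go stand-in for the purego call under test, and `makeArgs` and `benchmarkCallingMethods` are names invented for this example. Only the argument counts are taken from the PR.

```go
package main

import (
	"fmt"
	"testing"
)

// sumArgs is a hypothetical pure-Go stand-in for the call under test.
// The real benchmarks invoke purego instead; this only shows the layout.
func sumArgs(args ...uintptr) uintptr {
	var total uintptr
	for _, a := range args {
		total += a
	}
	return total
}

// makeArgs returns n arguments holding the values 1..n.
func makeArgs(n int) []uintptr {
	args := make([]uintptr, n)
	for i := range args {
		args[i] = uintptr(i + 1)
	}
	return args
}

// sink keeps the compiler from discarding the call result.
var sink uintptr

// benchmarkCallingMethods runs one sub-benchmark per argument count.
func benchmarkCallingMethods(b *testing.B) {
	for _, n := range []int{1, 2, 3, 5, 10, 14, 15} {
		args := makeArgs(n)
		b.Run(fmt.Sprintf("%dargs", n), func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = sumArgs(args...)
			}
		})
	}
}

func main() {
	// testing.Init registers the testing flags, which is needed when
	// calling testing.Benchmark outside of "go test".
	testing.Init()
	fmt.Println(testing.Benchmark(benchmarkCallingMethods))
}
```

In a `_test.go` file the same function would be exported as `BenchmarkCallingMethods(b *testing.B)` and run with `go test -bench=. -benchmem`.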

Closes #362

Current output:

goos: linux
goarch: arm64
pkg: github.com/ebitengine/purego
BenchmarkCallingMethods/RegisterFunc/Callback/1args-4            3150840               345.5 ns/op            96 B/op          6 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/2args-4            3140649               415.8 ns/op           152 B/op          7 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/3args-4            1962769               622.9 ns/op           224 B/op          8 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/5args-4            2104015               549.3 ns/op           336 B/op         10 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/10args-4           1280574              1162 ns/op             600 B/op         15 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/14args-4            787353              1330 ns/op             888 B/op         19 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/15args-4           1000000              1087 ns/op             928 B/op         20 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/1args-4               7312062               179.1 ns/op            40 B/op          3 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/2args-4               4575436               259.8 ns/op            72 B/op          4 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/3args-4               5702834               203.9 ns/op           112 B/op          5 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/5args-4               4748464               357.9 ns/op           176 B/op          7 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/10args-4              2852462               384.0 ns/op           328 B/op         12 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/14args-4              2587800               679.0 ns/op           472 B/op         16 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/15args-4              2343153               482.9 ns/op           512 B/op         17 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/1args-4                5052274               268.1 ns/op            56 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/2args-4                3734990               331.6 ns/op            80 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/3args-4                3882723               302.8 ns/op           112 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/5args-4                3444187               405.1 ns/op           160 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/10args-4               1766395               624.9 ns/op           272 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/14args-4               2003835               598.1 ns/op           416 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/15args-4               1750140               848.6 ns/op           416 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/1args-4                  23856542                49.10 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/2args-4                  25922942                46.47 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/3args-4                  25489951                49.48 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/5args-4                  24297174                51.78 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/10args-4                 23921144                48.11 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/14args-4                 24247138                52.55 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/15args-4                 22587331                58.45 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/RoundTrip/1args-4                        4651420               222.7 ns/op            56 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/2args-4                        4831054               257.3 ns/op            80 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/3args-4                        3868870               380.7 ns/op           112 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/5args-4                        3571844               327.6 ns/op           160 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/10args-4                       2216022               608.3 ns/op           272 B/op          3 allocs/op
PASS
ok      github.com/ebitengine/purego    57.446s
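
To compare two rows of the output above, the ns/op column can be pulled out of each line and divided. The sketch below does that for the 1-argument CFunc rows; `nsPerOp` is a helper written for this example, not part of purego or the Go toolchain, and the two input lines are copied from the output with whitespace collapsed.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nsPerOp extracts the ns/op value from one line of "go test -bench" output.
func nsPerOp(line string) (float64, error) {
	fields := strings.Fields(line)
	for i := 1; i < len(fields); i++ {
		if fields[i] == "ns/op" {
			return strconv.ParseFloat(fields[i-1], 64)
		}
	}
	return 0, fmt.Errorf("no ns/op field in %q", line)
}

func main() {
	// Two rows from the output above: 1 argument, C function target.
	registerFunc := "BenchmarkCallingMethods/RegisterFunc/CFunc/1args-4 7312062 179.1 ns/op 40 B/op 3 allocs/op"
	syscallN := "BenchmarkCallingMethods/SyscallN/CFunc/1args-4 23856542 49.10 ns/op 0 B/op 0 allocs/op"

	rf, err := nsPerOp(registerFunc)
	if err != nil {
		panic(err)
	}
	sc, err := nsPerOp(syscallN)
	if err != nil {
		panic(err)
	}
	fmt.Printf("RegisterFunc / SyscallN time ratio: %.2f\n", rf/sc)
}
```

For these two rows the ratio comes out at roughly 3.6, on a single run on one linux/arm64 machine, so it should be read as indicative only.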

@hajimehoshi (Member) commented Oct 31, 2025

The benchmark doesn't use C functions. Using C functions in a dynamic library would be more meaningful to test actual cases in the real world.

@tmc (Contributor, Author) commented Oct 31, 2025

> The benchmark doesn't use C functions. Using C functions in a dynamic library would be more meaningful to test actual cases in the real world.

Will extend the callbacks so they make C calls instead of being Go no-ops.

@tmc (Contributor, Author) commented Oct 31, 2025

Done.

Note: this uses long + uintptr because of the bug addressed in #360.
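
For context on pairing C's long with Go's uintptr, the sketch below shows how signed 64-bit values survive a round trip through a []uintptr argument slice. It assumes a 64-bit LP64 target such as linux/arm64, where both types are 8 bytes wide. `packArgs` and `is64Bit` are names invented for this example, and nothing here reproduces the bug from #360.

```go
package main

import "fmt"

// is64Bit is true when uintptr is 64 bits wide on the build target.
const is64Bit = ^uintptr(0)>>63 == 1

// packArgs converts signed 64-bit values into the []uintptr form that a
// SyscallN-style variadic call takes. Assumption: a 64-bit target, where
// the conversion keeps the full bit pattern.
func packArgs(vals ...int64) []uintptr {
	args := make([]uintptr, len(vals))
	for i, v := range vals {
		args[i] = uintptr(v)
	}
	return args
}

func main() {
	if !is64Bit {
		fmt.Println("this sketch assumes a 64-bit platform")
		return
	}
	// Converting back to int64 recovers the original signed values.
	for _, a := range packArgs(1, -2, 3) {
		fmt.Println(int64(a))
	}
}
```

On targets where long is 32 bits (for example 64-bit Windows), the widths no longer match and this equivalence does not hold.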

@tmc (Contributor, Author) commented Jan 26, 2026

rebased/refreshed.

// Build C library for benchmarking
libFileName := filepath.Join(b.TempDir(), "libbenchmark.so")
if err := buildSharedLib("CC", libFileName, filepath.Join("testdata", "benchmarktest", "benchmark.c")); err != nil {
	b.Skipf("Failed to build C library: %v", err)

Collaborator commented:

Shouldn't this be a Fatalf?

Member commented:

Yeah, I agree

@TotallyGamerJet (Collaborator) left a comment

Besides the one minor comment this LGTM

@TotallyGamerJet (Collaborator) commented

@hajimehoshi PTAL

tmc added 8 commits February 7, 2026 11:11
Add benchmark tests comparing RegisterFunc, SyscallN, and callback performance
across different argument counts (1, 2, 3, 5, 10, 14, 15 args). The benchmarks
measure:

- RegisterFunc with Go callbacks vs C functions
- SyscallN with Go callbacks vs C functions
- Round-trip calls (Go → C → Go callback)

Includes corresponding C library with sum functions and callback wrappers
to enable realistic performance comparisons between different calling
approaches in purego.

Replace direct purego.Dlopen/Dlclose calls with load.OpenLibrary/CloseLibrary
from the internal load package for consistency with other test code. Includes
proper error handling in the defer cleanup function.

Change benchmark callback functions from uintptr to int64 to match
C's long type on 64-bit systems. This removes the workaround that
was needed for the ARM64 argument corruption bug fixed in ebitengine#360.

The args slice remains []uintptr since that's what SyscallN requires.

- Use b.Cleanup for closing library (consistency with file deletion)
- Rename Go functions with "go" prefix for naming consistency
- Remove unnecessary //go:noinline directives
- Remove stale go:generate comment from benchmark.c
- Use b.Fatalf instead of b.Skipf for C library build failure
- Panic on unsupported arg count in makeRegisterFunc
- Add default panic case to callRegisterFunc
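
The last two items in the list above (panicking on an unsupported argument count and adding a default panic case) follow a common pattern: dispatch on arity with a switch whose default branch fails loudly instead of silently doing nothing. The sketch below illustrates that pattern only. `makeSumFunc` and `callSumFunc` are hypothetical stand-ins, not the PR's makeRegisterFunc and callRegisterFunc, whose bodies are not shown in this thread.

```go
package main

import "fmt"

// makeSumFunc returns a fixed-arity sum function for n arguments.
// Hypothetical helper: it only mirrors the "panic on unsupported count" idea.
func makeSumFunc(n int) any {
	switch n {
	case 1:
		return func(a int64) int64 { return a }
	case 2:
		return func(a, b int64) int64 { return a + b }
	case 3:
		return func(a, b, c int64) int64 { return a + b + c }
	default:
		panic(fmt.Sprintf("makeSumFunc: unsupported argument count %d", n))
	}
}

// callSumFunc invokes a function produced by makeSumFunc. The default
// case panics so that a missing arity is caught instead of being skipped.
func callSumFunc(fn any, args []int64) int64 {
	switch f := fn.(type) {
	case func(int64) int64:
		return f(args[0])
	case func(int64, int64) int64:
		return f(args[0], args[1])
	case func(int64, int64, int64) int64:
		return f(args[0], args[1], args[2])
	default:
		panic(fmt.Sprintf("callSumFunc: unsupported function type %T", fn))
	}
}

func main() {
	fn := makeSumFunc(3)
	fmt.Println(callSumFunc(fn, []int64{1, 2, 3})) // prints 6
}
```

In a benchmark, a loud failure like this keeps an unhandled argument count from producing a misleadingly fast result.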
@tmc (Contributor, Author) commented Feb 7, 2026

Resolved PR feedback and rebased onto main.

@hajimehoshi (Member) left a comment

LGTM, thanks!

@hajimehoshi hajimehoshi merged commit 2fe737a into ebitengine:main Feb 7, 2026
46 checks passed

Development

Successfully merging this pull request may close these issues.

add benchmarking

3 participants