Add a callgrind measure #314

Merged: fitzgen merged 2 commits into bytecodealliance:main from fitzgen:callgrind-measure on Jun 5, 2026
@fitzgen fitzgen commented Jun 5, 2026

Member

This commit adds a new `callgrind` measure. It must always be run inside a child process that is running under Valgrind's Callgrind tool. It uses the `valgrind-requests` crate to communicate with Valgrind and record data from the simulated caches and branch predictor.
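For readers unfamiliar with the setup, here is a minimal sketch of how a parent process might launch a child under Callgrind with cache and branch simulation enabled. This is not the code from this PR: the `callgrind_command` helper, the `./bench-runner` path, and the `--measure callgrind` arguments are hypothetical names for illustration. Only the `valgrind` flags (`--tool=callgrind`, `--cache-sim=yes`, `--branch-sim=yes`) are real Valgrind options.

```rust
use std::process::Command;

/// Build (but do not spawn) a command that runs `program` under Valgrind's
/// Callgrind tool. Hypothetical helper, not part of this PR.
fn callgrind_command(program: &str, program_args: &[&str]) -> Command {
    let mut cmd = Command::new("valgrind");
    cmd.arg("--tool=callgrind")
        // Simulate the cache hierarchy so cache hit/miss events are recorded.
        .arg("--cache-sim=yes")
        // Simulate the branch predictor so branch events are recorded.
        .arg("--branch-sim=yes")
        // Everything after the Valgrind options is the child program and its
        // own arguments.
        .arg(program)
        .args(program_args);
    cmd
}
```

Calling `callgrind_command("./bench-runner", &["--measure", "callgrind"]).status()` would then spawn the child, assuming `valgrind` is installed and on `PATH`.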

Running under Callgrind is much slower than running natively, but it is also much less noisy. Therefore, we adjust the default number of processes and iterations per process accordingly.

Fixes #312

@fitzgen fitzgen force-pushed the callgrind-measure branch from f01f606 to f03bed1 June 5, 2026 17:36
@fitzgen fitzgen mentioned this pull request Jun 5, 2026
@fitzgen fitzgen requested a review from posborne June 5, 2026 17:47
@@ -58,8 +238,8 @@ pub struct BenchmarkCommand {
engine_flags: Option<String>,

/// How many processes should we use for each Wasm benchmark?

Collaborator

Might be nice to document the default here so it shows up in help output again as a hint to users looking to modify things. Same applies for iterations-per-process.
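One way to do this, sketched under assumptions: make the flags optional, resolve them against per-measure defaults, and generate the help string from the same table so the two cannot drift apart. The `Measure`, `Sampling`, `default_sampling`, `resolve_sampling`, and `processes_help` names are hypothetical, and the numeric defaults are placeholders, not the values chosen in this PR.

```rust
/// Which measure is in use (hypothetical names for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Measure {
    Native,
    Callgrind,
}

/// Resolved sampling parameters for a benchmark run.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Sampling {
    processes: usize,
    iterations_per_process: usize,
}

/// Per-measure defaults. Placeholder numbers, NOT the PR's actual values.
fn default_sampling(measure: Measure) -> Sampling {
    match measure {
        Measure::Native => Sampling { processes: 10, iterations_per_process: 10 },
        // Callgrind is slower but less noisy, so fewer samples are needed.
        Measure::Callgrind => Sampling { processes: 1, iterations_per_process: 2 },
    }
}

/// Explicit user-supplied values win; otherwise use the measure's default.
fn resolve_sampling(
    measure: Measure,
    processes: Option<usize>,
    iterations_per_process: Option<usize>,
) -> Sampling {
    let defaults = default_sampling(measure);
    Sampling {
        processes: processes.unwrap_or(defaults.processes),
        iterations_per_process: iterations_per_process
            .unwrap_or(defaults.iterations_per_process),
    }
}

/// Help text that states the defaults, built from the same table.
fn processes_help() -> String {
    let native = default_sampling(Measure::Native);
    let callgrind = default_sampling(Measure::Callgrind);
    format!(
        "How many processes should we use for each Wasm benchmark? \
         [default: {} ({} under callgrind)]",
        native.processes, callgrind.processes
    )
}
```

The same pattern would apply to iterations-per-process.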

@posborne posborne left a comment

Collaborator

Changes look good to me; I was able to do a couple runs with callgrind to confirm.

Running under callgrind is definitely astronomically slow; if the data is reliable with a smaller sample, we may want to see about coming up with some smaller inputs, or possibly a different default suite when targeting callgrind. That's, of course, secondary to getting results that can be trusted, but there's some balance point in there.

fitzgen commented Jun 5, 2026

Member Author

> The time to run is definitely astronomically slow; if the data is reliable with a smaller sample, we may want to see about coming up some smaller inputs. Possibly a different default suite when targeting callgrind compared with the default. That's, of course, secondary to getting results that can be trusted but there's some balance point in there.

For sure, I am planning on doing a pass over the benchmarks, when I have a chance, to get them all running roughly the same number of instructions per execution iteration. It probably won't be every single one, but I will do the ones where that is easy enough.

FWIW, I will also be making a PR for the PCA stuff soon. Have it working locally, just need to do some final tweaks.

@fitzgen fitzgen merged commit 1d56857 into bytecodealliance:main Jun 5, 2026
16 checks passed
@fitzgen fitzgen deleted the callgrind-measure branch June 5, 2026 23:02