Remove chunk size from each chunk in ChunkedBitSet.#145480
Remove chunk size from each chunk in ChunkedBitSet.#145480bors merged 2 commits intorust-lang:masterfrom
ChunkedBitSet.#145480Conversation
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Remove chunk size from each chunk in `ChunkedBitSet`.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
We have an open PR on dense bitsets that removes the domain size as well (and one passes it in in the few cases where it’s needed; it’s usually easy to know where the bitsets are constructed and used) with good effects IIRC, it’s also something we could try on chunked bitsets. |
|
Finished benchmarking commit (acb05a6): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.8%, secondary -3.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.8%, secondary 4.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.7%, secondary -1.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 469.562s -> 468.956s (-0.13%) |
8a28a3d to
8fa6c55
Compare
|
@lqd do you have a link to the PR for dense bitsets? |
This comment has been minimized.
This comment has been minimized.
8fa6c55 to
e2f33ee
Compare
|
r? compiler |
| for (chunk_index, (mut self_chunk, other_chunk)) in | ||
| self.chunks.iter_mut().zip(other.chunks.iter()).enumerate() | ||
| { | ||
| let chunk_domain_size = if chunk_index + 1 == num_chunks { |
There was a problem hiding this comment.
Should this use self.chunk_domain_size(chunk_index)? I guess that maybe doesn't cache the more complex computation (modulus, etc.) but that should only happen once per iteration, so maybe that's better (less register pressure etc)?
There was a problem hiding this comment.
We are mutably borrowing self.chunks, so borrowck won't let us. And re-implementing is simple enough.
e2f33ee to
9c9a89a
Compare
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@bors r=Mark-Simulacrum |
|
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing a1dbb44 (parent) -> ee361e8 (this PR) Test differencesNo test diffs found Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard ee361e8fca1c30e13e7a31cc82b64c045339d3a8 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (ee361e8): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (secondary 2.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -3.0%, secondary -3.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 469.015s -> 467.59s (-0.30%) |
Clear `ChunkedBitSet` without reallocating There doesn't appear to be any reason to clear a ChunkedBitSet via its constructor (which allocates a new list of chunks), when we could just fill the existing allocation with `Chunk::Zeros` instead. For comparison, the `insert_all` impl added by the same PR (rust-lang#93984) does the simple thing here and just overwrites every chunk with `Chunk::Ones`. (That fill was then made somewhat easier by rust-lang#145480, which removes the chunk size from all-zero/all-one chunks.) r? nnethercote (or compiler)
Clear `ChunkedBitSet` without reallocating There doesn't appear to be any reason to clear a ChunkedBitSet via its constructor (which allocates a new list of chunks), when we could just fill the existing allocation with `Chunk::Zeros` instead. For comparison, the `insert_all` impl added by the same PR (rust-lang#93984) does the simple thing here and just overwrites every chunk with `Chunk::Ones`. (That fill was then made somewhat easier by rust-lang#145480, which removes the chunk size from all-zero/all-one chunks.) r? nnethercote (or compiler)
Rollup merge of #147623 - Zalathar:clear-mixed, r=nnethercote Clear `ChunkedBitSet` without reallocating There doesn't appear to be any reason to clear a ChunkedBitSet via its constructor (which allocates a new list of chunks), when we could just fill the existing allocation with `Chunk::Zeros` instead. For comparison, the `insert_all` impl added by the same PR (#93984) does the simple thing here and just overwrites every chunk with `Chunk::Ones`. (That fill was then made somewhat easier by #145480, which removes the chunk size from all-zero/all-one chunks.) r? nnethercote (or compiler)
Almost all chunks in a
ChunkedBitSethave the same constantCHUNK_SIZE, except the last in a bitset which takes the remainderdomain_size % CHUNK_SIZE.r? @ghost for perf