Skip to content

perf(concat_source): skip Mutex acquire in add() under &mut self#232

Open
JSerFeng wants to merge 2 commits intomainfrom
fy/jolly-mendeleev-0b5e28
Open

perf(concat_source): skip Mutex acquire in add() under &mut self#232
JSerFeng wants to merge 2 commits intomainfrom
fy/jolly-mendeleev-0b5e28

Conversation

@JSerFeng
Copy link
Copy Markdown
Contributor

@JSerFeng JSerFeng commented Apr 23, 2026

What

Replace lock().unwrap() with get_mut().unwrap() inside ConcatSource::add, removing one redundant atomic acquire/release per call. Other &self access points (Clone, Debug::fmt, optimized_children, and the downcast paths inside add that touch other ConcatSource instances) still use lock().

Why

add already takes &mut self, so the borrow checker statically guarantees exclusive access to the inner Vec<BoxSource> — the runtime synchronization a Mutex provides is unnecessary here. Beyond the perf consideration, this also makes the intent more accurate: &mut self should not need to synchronize with anyone.

The existing benchmarks build a ConcatSource once during setup and then loop over .map().to_json(), so they exercise the cached read path rather than add — which is why CodSpeed shows no change here. Real Rspack hot paths use add very differently: rspack_plugin_javascript/src/runtime.rs does a par_iter().fold(ConcatSource::default, |mut acc, src| { acc.add(src); acc }), and rspack_core's concatenated module rendering chains 300+ add calls per module.

I ran a targeted local microbenchmark (Apple Silicon, 50 samples × 8s, criterion --baseline, repeated runs to confirm reproducibility): a ConcatSource built by 500 sequential add calls dropped from 8.29 µs to 5.05 µs (−39%, p < 0.001, ~1% CI width); the 16-add variant dropped from 421 ns to 278 ns (−32%). That works out to ~6–7 ns saved per add, matching the cost of a Mutex acquire/release on Apple Silicon. The microbenchmarks themselves are landing in a separate PR so this one stays a minimal, focused change.

How

  • src/concat_source.rs::add: &mut *self.children.lock().unwrap()self.children.get_mut().expect(...). get_mut borrows the whole Mutex mutably instead of going through a guard, equivalent to the original mutable-borrow semantics; the subsequent self.is_optimized.take() still works because it's a disjoint field borrow.
  • All 76 lib tests + 11 doc tests pass; cargo clippy --release and cargo fmt --check both clean.

`ConcatSource::add` takes `&mut self`, which already guarantees exclusive
access to the inner children Vec. Locking the Mutex was a redundant
atomic acquire/release on a hot construction path — rspack's JS code
generation calls `add` hundreds of times per chunk (e.g. concatenated
module rendering, runtime module assembly, parallel rayon folds).

Switch to `Mutex::get_mut()` which is the standard zero-overhead
alternative when exclusive access is statically known. All other
read-only accessors (`Clone`, `Debug`, `optimized_children`) still
acquire the lock.
Copilot AI review requested due to automatic review settings April 23, 2026 09:35
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes ConcatSource::add by avoiding an unnecessary Mutex lock when the method already has exclusive access via &mut self, reducing synchronization overhead in a hot path.

Changes:

  • Replaced self.children.lock().unwrap() with self.children.get_mut().expect(...) inside ConcatSource::add.
  • Added an explanatory comment documenting why the lock can be skipped safely in this method.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 23, 2026

Merging this PR will not alter performance

✅ 11 untouched benchmarks
🆕 2 new benchmarks
⏩ 7 skipped benchmarks1

Performance Changes

Benchmark BASE HEAD Efficiency
🆕 concat_source_add_many N/A 46.8 µs N/A
🆕 concat_source_add_few N/A 6.1 µs N/A

Comparing fy/jolly-mendeleev-0b5e28 (9c6efe2) with main (7795fd2)2

Open in CodSpeed

Footnotes

  1. 7 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (10a3b71) during the generation of this report, so 7795fd2 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@JSerFeng JSerFeng force-pushed the fy/jolly-mendeleev-0b5e28 branch from a35cad4 to 82da852 Compare April 23, 2026 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants