perf(ymake): pre-size TUpdIter::Nodes hash map — Δ=-0.26s (-1.9%), p=0.0029, r=0.504 #44
Open
KorsarOfficial wants to merge 1 commit into yandex:main from
Evidence Report: full statistical analysis for this optimization.
All reports: https://github.com/KorsarOfficial/yatool/releases/tag/v1.0-perf-analysis
Pre-size TUpdIter::Nodes hash map for graph traversal
Summary
Call reserve(graph.Size()) on the TUpdIter::Nodes hash map in the TUpdIter constructor to pre-allocate capacity before DFS graph traversal. This eliminates incremental rehashing as nodes are discovered, reducing configure graph time by ~1.9%.
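The effect of pre-sizing can be illustrated with a minimal, self-contained sketch. It uses std::unordered_map as a stand-in for the ymake hash map, and NodeState and CountRehashes are hypothetical names introduced only for this illustration; it is not the code from add_iter.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the per-node state kept during traversal.
struct NodeState {
    bool Visited = false;
};

// Inserts `nodeCount` distinct keys and returns how many times the bucket
// array was resized (i.e. how many rehashes happened) along the way.
// When `preSize` is true, capacity is reserved before the first insert.
std::size_t CountRehashes(std::size_t nodeCount, bool preSize) {
    std::unordered_map<std::uint32_t, NodeState> nodes;
    if (preSize) {
        // Assumption: the final number of nodes is known up front,
        // as graph.Size() is in the PR description.
        nodes.reserve(nodeCount);
    }
    std::size_t rehashes = 0;
    std::size_t buckets = nodes.bucket_count();
    for (std::uint32_t id = 0; id < nodeCount; ++id) {
        nodes.emplace(id, NodeState{true});
        if (nodes.bucket_count() != buckets) {
            ++rehashes;
            buckets = nodes.bucket_count();
        }
    }
    return rehashes;
}
```

With the standard container, reserving space for n elements guarantees that inserting up to n elements triggers no rehash. Without the reservation, the table grows several times as elements arrive. Whether the ymake container behaves identically is an assumption of this sketch.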
8-line change in devtools/ymake/add_iter.h.
Evidence
Warm configure graph benchmark (single ymake invocation, n=15 each):
Statistical validation:
(The confidence interval straddles zero because of the small sample size, n=15, but the Mann-Whitney U test (MWU) confirms the directional effect; confidence level: MEDIUM)
Note: Benchmark measures a single-config ymake run (~13.5s), not the full
two-config run (tools-pic + main, ~60.6s). Percentage improvement applies
identically to each invocation.
Amdahl contribution (combined with Phase 17):
S_(17+19) = 1.0353, contributing to S_cold = 1.157 total cold-build speedup.
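As a rough cross-check of how a relative time reduction maps to a speedup factor, the worked arithmetic below uses only the Δ from the PR title and the ~13.5 s single-config runtime quoted in the note above. Treating ~13.5 s as the baseline is an assumption, both inputs are rounded, and the symbol S_19 is introduced here only for illustration. It does not attempt to reproduce S_(17+19) or S_cold, whose inputs are not given here.

```latex
\[
\frac{|\Delta|}{T_{\text{before}}} \approx \frac{0.26\,\text{s}}{13.5\,\text{s}} \approx 0.019,
\qquad
S_{19} = \frac{T_{\text{before}}}{T_{\text{before}} - |\Delta|}
       = \frac{1}{1 - 0.019} \approx 1.019 .
\]
```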
Supplementary: data/19-statistical-validation.json, data/19-benchmark-before.json, data/19-benchmark-after.json.
Changes
Patch
patches/19-combined.patch
This patch includes three files:
- devtools/ymake/add_iter.h -- the graph optimization (this PR's change)
- devtools/ymake/asio_extern_templates.cpp -- correctness fix for Phase 17
- devtools/ymake/ymake_async.h -- companion to asio fix
The asio portion of the patch requires Phase 17 (patches/17-combined.patch) to be pre-applied; asio_extern_templates.cpp must exist before this patch can modify it. The asio fix itself addresses a problem discovered during Phase 19 work: strand<any_io_executor> cannot be explicitly instantiated under the Unified Executors TS (not Networking TS). It is included here to keep the fix co-located with its discovery context.
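For readers unfamiliar with the extern-template technique that the asio files rely on, here is a generic sketch that does not use asio. TSizedBox and UnwrapInt are hypothetical names. The failure mode shown (a member that does not compile for a given type argument) is one common reason an explicit instantiation is rejected, and is not claimed to be the reason it fails for strand<any_io_executor>.

```cpp
#include <cstddef>
#include <string>

template <typename T>
struct TSizedBox {
    T Value;
    // Compiles only for types T that provide a size() member.
    std::size_t Size() const { return Value.size(); }
};

// Explicit instantiation declaration: normally placed in a header so that
// other translation units do not instantiate the template themselves.
extern template struct TSizedBox<std::string>;

// Explicit instantiation definition: placed in exactly one .cpp file.
// It instantiates every member of the class, including Size().
template struct TSizedBox<std::string>;

// Implicit instantiation of TSizedBox<int> is fine as long as Size() is
// never used. Writing `template struct TSizedBox<int>;` would be rejected,
// because explicit instantiation would also instantiate Size(), and int
// has no size() member.
inline int UnwrapInt(const TSizedBox<int>& box) {
    return box.Value;
}
```

The design trade-off is that explicit instantiation saves repeated compilation work across translation units, but it requires every member of the template to be valid for the chosen arguments.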
To apply only the graph optimization without Phase 17:
    git apply --include='devtools/ymake/add_iter.h' patches/19-combined.patch
Testing
    # Build only (header change in ymake)
    ya make devtools/ymake/bin
A successful build verifies that the change compiles. reserve() does not change hash map semantics -- it only pre-allocates capacity. Verify identical build output:
See upstream/test-results.log for test execution status and environmental constraints. Historical validation: the patch was applied in the yatool Docker container during Phase 19 implementation; the build-plan diff was confirmed empty.
CLA
I hereby agree to the terms of the CLA available at:
https://yandex.ru/legal/cla/?lang=en