perf(ymake): pre-size TUpdIter::Nodes hash map — Δ=-0.26s (-1.9%), p=0.0029, r=0.504 #44

Open
KorsarOfficial wants to merge 1 commit into yandex:main from KorsarOfficial:perf/hashmap-presize

@KorsarOfficial

Pre-size TUpdIter::Nodes hash map for graph traversal

Summary

Call reserve(graph.Size()) on the TUpdIter::Nodes hash map in the
TUpdIter constructor to pre-allocate capacity before DFS graph traversal.
This eliminates incremental rehashing as nodes are discovered, reducing
configure graph time by ~1.9%.

8-line change in devtools/ymake/add_iter.h.
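
To illustrate why pre-sizing helps, here is a minimal sketch that counts bucket-array reallocations while filling a hash map with and without an up-front reserve(). It is not the ymake code: std::unordered_map stands in for the real Nodes container, and CountRehashes is a hypothetical helper introduced only for this illustration.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: counts how many times the bucket array is
// reallocated while inserting `n` keys. std::unordered_map is an
// assumed stand-in for the real TUpdIter::Nodes container.
std::size_t CountRehashes(std::size_t n, bool presize) {
    std::unordered_map<std::size_t, int> nodes;
    if (presize) {
        nodes.reserve(n);  // one up-front allocation sized for n elements
    }
    std::size_t rehashes = 0;
    std::size_t buckets = nodes.bucket_count();
    for (std::size_t i = 0; i < n; ++i) {
        nodes.emplace(i, 0);
        if (nodes.bucket_count() != buckets) {  // table grew: a rehash happened
            buckets = nodes.bucket_count();
            ++rehashes;
        }
    }
    return rehashes;
}
```

With presize set to true the function returns 0 for any n, because reserve(n) guarantees room for n elements without exceeding the maximum load factor. Without it, the table grows several times as elements arrive, and each growth step re-buckets every element already stored.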

Evidence

Warm configure graph benchmark (single ymake invocation, n=15 each):

Metric   Before               After                Delta
Mean     13.587s              13.327s              -0.260s (-1.91%)
Median   13.480s              13.227s              -0.253s
Std      0.359s               0.377s
95% CI   [13.451s, 13.822s]   [13.202s, 13.614s]

Statistical validation:

  • Mann-Whitney U: U1 = 179, p = 0.0029 (one-sided), significant at α = 0.05
  • Effect size r = 0.504 (large by Cohen's convention)
  • Cohen's d = 0.705
  • BCa 95% CI for mean difference (before minus after): [-0.009s, +0.493s]
    (the CI includes zero, so at n=15 the mean difference alone is not
    conclusive; the one-sided MWU test supports a directional effect;
    confidence level: MEDIUM)
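
The reported p-value and effect size can be cross-checked from U and the sample sizes alone. The sketch below assumes the normal approximation to the Mann-Whitney U distribution without continuity or tie correction, and a pooled-standard-deviation form of Cohen's d. Both are assumptions about how the figures were derived, not something the PR states. MwuSummary, SummarizeMwu, and CohensD are hypothetical names.

```cpp
#include <cmath>

struct MwuSummary {
    double z;  // normal-approximation z score
    double p;  // one-sided p-value, P(Z > z)
    double r;  // effect size r = z / sqrt(n1 + n2)
};

// Normal approximation to the Mann-Whitney U test, with no continuity
// or tie correction (an assumption, see the lead-in).
MwuSummary SummarizeMwu(double u, double n1, double n2) {
    const double mean = n1 * n2 / 2.0;
    const double sd = std::sqrt(n1 * n2 * (n1 + n2 + 1.0) / 12.0);
    const double z = (u - mean) / sd;
    const double p = 0.5 * std::erfc(z / std::sqrt(2.0));  // upper tail
    const double r = z / std::sqrt(n1 + n2);
    return {z, p, r};
}

// Cohen's d with a pooled standard deviation for two equal-sized groups.
double CohensD(double meanBefore, double meanAfter,
               double sdBefore, double sdAfter) {
    const double pooled =
        std::sqrt((sdBefore * sdBefore + sdAfter * sdAfter) / 2.0);
    return (meanBefore - meanAfter) / pooled;
}
```

Calling SummarizeMwu(179, 15, 15) gives z ≈ 2.76, p ≈ 0.0029, and r ≈ 0.504, in line with the figures above. CohensD(13.587, 13.327, 0.359, 0.377) on the rounded table values gives about 0.706, close to the reported 0.705.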

Note: The benchmark measures a single-config ymake run (~13.5s), not the full
two-config run (tools-pic + main, ~60.6s). The percentage improvement is
assumed to apply equally to each invocation; the two-config run was not
benchmarked here.

Amdahl contribution (combined with Phase 17):
S_(17+19) = 1.0353, contributing to S_cold = 1.157 total cold-build speedup.

Supplementary: data/19-statistical-validation.json, data/19-benchmark-before.json,
data/19-benchmark-after.json.

Changes

devtools/ymake/add_iter.h
  - Add Nodes.reserve(graph.Size()) in TUpdIter constructor
  - 8-line change; no behavioral modification

Patch

patches/19-combined.patch

This patch includes three files:

  • devtools/ymake/add_iter.h -- the graph optimization (this PR's change)
  • devtools/ymake/asio_extern_templates.cpp -- correctness fix for Phase 17
  • devtools/ymake/ymake_async.h -- companion to asio fix

The asio portion of the patch requires Phase 17 (patches/17-combined.patch)
to be pre-applied; asio_extern_templates.cpp must exist before this patch
can modify it. The asio fix itself corrects an issue discovered during Phase 19
work: strand<any_io_executor> cannot be explicitly instantiated under the
Unified Executors TS (as opposed to the Networking TS). It is included here to
keep the fix next to the context in which it was found.

To apply only the graph optimization without Phase 17:

git apply --include='devtools/ymake/add_iter.h' patches/19-combined.patch

Testing

# Build only (header change in ymake)
ya make devtools/ymake/bin

A successful build confirms only that the change compiles. reserve() does not
change hash map semantics; it only pre-allocates capacity. To verify that
build output is identical:

ya dump build-plan <target> | jq -S . > after.json
diff <(jq -S . baseline.json) after.json  # expected: empty

See upstream/test-results.log for test execution status and environmental
constraints. Historical validation: patch applied in yatool Docker container
during Phase 19 implementation; build-plan diff confirmed empty.

CLA

I hereby agree to the terms of the CLA available at:
https://yandex.ru/legal/cla/?lang=en

@KorsarOfficial
Author

Evidence Report

Full statistical analysis for this optimization:
https://github.com/KorsarOfficial/yatool/releases/download/v1.0-perf-analysis/07-optimization-evidence.pdf — Section 6: Graph Construction (Δ=-0.26s, -1.9%, p=0.0029)

Also available:

All reports: https://github.com/KorsarOfficial/yatool/releases/tag/v1.0-perf-analysis
