Conversation
|
|
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Make `create_def` a query
This comment has been minimized.
This comment has been minimized.
|
💔 Test for ce9351f failed: CI. Failed job:
|
This comment has been minimized.
This comment has been minimized.
98be9b0 to
a6ea5b5
Compare
a6ea5b5 to
2cff70b
Compare
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Make `create_def` a query
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (5a8f10b): comparison URL. Overall result: ❌ regressions - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 1.9%, secondary 2.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.8%, secondary 3.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.0%, secondary -0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 487.131s -> 493.731s (1.35%) |
|
I mean your PR is the "proper" way to do all that, and I wonder if #115613 (comment) is somewhat better because it's incomplete and will have similar perf to this PR if done properly. But even if not, maybe we should just take the hit here and analyze why the queries are so expensive compared to a version specifically tuned to DefId creation. Shorter term tho: maybe just don't call the query from untracked queries like the resolver and instead run the logic directly? |
2cff70b to
9126582
Compare
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Make `create_def` a query
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (fee3d49): comparison URL. Overall result: ❌✅ regressions and improvements - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 1.3%, secondary 0.9%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 6.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 486.61s -> 488.378s (0.36%) |
View all comments
Creating a new definition using
create_defcurrently behaves as a red node for incremental compilation: the caller query must be recomputed at the next run. This is suboptimal.If a query's dependencies have not changed, we'd like the query engine to re-create the definitions and continue replaying the dependency graph and load cached data. This idea falls short with a subtle global state dependency: when creating definitions, we must ensure that
DefPathHashare globally unique, and carry a disambiguator global state around. Because of this, a call tocreate_defmay change its resultDefPathHashdue to global state, and force the caller query to be recomputed. If that happens, the caller query would need to be recomputed, but must not re-create the definitions that the query engine created for it.This PR attempts to solve this issue by using a
create_def_rawquery. This query iseval_always, so it can safely read global state. The query engine sees its return value as theDefPathHash, so will detect if it unexpectedly changed. We benefit from the caching queries do.However, we want multiple successive calls to
create_deffrom the same queries to result in a new definition each time. For instance when creating anonymous definitions that are only identified by their parent and a disambiguator. To allow this, we add 2 extra parameters tocreate_def_rawthat are unique to each call by the caller query: the query'sDepNodeand a counter of calls from within that query.Consider this example. Some query A calls create_def twice, we generate calls to:
create_def(CRATE_DEF_ID, Use) = create_def_raw(CRATE_DEF_ID, Use, A, 0)This returns::{use#0}.create_def(CRATE_DEF_ID, Use) = create_def_raw(CRATE_DEF_ID, Use, A, 1)The query has different arguments, so returns a brand new::{use::1}.create_def(CRATE_DEF_ID, Impl) = create_def_raw(CRATE_DEF_ID, Impl, A, 2)This returns::{impl#0}.When replaying, if nothing changed, the query engine will call
create_def_rawthrice, andAwill be green.If another query created a definition with
(CRATE_DEF_ID, Impl),::{impl#0}is taken. The 3rd call tocreate_defneeds to use another disambiguator, that call tocreate_def_rawreturns::{impl#1}and be correctly red.Aneeds to be re-computed. The definitions foruses have already been created when walking the dep-graph. The re-computation calls use the results increate_def_rawcache, aka::{use#0},::{use#1}and::{impl#1}, and we do not have duplicate definitions.Reopening #142711
The original PR was halted pending design work, so this one is opened as draft.
r? @oli-obk