Add elapsed-time telemetry to RunProcessAndGetOutput test helper by jankratochvilcz · Pull Request #13829 · dotnet/msbuild

jankratochvilcz · 2026-05-21T09:26:26Z

The bootstrap-MSBuild test helper (RunProcessAndGetOutput in src/UnitTests.Shared/RunnerUtilities.cs) currently surfaces wall-clock duration only when the process exceeds its timeout. On success the caller has no signal at all about how close to the budget it ran.

This makes timeout tuning blind — both raising a budget (no data to justify the new value) and tightening one later (no data showing actual elapsed never approached the previous value).

Change

Wrap the existing WaitForExit with a Stopwatch and log elapsedMs / budgetMs / headroomMs on success via the existing WriteOutput path. Pure additive — no behavior change.

Risk

Test-only helper, single WriteOutput line added. All ~50 callers benefit; none are affected.

Copilot

Pull request overview

Adds elapsed-time telemetry to the RunnerUtilities.RunProcessAndGetOutput unit-test helper so test runs emit timing info even when they succeed, helping callers tune timeout budgets with real data.

Changes:

Start a Stopwatch around process execution/waiting logic.
Emit a new WriteOutput line that reports elapsed time, configured budget, and computed headroom.

Comments suppressed due to low confidence (1)

src/UnitTests.Shared/RunnerUtilities.cs:180

In DebugUnitTests mode the timeout budget isn’t enforced (the code uses WaitForExit() with no timeout), but the telemetry line still reports budget/headroom based on timeoutMilliseconds. This can produce misleading negative headroom for successful runs; consider either skipping the budget/headroom fields when Traits.Instance.DebugUnitTests is true or logging a distinct message indicating the run was unbounded.

                Stopwatch sw = Stopwatch.StartNew();
                if (Traits.Instance.DebugUnitTests)
                {
                    p.WaitForExit();
                }
                else if (!p.WaitForExit(timeoutMilliseconds))

github-actions

Review Summary — PR #13829

Clean, low-risk change adding timing telemetry to test infrastructure (RunnerUtilities.cs).

Status	Dimensions
✅ LGTM (22)	Backwards Compat, ChangeWave, Performance, Test Coverage, Error Messages, Logging, String Comparison, API Surface, Target Authoring, Design, Cross-Platform, Code Simplification, Concurrency, Naming, SDK Integration, Idiomatic C#, File I/O, Build Infrastructure, Scope, Evaluation Model, Dependency Mgmt, Security
📝 NIT (1)	Documentation Accuracy — phantom workflow reference
⏭️ N/A (1)	Correctness — no edge case concerns for local `Stopwatch` in test code

Assessment: The Stopwatch is a local variable with no concurrency/shared-state concerns. The System.Diagnostics using already exists. The arithmetic (timeoutMilliseconds - sw.ElapsedMilliseconds) correctly promotes int to long. Negative headroom in the DebugUnitTests path (no timeout enforcement) or when stream flushing exceeds budget is valid and useful information.

One minor nit filed inline about the comment referencing a non-existent workflow.

Note

🔒 Integrity filter blocked 1 item

The following item were blocked because they don't meet the GitHub integrity level.

#13829 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Expert Code Review (on open) for issue #13829 · ● 1.5M

jankratochvilcz · 2026-05-21T09:38:52Z

Thanks for the review. Addressed the legitimate concern; commented on the others.

Addressed (force-pushed):

DebugUnitTests mode no longer logs a misleading budget/headroom. When that flag is set the timeout is intentionally not enforced ( p.WaitForExit() without timeout), so the line now reads:
Process {pid} exited in {ms}ms (DebugUnitTests: budget not enforced)
Non-debug runs are unchanged.

Not addressed:

The reviewer's Integrity filter blocked / phantom workflow notes are auto-generated agent meta; nothing actionable for the diff itself.

This PR is test infra only (src/UnitTests.Shared/RunnerUtilities.cs), so production hot-path concerns don't apply.

Replaces the global default-timeout bump (reviewed against on PR #45) with elapsed-time logging in the bootstrap-MSBuild test helper. The Stopwatch around WaitForExit lets callers see actual ms taken vs the budget, so when follow-up PRs add per-test timeout overrides (e.g. the unmerged NetTaskHostTest_FallbackToDotnet tests tracked in #41/#42), the author can pick a tight value based on data instead of guessing. Rationale for not bumping the default: the helper is called from 53 test sites across 11 files; a global bump would silently triple the worst-case budget for tests that today don't even use 30s, masking perf regressions. The two flaky tests cited in the original issues don't exist in main yet (they're on a feature branch), so per-call scoping isn't possible today; this PR makes the data available so it can be scoped correctly when they land. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jankratochvilcz · 2026-05-21T16:56:20Z

Closing in favor of #13836. The required license/cla check has been pending for hours on the head commit (c0801793) even though it reported success on every earlier commit in this PR. Re-pushed the exact same head SHA to a new branch and opened #13836 to mint a fresh CLA check — buggy CLA pipeline.

jankratochvilcz · 2026-05-21T16:57:15Z

Heads-up: auto-merge fired ~6 s before I checked, so this actually landed. Disregard the prior comment about #13836 — that PR is being closed as a duplicate.

jankratochvilcz · 2026-05-22T14:08:38Z

24h post-merge data from main (dnceng-public pipeline 75, builds 1430301, 1430551, 1430938, 1431045, 1431261).

Aggregate across 5 414 process exits captured by the new telemetry (no DebugUnitTests mode runs observed):

	min ms	p50 ms	p95 ms	max ms
elapsed	48	1 288	5 745	19 323
budget	30 000	30 000	900 000	900 000
remaining	13 641	28 934	897 420	899 680

Useful signal:

remaining < 0 count: 0. No process came close to its budget across 5k+ exits — the helper's budget enforcement is comfortably correct.
remaining > 10 × elapsed count: 5 059 (~93% of all exits). Most callers have very generous budgets.

Egregious cases — note that Microsoft.Build.BuildCheck.UnitTests.EndToEndTests callers run with a 900 000 ms (15 min) budget while typical actual elapsed is 700–12 000 ms. That's a ~75–900x oversize. Example line:

Process 15880 exited in 12047ms (budget 900000ms, 887953ms remaining)

Not opening a follow-up PR for that here (out of scope for this telemetry PR), but flagging it as a worthwhile target for a separate cleanup — the helper now makes that kind of audit possible across all ~50 callers without extra instrumentation.

Copilot AI review requested due to automatic review settings May 21, 2026 09:26

Copilot started reviewing on behalf of jankratochvilcz May 21, 2026 09:26 View session

Copilot AI reviewed May 21, 2026

View reviewed changes

Comment thread src/UnitTests.Shared/RunnerUtilities.cs Outdated

github-actions Bot reviewed May 21, 2026

View reviewed changes

Comment thread src/UnitTests.Shared/RunnerUtilities.cs Outdated

jankratochvilcz force-pushed the upstream-pr/nettaskhost-telemetry branch from 823de5a to 1e89a6a Compare May 21, 2026 09:38

jankratochvilcz force-pushed the upstream-pr/nettaskhost-telemetry branch from 1e89a6a to e79c332 Compare May 21, 2026 09:43

jankratochvilcz force-pushed the upstream-pr/nettaskhost-telemetry branch from e79c332 to 34add7c Compare May 21, 2026 09:43

AlesProkop approved these changes May 21, 2026

View reviewed changes

Comment thread src/UnitTests.Shared/RunnerUtilities.cs Outdated

Apply review nit: 'Nms remaining' wording

ddf7d2d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jankratochvilcz enabled auto-merge (squash) May 21, 2026 13:35

Merge branch 'main' into upstream-pr/nettaskhost-telemetry

c080179

jankratochvilcz mentioned this pull request May 21, 2026

Add elapsed-time telemetry to RunProcessAndGetOutput test helper #13836

Closed

jankratochvilcz merged commit f095875 into dotnet:main May 21, 2026
11 checks passed

This was referenced May 22, 2026

[release/10.0.4xx] Source code updates from dotnet/msbuild dotnet/dotnet#6758

Merged

[main] Source code updates from dotnet/msbuild dotnet/dotnet#6760

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add elapsed-time telemetry to RunProcessAndGetOutput test helper#13829

Add elapsed-time telemetry to RunProcessAndGetOutput test helper#13829
jankratochvilcz merged 3 commits into
dotnet:mainfrom
jankratochvilcz:upstream-pr/nettaskhost-telemetry

jankratochvilcz commented May 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

Uh oh!

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jankratochvilcz commented May 21, 2026

Change

Risk

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Review Summary — PR #13829

Uh oh!

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

Uh oh!

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants