Add elapsed-time telemetry to RunProcessAndGetOutput test helper by jankratochvilcz · Pull Request #13836 · dotnet/msbuild

jankratochvilcz · 2026-05-21T16:56:04Z

Re-opening of #13829 to get a fresh license/cla check.

#13829 has been blocked for hours with mergeStateStatus=BLOCKED even though every msbuild-pr leg is green and the PR is approved (@AlesProkop). Root cause: the required license/cla check is missing on the head commit c0801793 — the Microsoft policy service reported on earlier commits in the PR but never on the merge commit. Same code, identical SHA pushed to a new branch to mint a fresh CLA check.

Below is the original description from #13829:

RunProcessAndGetOutput is the helper that almost every cross-process MSBuild test uses (net472/netcore task host launches, multi-proc node smoke tests, etc.). It already enforces a per-call timeout, but on failure we only see "process X is active for more than N sec" — no signal at all on the happy path about how close we ran to the budget.

This is exactly the kind of data we need to retune the three flaky ToolTask tests stabilized in the sibling PRs (#13828, #13830) — they pick their timeouts and slow-vs-fast gaps blind right now.

Changes

Capture sw.ElapsedMilliseconds immediately after WaitForExit() returns, before the post-exit drain loop, so the telemetry reports budget-relevant time only.
Log one line per process exit:
- Normal mode: Process {pid} exited in {N}ms (budget {T}ms, {T-N}ms remaining)
- DebugUnitTests mode (no enforced timeout): Process {pid} exited in {N}ms (DebugUnitTests: budget not enforced)
No behavioral change. Test-only file.

Risk

Test-helper-only changes. Single file: src/UnitTests.Shared/RunnerUtilities.cs. No product code touched.

Closes #13829.

Replaces the global default-timeout bump (reviewed against on PR #45) with elapsed-time logging in the bootstrap-MSBuild test helper. The Stopwatch around WaitForExit lets callers see actual ms taken vs the budget, so when follow-up PRs add per-test timeout overrides (e.g. the unmerged NetTaskHostTest_FallbackToDotnet tests tracked in #41/#42), the author can pick a tight value based on data instead of guessing. Rationale for not bumping the default: the helper is called from 53 test sites across 11 files; a global bump would silently triple the worst-case budget for tests that today don't even use 30s, masking perf regressions. The two flaky tests cited in the original issues don't exist in main yet (they're on a feature branch), so per-call scoping isn't possible today; this PR makes the data available so it can be scoped correctly when they land. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jankratochvilcz · 2026-05-21T16:56:22Z

@AlesProkop sorry for the re-ping — could I get a re-approval here? This is #13829 reopened verbatim (same head SHA, just on a new branch) because the required license/cla check was stuck pending for hours on the merge commit even though it had reported success on the earlier commits in the PR. Trying to unblock past what looks like a buggy CLA pipeline. The PR is your already-approved version with your wording nit applied ({N}ms remaining).

jankratochvilcz · 2026-05-21T16:57:11Z

Closing as duplicate — #13829 auto-merged 6 seconds before I opened this one (the auto-merge I'd set up earlier fired the moment license/cla finally reported on the head commit). The content is on main already (merge commit f095875). Sorry for the noise, @AlesProkop — no action needed.

github-actions

Review Summary

This is a straightforward, low-risk diagnostic improvement that adds timing output to RunProcessAndGetOutput. The change helps developers investigate timeout-related test failures by showing how much of the timeout budget was consumed. Good addition for debuggability.

No blocking issues found.

Minor observations (all non-blocking):

#	Severity	Issue
1	Low	Stopwatch measures wait time, not total execution time from `p.Start()`. The log message "exited in Xms" could be slightly misleading.
2	Low	`sw.Stop()` is functionally dead — the stopwatch value was already captured and is never read again.
3	Low	Negative "remaining" is theoretically possible due to scheduling jitter after WaitForExit returns.
4	Nit	"Telemetry" label in comment is a misnomer for diagnostic log output.

What's good:

Correctly captures elapsed before the post-exit drain WaitForExit() so the budget comparison is accurate.
Correctly handles the DebugUnitTests path (no misleading budget info when budget isn't enforced).
No behavioral change to existing logic — purely additive diagnostic output.
Scoped to test infrastructure only — zero risk to production code.

Note

🔒 Integrity filter blocked 2 items

The following items were blocked because they don't meet the GitHub integrity level.

#13836 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
#13836 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Expert Code Review (on open) for issue #13836 · ● 1.2M

github-actions · 2026-05-21T17:02:30Z

                p.StandardInput.Dispose();

                TimeSpan timeout = TimeSpan.FromMilliseconds(timeoutMilliseconds);
+                Stopwatch sw = Stopwatch.StartNew();


Nit (documentation gap): The stopwatch starts after p.Start() and async output reading has begun (lines 168–172). This means exitElapsedMs measures wait time (time spent in WaitForExit), not total process execution time. The process may have done significant work between Start() and this line.

This is arguably the right thing to measure for timeout-budget purposes (since it's the same duration WaitForExit(timeout) is checking), but the log message says "exited in {exitElapsedMs}ms" which could be misread as total execution time. Consider rewording to something like "Process {pid} waited {exitElapsedMs}ms for exit" or adding a brief inline comment noting it's measuring wait duration, not wall-clock execution time from process start.

github-actions · 2026-05-21T17:02:30Z

                pid = p.Id;
                successfulExit = p.ExitCode == 0;
+
+                // Telemetry: log actual elapsed time so callers tuning their timeouts can see


Nit (naming): The comment says "Telemetry" but this is diagnostic logging via WriteOutput, not actual telemetry infrastructure (no counters, no structured events, no collection pipeline). Consider using "Diagnostics:" or just removing the label — the comment already explains the purpose clearly without it.

github-actions · 2026-05-21T17:02:30Z

+                }
+                else
+                {
+                    WriteOutput($"Process {pid} exited in {exitElapsedMs}ms (budget {timeoutMilliseconds}ms, {timeoutMilliseconds - exitElapsedMs}ms remaining)");


Minor robustness: timeoutMilliseconds - exitElapsedMs mixes int and long, producing a long result. In normal operation this should always be non-negative (since WaitForExit(timeout) would have thrown on timeout). However, there's a theoretical window where thread scheduling after WaitForExit returns could make exitElapsedMs slightly exceed timeoutMilliseconds. A negative "remaining" value would be confusing in the log.

Consider using Math.Max(0, timeoutMilliseconds - exitElapsedMs) or simply documenting that negative values indicate measurement jitter. This is very low severity — just a polish item.

github-actions · 2026-05-21T17:02:30Z

                // See https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.process.waitforexit?view=net-6.0#system-diagnostics-process-waitforexit(system-int32).
                // The overload WaitForExit() waits for the error and output to be handled. The WaitForExit(int timeout) overload does not, so we could lose the data.
                p.WaitForExit();
+                sw.Stop();


Nit (dead code): sw.Stop() has no observable effect here — exitElapsedMs was already captured on line 190 and the stopwatch is never read again after this point. It could be removed to reduce noise, or if you prefer to keep it for clarity/hygiene that's fine too — just noting it's functionally inert.

Jan Krivanek and others added 3 commits May 21, 2026 11:43

Apply review nit: 'Nms remaining' wording

ddf7d2d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge branch 'main' into upstream-pr/nettaskhost-telemetry

c080179

Copilot AI review requested due to automatic review settings May 21, 2026 16:56

Copilot started reviewing on behalf of jankratochvilcz May 21, 2026 16:56 View session

jankratochvilcz mentioned this pull request May 21, 2026

Add elapsed-time telemetry to RunProcessAndGetOutput test helper #13829

Merged

jankratochvilcz enabled auto-merge (squash) May 21, 2026 16:56

jankratochvilcz closed this May 21, 2026

auto-merge was automatically disabled May 21, 2026 16:57
Pull request was closed

jankratochvilcz deleted the upstream-pr/nettaskhost-telemetry-v2 branch May 21, 2026 16:57

github-actions Bot reviewed May 21, 2026

View reviewed changes

jankratochvilcz review requested due to automatic review settings May 21, 2026 17:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add elapsed-time telemetry to RunProcessAndGetOutput test helper#13836

Add elapsed-time telemetry to RunProcessAndGetOutput test helper#13836
jankratochvilcz wants to merge 3 commits into
dotnet:mainfrom
jankratochvilcz:upstream-pr/nettaskhost-telemetry-v2

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot May 21, 2026

Uh oh!

github-actions Bot May 21, 2026

Uh oh!

github-actions Bot May 21, 2026

Uh oh!

github-actions Bot May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jankratochvilcz commented May 21, 2026

Changes

Risk

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

jankratochvilcz commented May 21, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Review Summary

Minor observations (all non-blocking):

What's good:

Uh oh!

github-actions Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant