Propagation script fixes: resume, PR handling, descriptions #382
parth21999 wants to merge 58 commits into master
…ltering
- Ctrl+C during watch loop sleep intervals shows a menu: Enter=close PR and stop, n=stop but leave PR open, r=resume
- Ctrl+C during fetch (az/gh commands) is caught and prompts the same menu
- Resume detects already-merged PRs and skips them
- Resume reuses existing active PRs instead of creating duplicates
- Skip license/CLA checks when detecting CI completion
- Suppress stray 'False' output from show-propagation-status
- Remove stray Write-Host in get-repo-type
- Add -poll_interval CLI arg (default 15s)
- Reduce auto-merge/autocomplete poll intervals to 2s
- Replace fixed 150s pipeline wait with polling for CI checks
- Reuse existing PR on Azure TF401179 error
- Dot-source build_graph.ps1, remove env var cache
- Remove -useCachedRepoOrder (superseded by known graph)
- Fetch known_graph.json from remote, auto-save on mismatch
- Add -ForceBuildGraph flag
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
update-submodules-to-fixed-commits now checks the current submodule SHA before checkout. If the current commit is already at or ahead of the fixed commit (target is an ancestor of current), the submodule is left as-is. This prevents the script from downgrading a submodule that was already updated by someone else or by a previous propagation run. Uses git merge-base --is-ancestor to determine commit ordering. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Clicking the notification opens the PR URL in the browser. Uses Windows PowerShell 5.1 for WinRT toast API (not available in pwsh 7). Runs async via Start-Process so it doesn't block. Best-effort — silently ignored if notifications are unavailable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Start-Process -Command doesn't handle multi-line scripts reliably. Write the toast script to a temp .ps1 file and execute with -File. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build_graph.ps1 clones repos with --depth 1, but update-local-repo needs full history to compute git log ranges for PR descriptions. Now detects shallow clones and runs git fetch --unshallow before git pull. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
git submodule update --init creates submodules with minimal history. collect-upstream-changes needs git log ranges inside submodules. Now detects and unshallows submodules during update-submodules-to-fixed-commits. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When collect-upstream-changes returns empty (e.g., only YAML ref changes or submodule updates where all commits are dep-update commits), build a fallback description from the staged file list. Identifies changed submodules and modified files separately. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
close-pr now queries the PR status before trying to abandon (Azure) or close (GitHub). If the PR is already completed/merged or abandoned/closed, it skips the operation instead of failing with TF401181 or similar errors. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
A single transient az repos pr show failure was triggering fail-with-status even when the PR had already completed. This caused the script to try abandoning an already-merged PR (TF401181). Added get-pr-status-with-retry (3 attempts, 5s delay) so transient API failures don't immediately abort the propagation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
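The retry behaviour described above (3 attempts, 5s delay) can be sketched roughly as follows. This is a minimal illustration, not the script's actual get-pr-status-with-retry: the function name Invoke-WithRetry and its parameters are hypothetical, and the query is passed in as a script block so the sketch does not depend on az or gh.

```powershell
# Hypothetical helper: retry a status query a few times before giving up.
function Invoke-WithRetry
{
    param(
        [Parameter(Mandatory)] [scriptblock] $Query,  # e.g. a wrapper around 'az repos pr show'
        [int] $MaxAttempts = 3,
        [int] $DelaySeconds = 5
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++)
    {
        try
        {
            $result = & $Query
            if ($null -ne $result) { return $result }
        }
        catch
        {
            Write-Warning "Attempt $attempt of $MaxAttempts failed: $_"
        }

        # Only sleep if another attempt is coming.
        if ($attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
    }

    return $null  # all attempts failed; the caller decides whether this is fatal
}
```

Returning $null instead of throwing leaves the decision to abort with the caller, which is the point of the fix: one transient failure should not by itself end the propagation.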
When watch-azure-pr-policies returns false due to a non-blocking or optional policy failure, the PR can still autocomplete. Previously the script checked PR status once and immediately called fail-with-status if not yet completed, abandoning the PR before autocomplete had time to fire. Now waits up to 2 minutes for autocomplete before giving up, matching the behavior on the success path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On resume, update-repo called update-local-repo (which pushes to the branch) before checking for an existing PR URL. This pushed a new iteration to the existing PR unnecessarily. Restructured update-repo to check for existing PR first. If found, skip update-local-repo entirely and just monitor the PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three fixes to prevent resume from creating new iterations on existing PRs:
1. Before update-local-repo, query the remote for an active PR on the branch. If found, skip update-local-repo entirely and just monitor the existing PR. This handles the case where the state file was saved before the PrUrl was set (fail-with-status exited before the main-loop save).
2. Save state in fail-with-status before exiting, so future resumes have the PrUrl available without needing the remote query.
3. Store state params (branch_name, repo_order, etc.) as globals so fail-with-status can access them.
Also fixed: az repos pr list returns a single PSObject (not array) when there's one result. Wrap with @() to force array. And repository.webUrl is empty in pr list output, so construct the URL from org/project/repo instead.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
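The single-object versus array behaviour mentioned above is easy to reproduce without calling az. In this small sketch Get-FakePrList is a hypothetical stand-in for a PR list query that happens to return exactly one result.

```powershell
# Hypothetical stand-in for a PR list query that yields exactly one result.
function Get-FakePrList { [pscustomobject]@{ pullRequestId = 42; status = 'active' } }

$unwrapped = Get-FakePrList      # a single PSCustomObject, not an array
$wrapped   = @(Get-FakePrList)   # the array subexpression always yields an array

$unwrapped -is [array]           # False
$wrapped -is [array]             # True
$wrapped.Count                   # 1
$wrapped[0].pullRequestId        # 42
```

With the @() wrapper, code that indexes into the result or checks Count behaves the same for one result as for several.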
When resuming with an existing PR (saved or discovered), the script now checks if any submodule on current master is ahead of the fixed commit the PR would set. If so, the propagation fails with a clear message telling the user to abandon the PR and start fresh. Previously, the regression guard in update-submodules-to-fixed-commits was never reached on resume because update-local-repo was skipped. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Azure DevOps auto-links #NNN patterns to work items. GitHub commit subjects contain PR numbers like (#376) which get matched to random ADO bugs. Strip these patterns when collecting upstream changes for the PR description. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace (#NNN) in commit subjects with full GitHub URLs like (https://github.com/Azure/c-build-tools/pull/NNN) so ADO doesn't auto-link them as work items. Add a 'Related PRs' section to the PR body with links to all PRs created during this propagation run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
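The (#NNN) rewrite can be done with a single regex substitution. The sketch below is illustrative only: Convert-PrReference and its parameters are hypothetical names, and the repo URL is passed in rather than detected.

```powershell
# Hypothetical helper: rewrite "(#NNN)" in a commit subject as a full PR URL
# so Azure DevOps does not auto-link the number to a work item.
function Convert-PrReference
{
    param(
        [string] $Subject,
        [string] $RepoUrl   # e.g. https://github.com/Azure/c-build-tools
    )

    # `$1 is escaped so PowerShell passes a literal $1 to the regex engine,
    # which substitutes the captured PR number.
    return $Subject -replace '\(#(\d+)\)', "($RepoUrl/pull/`$1)"
}

Convert-PrReference -Subject 'Fix resume handling (#376)' -RepoUrl 'https://github.com/Azure/c-build-tools'
# -> Fix resume handling (https://github.com/Azure/c-build-tools/pull/376)
```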
1. collect-upstream-changes returns null when no submodule diffs exist.
The fallback code did $upstream_changes += @{Repo=...} on null,
which PowerShell treats as adding keys to a hashtable, crashing
with 'Key already added'. Fix: initialize as @() before appending.
2. show-propagation-status declared success when failed==0, ignoring
pending repos. Now requires both failed==0 AND pending==0.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
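The first crash can be reproduced in isolation. The variable names in this sketch are illustrative; it only demonstrates the operator behaviour, not the script's actual fallback code.

```powershell
$changes = $null
$changes += @{ Repo = 'a' }        # $null + hashtable: $changes becomes a hashtable
$changes -is [hashtable]           # True

$merge_failed = $false
try
{
    $changes += @{ Repo = 'b' }    # hashtable + hashtable merges keys; 'Repo' collides
}
catch
{
    $merge_failed = $true
    Write-Host "merge failed: $($_.Exception.Message)"
}

# The fix: start from an empty array so += appends elements instead of merging keys.
$fixed = @()
$fixed += @{ Repo = 'a' }
$fixed += @{ Repo = 'b' }
$fixed.Count                       # 2
```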
Pop-Location
return $would_regress
}
wouldn't this be better situated in git_operations.ps1?
#Closed
Merged the saved-PrUrl and discovered-PR branches into a single flow: resolve existing PR URL (saved or remote query), then handle it in one place. Removes the duplicate regression check, monitoring logic, and repo-type detection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# unknown repo type
}

if ($already_merged)
the github/azure branches code should be moved to azure_repo_ops.ps1 and github_repo_ops.ps1 #Closed
# update dependencies for given repo
function update-repo
{
update-repo should be moved to its own ps1 file #Closed
Move platform-specific logic out of propagate_updates.ps1 into the appropriate helper files:
- check-pr-would-regress -> git_operations.ps1
- find-active-azure-pr, check-azure-pr-completed, monitor-azure-pr, close-pr-azure -> azure_repo_ops.ps1
- find-active-github-pr, check-github-pr-merged, monitor-github-pr, close-pr-github -> github_repo_ops.ps1
- close-pr in status_tracking.ps1 now dispatches to platform helpers
- update-repo -> new update_repo.ps1 using platform helpers
- Truncate PR body to 4000 chars for Azure CLI limits
- Use --body-file for GitHub PR creation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
}
else
{
fail-with-status "Failed to create PR for repo $repo_name"
If we check that there is no PR for the branch before creating the PR, then do we really need to check this again? #Resolved
# Reset console color - git checkout may leave ANSI color codes active
Write-Host "`e[0m" -NoNewline
}
}
can some of this logic be extracted into a common helper function between this and check-pr-would-regress? #Resolved
# Toast notifications are best-effort - don't fail propagation if they don't work
}
}
This is how it is done in another place:
name: windows-toast-notification
description: Show a Windows toast notification with a brief summary message.
Show a Windows toast notification. Run this PowerShell, replacing <message> with a short summary (e.g., "Build succeeded", "Tests passed (42/42)", "PR created (#1234)"):
Add-Type -AssemblyName System.Windows.Forms
$n = New-Object System.Windows.Forms.NotifyIcon
$n.Icon = [System.Drawing.SystemIcons]::Information
$n.BalloonTipTitle = "Copilot CLI"
$n.BalloonTipText = "<message>"
$n.Visible = $true
$n.ShowBalloonTip(5000)
# REM Start-Sleep -Seconds 2
# $n.Dispose()
can it be done like that at all? #Resolved
…fy toast
1. Simplify TF401179 fallback in create-pr-azure: existing PRs are now detected by update-repo before reaching create-pr-azure, so the redundant query is removed. Just fail on unexpected errors.
2. Extract test-submodule-is-ahead helper in git_operations.ps1, shared by check-pr-would-regress and update-submodules-to-fixed-commits.
3. Replace WinRT toast (temp file + PowerShell 5.1 subprocess) with System.Windows.Forms.NotifyIcon which works directly in pwsh 7.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On resume, if the saved PR was abandoned (build failed, someone else merged changes), the script now detects this and falls through to the fresh update path instead of monitoring the dead PR forever. Flow: merged -> skip, abandoned -> fresh update, active -> monitor. Renamed check-azure-pr-completed/check-github-pr-merged to get-azure-pr-status/get-github-pr-status returning full status strings. Added get-pr-disposition returning merged/abandoned/active. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Removed 2>&1 redirection so users can see what Copilot is doing in real time. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When -AutoFix is set, leave failed PRs open so resume can find them as 'active' and run autofix directly. Previously, fail-with-status closed the PR, causing resume to see it as 'abandoned' and create a fresh PR+build instead of fixing the existing one. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On resume with -AutoFix, if the previous PR was closed/abandoned, reopen it and run autofix directly instead of creating a fresh PR. This avoids a full rebuild cycle. Added reopen-pr-github and reopen-pr-azure helper functions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PowerShell captures stdout from external commands inside functions as part of the return value. Using Out-Host forces the output to the console instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
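A small sketch of the difference, using the running PowerShell executable as a stand-in for an external command such as git or az. The function names are hypothetical, and the sketch assumes it is run from a regular PowerShell process so that the executable path resolves.

```powershell
# Path of the PowerShell executable running this script, used as a stand-in external command.
$exe = (Get-Process -Id $PID).Path

function Invoke-Leaky
{
    # The child's stdout lands on the pipeline and joins the return value.
    & $exe -NoProfile -Command "Write-Output 'build output'"
    return $true
}

function Invoke-Clean
{
    # Out-Host sends the child's stdout to the console instead of the pipeline.
    & $exe -NoProfile -Command "Write-Output 'build output'" | Out-Host
    return $true
}

$leaky = Invoke-Leaky   # two items: 'build output' and $true
$clean = Invoke-Clean   # just $true
```

A caller that tests the result with if ($leaky) still sees a truthy value, but an equality or type check against $true no longer behaves as intended, which is why the leak is easy to miss.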
On resume, the saved PrUrl may be a stale closed PR while a newer active PR exists from the same branch. Now queries the remote first for an active PR, falls back to saved URL only if none found. This also fixes the gh pr reopen failure when another PR from the same branch already exists. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot CLI outputs UTF-8 emoji/icons which render as garbled characters (e.g. GùÅ) when the console encoding doesn't match. Set Console.OutputEncoding to UTF-8 before running Copilot and restore it after. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Copilot CLI was using local skills (cmake-config, build-repo) which may not exist on other users' machines. Now:
- Prompt includes explicit CMake configure/build commands
- Added --no-custom-instructions to prevent loading AGENTS.md/skills
- Instructions cover repo validation and traceability targets
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
All autofix guard conditions now also check propagation_cancelled. Previously, Ctrl+C -> close PR -> watch returns failure -> autofix triggered even though the user explicitly cancelled. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Detects the default CMake Visual Studio generator from cmake --help and includes it in the prompt. Tells Copilot to use EXACTLY that generator and not try others. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
For GitHub repos, post /AzurePipelines run comment after autofix pushes to retrigger the gate build. Azure repos trigger CI automatically on push. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When checks pass but auto-merge doesn't complete within 120s, attempt gh pr merge --squash --delete-branch directly instead of closing the PR and failing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
monitor-github-pr (resume path) was returning immediately after checks passed without waiting for the PR to merge. Added the same auto-merge wait + direct merge fallback as update-repo-github. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closing a GitHub PR cancels auto-merge. When reopening a PR or resuming monitoring of an active PR, re-enable auto-merge/autocomplete so the PR merges automatically once checks pass.
- reopen-pr-github: enables auto-merge immediately after reopen
- update-repo active path: re-enables auto-merge/autocomplete before monitoring
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After git add, check git diff --cached --name-only. If no files are staged, set output to 'nothing to commit' and skip the commit+push. Previously, git commit could succeed with an empty diff (e.g., when the branch was force-pushed with identical content), creating a PR with 0 files changed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Azure Pipelines rejects /AzurePipelines run if the PR is updated after the comment (e.g., by enabling auto-merge). Now the trigger comment is posted AFTER auto-merge is enabled in both:
- reopen-pr-github
- update-repo resume monitoring path
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When user picks 'leave PR open' on Ctrl+C, the watch returns failure. The monitor/watch functions were calling fail-with-status which closes the PR despite the user's choice. Now check propagation_cancelled and just break instead. Fixed in monitor-github-pr, update-repo-github, and wait-until-complete-azure. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1. On fresh runs (no -Resume), skip resolve-existing-pr entirely. The branch is new so no PR can exist. Only resume needs the PR lookup/disposition/reopen logic.
2. GitHub autofix only triggers when watch_result.Message starts with 'Failed:', not on timeout, cancellation, or other errors.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Test-ChecksComplete now checks for failed checks BEFORE checking for in-progress ones. If the Build check has failed, there's no point waiting for Status/Required reviewers to complete — declare failure immediately so autofix can start sooner. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Added get-failed-build-logs that fetches error context from CI:
- Azure: gets validation errors and failed step issues from build timeline
- GitHub: gets failed check links and run logs (last 200 lines)
The logs are included in the Copilot prompt so it can diagnose the error directly (e.g., YAML validation failures) without needing to build locally first.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
get-failed-build-logs calls get-repo-type and get-azure-org-project, which both Push-Location into the repo by its relative name. When called from inside the repo directory, this tries repo/repo, which doesn't exist. Fix: fetch build logs from the work dir (before Push-Location into the repo) and pass them to invoke-copilot-autofix via -build_logs parameter. The function falls back to fetching from work dir if no logs are provided. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GitHub repos use Azure Pipelines for CI, not GitHub Actions. gh run view --log-failed returns nothing. Now extracts the Azure build ID from the check URL (buildId=NNNN) and fetches validation errors + failed step details from the Azure DevOps API. Extracted shared get-azure-build-failure-details function used by both GitHub and Azure repo paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
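Pulling the build ID out of a check URL is a one-line regex match. The helper name and the example URL below are hypothetical; the only assumption taken from the commit message is that the URL carries a buildId=NNNN query parameter.

```powershell
# Hypothetical helper: extract the Azure Pipelines build ID from a check URL.
function Get-BuildIdFromUrl
{
    param([string] $Url)

    # Match buildId only as a query parameter, after '?' or '&'.
    if ($Url -match '[?&]buildId=(\d+)') { return [int]$Matches[1] }
    return $null
}

# Example URL is made up for illustration.
Get-BuildIdFromUrl 'https://dev.azure.com/example-org/example-project/_build/results?buildId=12345&view=results'
# -> 12345
```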
check-pr-would-regress now also returns AlreadyUpToDate when ALL submodules on master are at or ahead of the fixed commits. In this case, the PR is redundant (someone else already updated the repo). The script closes the redundant PR and skips the repo instead of monitoring a PR that would produce no net change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Proof Of Presence is a manual approval check that always shows as failed until a human completes it. Filter it out alongside license/CLA checks so it doesn't trigger fail-fast or autofix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
test-pr-checks-already-failed now only considers Build/Gate/CI checks. Status policies, Required reviewers, Proof of Presence, and other non-build checks are filtered out — these are not fixable by autofix and should not trigger it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…message The message appeared while the build was still running, caused by non-blocking policies (Status, Proof of Presence) failing before the build completed. Removed the message — the silent autocomplete poll still runs without the misleading output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Test-ChecksComplete was failing fast when any blocking check failed, including Status policies which are external checks unrelated to the build. Now only fail-fast on checks whose name starts with 'Build'. A failed Status check while 42 Build checks are still pending should not trigger autofix — the build hasn't even finished yet. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Build policy may have IsBlocking=false in some repos, causing the fail-fast check to miss failed Build jobs. Now checks all ci_checks (not just blocking_checks) for Build failures, since a failed build is always actionable regardless of blocking status. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
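Taken together, the last few commits describe an evaluation order for CI checks: failed Build checks first, regardless of blocking status, then anything still running. A rough sketch of that ordering follows. It is not the script's Test-ChecksComplete: the function name, the Name/Status property names, and the status strings are all assumptions, and the final 'complete' verdict only means that nothing is still running.

```powershell
# Hypothetical sketch of the fail-fast ordering described in the commits above.
function Get-CheckVerdict
{
    param([object[]] $Checks)   # objects with assumed Name and Status properties

    # 1. A failed Build check is actionable immediately, blocking or not.
    $failed_builds = @($Checks | Where-Object { $_.Name -like 'Build*' -and $_.Status -eq 'failed' })
    if ($failed_builds.Count -gt 0) { return 'failed' }

    # 2. Otherwise keep waiting while anything is still queued or running.
    $running = @($Checks | Where-Object { $_.Status -in @('queued', 'running') })
    if ($running.Count -gt 0) { return 'pending' }

    # 3. Nothing is running. Non-Build failures (Status, reviewers) are left to the caller.
    return 'complete'
}

$checks = @(
    [pscustomobject]@{ Name = 'Build (x64)'; Status = 'running' },
    [pscustomobject]@{ Name = 'Status';      Status = 'failed'  }
)
Get-CheckVerdict -Checks $checks   # pending: a failed Status check alone does not fail fast
```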
Rewrote get-azure-build-failure-details to fetch the tail of failed
task logs using az devops invoke with startLine/endLine params.
Previously only got step issue messages ('Cmd.exe exited with code 8')
which gave Copilot no useful context.
Now fetches: build result, validation errors, timeline to find failed
Task records with log IDs, then the last 200 lines of each failed
task's log (showing test results, compiler errors, etc.).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Azure autofix guard had no check for actual build failure — it triggered on timeout, non-build policy failure, or any watch exit. Now re-queries policy status to confirm a Build check has failed before attempting autofix. Timeouts and pending reviewer policies no longer trigger autofix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Prints 'Waited 5s / 180s...' on the same line (carriage return) during the CI check polling loop so the user sees progress. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes
Ctrl+C graceful cancellation - Added wait-or-cancel and prompt-cancel-propagation for Ctrl+C during sleep and fetch. State is saved after each repo so -Resume continues from where it stopped.
Prevent submodule regression - update-submodules-to-fixed-commits checks if the current submodule is already ahead of the target using git merge-base --is-ancestor and keeps the newer version.
Windows toast notifications - Shows a toast notification with PR link when a PR is created. Uses PowerShell 5.1 WinRT API (pwsh 7 doesn't support it).
Fallback PR description - When no submodule diffs exist (e.g., only YAML ref changes), builds description from staged file list instead of leaving it empty.
Fix PR abandoned despite checks passing - When a non-blocking policy fails, the watch reported failure. Script now waits up to 2 minutes for autocomplete instead of immediately abandoning.
Retry PR status checks - Added get-pr-status-with-retry (3 attempts, 5s delay) so transient az repos pr show failures don't trigger fail-with-status.
Check PR status before close/abandon - close-pr now queries PR status before attempting to abandon. Skips if already completed/merged.
Resume doesn't push to existing PRs - Checks for existing PR (saved URL or remote query) before calling update-local-repo. If found, skips push and just monitors.
Save state on failure - fail-with-status now saves propagation state before exiting, so resume has PrUrls even after crashes.
Regression check on resume - Before monitoring an existing PR, checks if any submodule on master is ahead of the fixed commit. Stops propagation if the PR would downgrade.
GitHub PR links in ADO descriptions - Replaces (#123) with full GitHub URLs to prevent ADO auto-linking to random work items. Adds a "Related PRs" section with links to all PRs in the propagation run.
Fix crash in fallback description - $upstream_changes += @{} on $null crashed with "Key already added". Fixed by initializing as @().
Fix false success with pending repos - show-propagation-status declared success when failed==0, ignoring pending repos. Now requires pending==0 too.
Remove unnecessary submodule unshallow - git submodule update --init creates full clones, not shallow ones.
CLA check filtering - Filter out license/CLA checks when waiting for CI, since they pass instantly before CI starts.