Add gh-aw rerun review scanner by kubaflo · Pull Request #35685 · dotnet/maui

kubaflo · 2026-06-01T13:28:05Z

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Summary

Moves the actual /review rerun command implementation out of Enhance MauiBot AI summary review output #35677 and into this follow-up PR.
Adds deterministic /review rerun eligibility checks for new comments or commits and applies s/agent-ready-for-rerun when another review is justified.
Adds deterministic rerun context for pre-flight so the next review can focus on activity since the prior AI Summary / rerun checkpoint.
Adds an hourly gh-aw workflow (rerun-review-scanner.md + compiled .lock.yml) that scans queued PRs, asks AI for a trigger / skip decision, and uses a custom safe-output job for reactions, queue-label cleanup, and AzDO triggering.
Adds helper scripts for querying queued PRs and processing scanner decisions.

Validation

gh aw compile .github/workflows/rerun-review-scanner.md
Parsed changed PowerShell scripts with System.Management.Automation.Language.Parser.
Parsed .github/workflows/rerun-review-scanner.lock.yml and .github/workflows/review-trigger.yml as YAML.
Invoke-Pester .github/scripts/Resolve-RerunEligibility.Tests.ps1,.github/scripts/Post-AISummaryComment.Tests.ps1,.github/scripts/Remove-StaleMauiBotComments.Tests.ps1 -CI
Read-only query helper run: Query-RerunReadyPRs.ps1 -MaxPRs 1 -OutputPath /tmp/rerun-candidates.json

Depends on

Enhance MauiBot AI summary review output #35677 for the AI Summary layout and visible /review rerun instruction text.

Teach /review rerun to run a deterministic activity check for new comments or commits and apply s/agent-ready-for-rerun when another AI review is justified. Also add rerun guidance to generated AI Summary comments. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Treat label application as successful when the ready-for-rerun label is present after the GitHub API call, avoiding false failures from brittle gh exit-code handling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Post label additions with the GitHub Issues API JSON payload shape so /review rerun can reliably apply s/agent-ready-for-rerun from GitHub Actions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Grant pull-requests: write to the /review rerun labeling job so it can apply s/agent-ready-for-rerun to pull requests after deterministic eligibility passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Generate a deterministic rerun context artifact listing new comments and commits since the latest AI Summary or previous /review rerun checkpoint, and instruct pre-flight to read it before reviewing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add an hourly gh-aw scanner that reviews PRs labeled s/agent-ready-for-rerun, asks AI for a trigger/skip decision, then uses a custom safe-output job to react to the /review rerun comment, remove the queue label, and trigger the existing AzDO review pipeline deterministically. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Allow Resolve-RerunEligibility.ps1 to be dot-sourced by the queued PR query helper without binding required CLI parameters. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use the GitHub repository owner context in the gh-aw scanner safe-output job while keeping the repository name fixed, so fork dry-runs can inspect their own queued PRs without template injection risk. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

T-Gro · 2026-06-01T13:54:14Z

+---
+on:
+  schedule:
+    - cron: "0 * * * *"


What is your concurrency strategy for reviews?
What prevents 2+ reviews for the same PR to run ?

I was thinking about running it once per hour. WDYT?

Well the job that is triggered happens in an async-fashion, and I do not see anything acting like a "lock" (either repo-memory function of GH AW - to keep track if disposed review ;; or perhaps some label or reaction on the PR to review).

If the run at 9AM triggers the (async) job to do the review, what prevents the run at 10AM to do the same? (if the first one is not finished)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Pass Owner and Repo through when dot-sourcing Resolve-RerunEligibility so Query-RerunReadyPRs does not reset to dotnet/maui and can scan fork/test repositories correctly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Parse the latest normal /review command and carry its platform and pipeline branch through rerun scanner candidates and safe-output triggering so /review rerun follows the same target rules as the original review request. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Require /review rerun commenters to have OWNER, MEMBER, COLLABORATOR, or CONTRIBUTOR association before applying the rerun-ready label. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen

Adversarial PR Review — 3 reviewers, consensus-validated

3 independent reviewers (different model families) analyzed this PR in parallel without seeing each other's findings or this PR's existing review threads. Findings below survived adversarial consensus (3/3 unanimous, 2/3 corroborated, or 1/3 with direct source-code verification).

Verdict

Concerns before merge. Three blocking issues plus a 2/3-consensus race condition that directly corroborates @T-Gro's unresolved concurrency concern.

Blocking (❌)

#	Issue	Consensus
1	`/review rerun` bypasses the actor permission check that `/review` enforces	3/3 unanimous
2	Scanner aborts the entire batch on a `skip` with missing `rerun_comment_id` — which the prompt itself instructs the agent to send as `0`	1/3 + verified
3	Manual `/review` never removes `s/agent-ready-for-rerun`, causing a duplicate review on the next hourly scanner run	1/3 + verified

Should fix (⚠️)

#	Issue	Consensus
4	Label-based "lock" is non-atomic check-then-set, and the scanner and `/review` workflows have no shared concurrency group — directly corroborates @T-Gro's concern	2/3
5	`/review --branch refs/heads/foo` produces `refs/heads/refs/heads/foo` and breaks the AzDO trigger	1/3 + verified
6	`--platform=ios` (equals form) is silently ignored — user thinks iOS was selected; pipeline runs on Android	1/3 + verified
7	Editing the AI Summary advances the rerun checkpoint via `updated_at`, silently dropping eligible reruns	1/3
8	`+1` reaction posted before AzDO trigger; on failure, the misleading `+1` stays as a "review started" signal	1/3 + verified

(Inline comments below have file:line context, repro scenarios, and fix suggestions.)

Design-level observations (not posted as inline)

expected_head_sha empty-string bypass (Invoke-RerunReviewTrigger.ps1:181) — if ($expectedHeadSha -and …) short-circuits on empty string, skipping the head-SHA staleness check. gh-aw's required: true enforces presence, not non-emptiness; treating empty as invalid is cheap defense-in-depth.
Stale-lock recovery direction (Test-AgentReviewInProgressIsStale) — on events-API error or missing labeled event, returns $false (fresh). If the labeling event ages out or is unreadable, a lock that was never cleared can wedge a PR indefinitely. Falling back to issue updated_at would let the 18-hour window still elapse.
Heredoc delimiter EOF (rerun-review-scanner.md:114-116) — un-randomized. Practically near-impossible to trigger via JSON-encoded content, but EOF_$(openssl rand -hex 8) is the documented hardening.
Test coverage gap — Invoke-RerunReviewTrigger.ps1 and Query-RerunReadyPRs.ps1 have no Pester tests. Issues #2, #5, #6, #8 above would have been caught by basic argument-validation and Normalize-PipelineRef coverage.

Prior review threads (acknowledgement)

@T-Gro asked twice (2026-06-01 and again on 2026-06-02 against b84cc7f3f): "What is your concurrency strategy? What prevents 2+ reviews for the same PR?" — Finding #4 above (the 2/3-consensus race) is the same issue, now with a concrete trigger sequence: scanner-vs-/review workflows have no shared concurrency group and Set-AgentReviewInProgress is check-then-set. Hourly serialization only protects scanner-vs-scanner.
@JanKrivanek asked where trigger_rerun_review is defined. It is generated by gh-aw from the safe-outputs.jobs.trigger-rerun-review: block in .github/workflows/rerun-review-scanner.md. The hyphenated job key compiles into an underscored tool name (trigger_rerun_review) for the agent. See compiled lock.yml around line 469.

Methodology

3 reviewers with different model families ran in parallel:

Reviewer 1: deep-reasoning model, focused on subtle logic and concurrency walkthrough
Reviewer 2: pattern-matching model, focused on security and common bug classes
Reviewer 3: cross-family model, focused on edge cases

Disputed (1/3) findings were either verified directly against source code or discarded. Findings about issues already raised in existing review threads were merged with attribution rather than re-flagged. CI build/test status was intentionally out of scope.

PureWeen · 2026-06-02T20:11:49Z

+      group: review-rerun-${{ github.event.issue.number }}
+      cancel-in-progress: false
+    timeout-minutes: 5
+    permissions:


❌ Security — mark-rerun-ready lacks the actor permission check that trigger-review enforces (lines 127-138 in this same file). Any commenter — including a first-time contributor on their own PR — can post /review rerun and successfully apply s/agent-ready-for-rerun, since their own comment is fresh evidence past the AI Summary checkpoint. The hourly scanner then picks it up and triggers a full AzDO review. This is a privilege escalation: the direct-trigger path is gated; the rerun path is not.
Flagged by: 3/3 reviewers (unanimous)

Fix: Add the same Check actor permission step (write/maintain/admin) to mark-rerun-ready before invoking Resolve-RerunEligibility.ps1 -ApplyLabel.

PureWeen · 2026-06-02T20:11:49Z

+    if ($prNumber -le 0) {
+        throw "Invalid PR number '$($item.pr_number)'."
+    }
+    if ($rerunCommentId -le 0) {


❌ Logic — This validation runs before branching on $decision. But the scanner prompt explicitly tells the agent: "If it is missing, choose skip and use 0." (rerun-review-scanner.md:157). And Query-RerunReadyPRs.ps1:99 already produces rerunCommentId = $null (→ 0 after [Int64] cast) whenever a PR was queued without a /review rerun comment (e.g. a maintainer applied the queue label manually).

So a valid skip with 0 always throws here. Worse: the foreach at line 158 has no per-item try/catch, so $ErrorActionPreference = 'Stop' propagates the throw and aborts processing of every remaining candidate in the batch — one bad item silently drops the rest.
Flagged by: 1/3 reviewer; verified directly against source.

Fix: Only require rerun_comment_id > 0 on the trigger branch; allow skip with 0 (and guard Add-CommentReaction on >0). Wrap the per-candidate body in try/catch { Write-Error; continue } so one malformed item doesn't kill the whole batch.

PureWeen · 2026-06-02T20:11:49Z

+        id: review_lock
+        shell: pwsh
+        env:
+          GH_TOKEN: ${{ github.token }}


❌ State Bug — trigger-review acquires s/agent-review-in-progress here, but the job never removes s/agent-ready-for-rerun if it's present.

Scenario: PR has s/agent-ready-for-rerun (queued). Maintainer types /review directly. trigger-review sets the in-progress lock, runs AzDO, AzDO's CleanupReviewLock stage removes the in-progress label. The queue label is still there. Next hourly scanner pass sees a queued PR with no in-progress lock → triggers a second review.
Flagged by: 1/3 reviewer; verified directly against source.

Fix: After successful Set-AgentReviewInProgress, also remove s/agent-ready-for-rerun (best-effort with --silent is fine — it may not be present).

PureWeen · 2026-06-02T20:11:49Z

+# ============================================================
+# Set-AgentReviewInProgress
+# ============================================================
+function Set-AgentReviewInProgress {


⚠️ Race Condition — Set-AgentReviewInProgress is check-then-set: Get-AgentLabels → if (-not contains) → Add-Label → Get-AgentLabels (verify). GitHub's add-label API is idempotent — two concurrent callers will both pass the -notcontains check and both call Add-Label, and both success returns will report "the label is present".

This matters because the scanner workflow uses concurrency: gh-aw-${{ github.workflow }} (scanner-vs-scanner only) and review-trigger.yml uses concurrency: review-trigger-${{ github.event.issue.number }} (per-PR direct trigger only). These are different groups in different workflows, so the safe-output job and the /review job can race on the same PR with no serialization. This is exactly the unresolved concern @T-Gro raised twice on this PR.
Flagged by: 2/3 reviewers (lowered to ⚠️ per consensus rule).

Fix: Either (a) route both workflows through the same concurrency group keyed by PR number (e.g. pr-review-${{ pr_number }}), or (b) make acquisition tag-based — write the label with a unique run-id marker (e.g. as a comment or via a label-prefix the script stamps with its own run-id), then re-read and abort if the marker doesn't match this run.

PureWeen · 2026-06-02T20:11:49Z

+        resources = @{
+            repositories = @{
+                self = @{
+                    refName = "refs/heads/$PipelineRef"


⚠️ Logic — refName = "refs/heads/$PipelineRef" blindly prepends refs/heads/ even when $PipelineRef already starts with it. Normalize-PipelineRef strips most metacharacters but allows /, so refs/heads/main survives unchanged → final refName is refs/heads/refs/heads/main, which AzDO rejects.

User scenario: someone types /review --branch refs/heads/main (a perfectly reasonable expectation given the rest of .github/ uses refs/heads/...). The pipeline trigger fails, the catch block re-throws, the entire scanner batch aborts (see comment on line 171). Same hazard exists in the bash version of this prepend at review-trigger.yml:369.
Flagged by: 1/3 reviewer; verified directly against source.

Fix: In Normalize-PipelineRef (and the equivalent block in Resolve-RerunEligibility.ps1 and review-trigger.yml), strip a leading refs/heads/ before sanitization.

PureWeen · 2026-06-02T20:11:49Z

+                }
+                continue
+            }
+            '^(--platform|-p)$' {


⚠️ Logic — The regex ^(--platform|-p)$ only matches the flag in its bare form. /review --platform=ios (with an equals sign — the form most CLIs accept) does NOT match. Falls through to the default case which checks $candidate = $token.ToLowerInvariant() against the valid platform list — --platform=ios is not a valid platform name, so it's silently dropped and $platform stays empty.

User thinks they selected iOS; the pipeline runs on android (the fallback in Get-PlatformFromLabels).

Same hazard exists in review-trigger.yml's bash parser (lines 170-200) — it also only handles the space-separated form.
Flagged by: 1/3 reviewer; verified directly against source.

Fix: Pre-process tokens to split on = before iterating ($tokens = $tokens | ForEach-Object { $_ -split '=', 2 }), or extend the regex: ^(--platform|-p)(=(.+))?$ and capture group 3.

PureWeen · 2026-06-02T20:11:49Z

+
+    return @($Comments |
+        Where-Object { $_.body -and ([string]$_.body).Contains($AISummaryMarker) } |
+        Sort-Object @{ Expression = { Get-ObjectDate $_ 'updated_at' }; Descending = $true }, @{ Expression = { [Int64]$_.id }; Descending = $true } |


⚠️ Logic — Get-LatestAISummaryComment sorts by updated_at, and downstream summaryUpdatedAt is used as the eligibility checkpoint (lines 353, 490). If a maintainer edits the AI Summary comment after new reviewer comments/commits have landed (typo fix, formatting), updated_at jumps ahead of that real activity → Test-HasEvidenceCommentAfter and Test-HasCommitAfter see nothing past the checkpoint → eligibility returns no-new-comments-or-commits and a legitimate /review rerun is silently dropped.
Flagged by: 1/3 reviewer.

Fix: Anchor the checkpoint to the comment's created_at rather than updated_at, or (better) to the SHA embedded in the SESSION marker via Get-LatestReviewedSha, which is the actual state the previous review observed.

PureWeen · 2026-06-02T20:11:49Z

+                }
+            }
+
+            Add-CommentReaction -CommentId $rerunCommentId -Content '+1'


⚠️ Error Handling — The +1 reaction is posted before Invoke-AzDOReviewPipeline. If the AzDO trigger throws, the catch block at line 221 clears the in-progress lock but does NOT remove the +1 reaction or restore the queue label. The user sees a +1 (meaning "your rerun is starting") but no review actually started. Combined with the loop-abort issue at line 171, the same throw also kills processing of every later candidate in the batch.
Flagged by: 1/3 reviewer; verified directly against source.

Fix: Post the +1 reaction only after Invoke-AzDOReviewPipeline returns successfully. On failure, either remove the reaction, post a -1, or leave the queue label in place so the next scanner pass retries.

Copilot AI added 8 commits June 1, 2026 15:26

Verify rerun label application

b20223d

Treat label application as successful when the ready-for-rerun label is present after the GitHub API call, avoiding false failures from brittle gh exit-code handling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use JSON when adding agent labels

6706df5

Post label additions with the GitHub Issues API JSON payload shape so /review rerun can reliably apply s/agent-ready-for-rerun from GitHub Actions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Allow rerun workflow to label PRs

69238a9

Grant pull-requests: write to the /review rerun labeling job so it can apply s/agent-ready-for-rerun to pull requests after deterministic eligibility passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Make rerun helper importable

798f837

Allow Resolve-RerunEligibility.ps1 to be dot-sourced by the queued PR query helper without binding required CLI parameters. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot added the area-infrastructure CI, Maestro / Coherency, upstream dependencies/versions label Jun 1, 2026

T-Gro reviewed Jun 1, 2026

View reviewed changes

JanKrivanek reviewed Jun 1, 2026

View reviewed changes

Comment thread .github/workflows/rerun-review-scanner.md Outdated

PureWeen mentioned this pull request Jun 1, 2026

[Epic] Improve on the agent story inside the .NET MAUI Repository #32454

Open

PureWeen added the area-ai-agents Copilot CLI agents, agent skills, AI-assisted development label Jun 1, 2026

This was referenced Jun 2, 2026

[PR Review Queue] 2026-06-02 PureWeen/maui#98

Open

[PR Review Queue] 2026-06-02 #35699

Open

Copilot AI added 2 commits June 2, 2026 13:14

Clarify rerun scanner concurrency

b84cc7f

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add review rerun in-progress lock

a3412bc

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen mentioned this pull request Jun 2, 2026

[Epic] Improve Agent PR Review Pipeline — Pre-Flight, Try-Fix, and Code Review #35048

Open

19 tasks

Copilot AI added 3 commits June 2, 2026 21:26

Preserve rerun scanner target repository

1dcdb74

Pass Owner and Repo through when dot-sourcing Resolve-RerunEligibility so Query-RerunReadyPRs does not reset to dotnet/maui and can scan fork/test repositories correctly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Restrict review rerun to contributors

530a7c5

Require /review rerun commenters to have OWNER, MEMBER, COLLABORATOR, or CONTRIBUTOR association before applying the rerun-ready label. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen reviewed Jun 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add gh-aw rerun review scanner#35685

Add gh-aw rerun review scanner#35685
kubaflo wants to merge 13 commits into
feature/enhanced-reviewerfrom
feature/rerun-review-scanner

kubaflo commented Jun 1, 2026

Uh oh!

T-Gro Jun 1, 2026

Uh oh!

kubaflo Jun 1, 2026

Uh oh!

T-Gro Jun 2, 2026

Uh oh!

Uh oh!

PureWeen left a comment

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

PureWeen Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

kubaflo commented Jun 1, 2026

Summary

Validation

Depends on

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

PureWeen left a comment

Choose a reason for hiding this comment

Adversarial PR Review — 3 reviewers, consensus-validated

Verdict

Blocking (❌)

Should fix (⚠️)

Design-level observations (not posted as inline)

Prior review threads (acknowledgement)

Methodology

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants