From e744de771aa6f3a03b906160581fe5a96100b1f4 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 14 Apr 2026 03:58:43 +0000
Subject: [PATCH 1/2] Initial plan


From 9f2481ceebf659c7923b76d815c26da8132de769 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 14 Apr 2026 04:34:34 +0000
Subject: [PATCH 2/2] fix: reduce agentic-workflows test scope and strengthen
 safe-output instructions in Agent Persona Explorer (#25231)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3c2b2c4d-a7b1-45b3-9855-08116477b367

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
---
 .github/workflows/agent-persona-explorer.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/agent-persona-explorer.md b/.github/workflows/agent-persona-explorer.md
index dac4c5363d7..fffd81fd3b6 100644
--- a/.github/workflows/agent-persona-explorer.md
+++ b/.github/workflows/agent-persona-explorer.md
@@ -73,7 +73,7 @@ Store all scenarios in cache memory.
 
 ## Phase 3: Test Agent Responses (15 minutes)
 
-**Token Budget Optimization**: Test a **representative subset of 6-8 scenarios** (not all scenarios) to reduce token consumption while maintaining quality insights.
+**Token Budget Optimization**: Test a **representative subset of 3-4 scenarios** (not all scenarios) to reduce token consumption and leave enough budget for Phase 5 publishing.
 
 For each selected scenario, invoke the "agentic-workflows" custom agent tool and:
 
@@ -99,6 +99,7 @@ For each selected scenario, invoke the "agentic-workflows" custom agent tool and
 - You are ONLY testing the agent's responses, NOT creating actual workflows
 - **Keep responses focused and concise** - summarize findings instead of verbose descriptions
 - Aim for quality over quantity - fewer well-analyzed scenarios are better than many shallow ones
+- **If any tool call fails, record the error briefly and move on to the next scenario** - do NOT retry or get stuck
 
 ## Phase 4: Analyze Results (4 minutes)
 
@@ -124,7 +125,9 @@ Review all captured responses and identify:
 
 ## Phase 5: Document and Publish Findings (1 minute)
 
-Create a GitHub discussion with a **concise** summary report. Use the `create discussion` safe-output to publish your findings.
+**MANDATORY OUTPUT**: Regardless of how many phases completed successfully, you MUST call either the `create discussion` or the `noop` safe-output tool before finishing. Failing to call a safe-output tool is the most common cause of safe-output workflow failures.
+
+Create a GitHub discussion with a **concise** summary report. Use the `create discussion` safe-output to publish your findings. Even if only 1-2 scenarios were tested, create the discussion with partial results.
 
 **Discussion title**: "Agent Persona Exploration - [DATE]" (e.g., "Agent Persona Exploration - 2024-01-16")
 
@@ -221,15 +224,18 @@ Example:
 ## Success Criteria
 
 Your effectiveness is measured by:
+- **Safe output**: ALWAYS call either `create discussion` or `noop` — this is the most critical requirement
 - **Efficiency**: Complete analysis within token budget (timeout: 180 minutes, concise outputs)
-- **Quality over quantity**: Test 6-8 representative scenarios thoroughly rather than all scenarios superficially
+- **Quality over quantity**: Test 3-4 representative scenarios thoroughly rather than many scenarios superficially
 - **Actionable insights**: Provide 3-5 concrete, implementable recommendations
 - **Concise documentation**: Report under 1000 words with progressive disclosure
 - **Consistency**: Maintain objective, research-focused methodology
 
 Execute all phases systematically and maintain an objective, research-focused approach to understanding the agentic-workflows custom agent's capabilities and limitations.
 
-**Important**: If no action is needed after completing your analysis, you **MUST** call the `noop` safe-output tool with a brief explanation. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.
+**CRITICAL**: You MUST call a safe-output tool before finishing. Choose one:
+1. Call `create discussion` to publish findings (preferred — even partial results are valuable)
+2. Call `noop` if you were completely unable to gather any data
 
 ```json
 {"noop": {"message": "No action needed: [brief explanation of what was analyzed and why]"}}