Add MCP servers to dotnet-blazor plugin by javiercn · Pull Request #703 · dotnet/skills

javiercn · 2026-05-29T15:19:21Z

Summary

Add the Microsoft Learn MCP server to plugins/dotnet-blazor/plugin.json.
Add the Playwright MCP server to plugins/dotnet-blazor/plugin.json.
Allowlist both MCP server declarations in eng/allowed-external-deps.txt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR updates the dotnet-blazor plugin manifest to declare two MCP server dependencies (Microsoft Learn + Playwright) and adds the corresponding allowlist entries so the skill-validator’s external dependency checks don’t flag them.

Changes:

Add mcpServers entries to plugins/dotnet-blazor/plugin.json.
Allowlist the new MCP server declarations in eng/allowed-external-deps.txt.
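
As a rough illustration of the kind of check the allowlist entries are meant to satisfy, the sketch below compares the MCP servers declared in a plugin manifest against an allowlist. The manifest shape, the placeholder URL and package name, the `plugin/server` allowlist key format, and the helper names `parse_allowlist` and `find_unlisted_servers` are all assumptions made for illustration; the real plugin.json, allowed-external-deps.txt, and skill-validator logic are not reproduced in this conversation.

```python
import json

# Hypothetical manifest: the field names and values are placeholders,
# not the actual contents of plugins/dotnet-blazor/plugin.json.
MANIFEST = json.loads("""
{
  "name": "dotnet-blazor",
  "mcpServers": {
    "microsoft-learn": {"type": "http", "url": "https://example.invalid/mcp"},
    "playwright": {"type": "stdio", "command": "npx", "args": ["<playwright-mcp-package>"]}
  }
}
""")

# Hypothetical allowlist format: one "plugin/server" key per line, "#" starts a comment.
ALLOWLIST_TEXT = """
# MCP servers declared by plugins
dotnet-blazor/microsoft-learn
dotnet-blazor/playwright
"""


def parse_allowlist(text):
    """Return the set of non-empty, non-comment entries."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            entries.add(line)
    return entries


def find_unlisted_servers(manifest, allowlist):
    """Return declared MCP server names that have no allowlist entry."""
    plugin = manifest["name"]
    declared = manifest.get("mcpServers", {})
    return sorted(name for name in declared if f"{plugin}/{name}" not in allowlist)


if __name__ == "__main__":
    unlisted = find_unlisted_servers(MANIFEST, parse_allowlist(ALLOWLIST_TEXT))
    print("unlisted servers:", unlisted)
```

With both entries present the check reports nothing; dropping either allowlist line would surface the corresponding server name, which is the situation the second bullet above avoids.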

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
plugins/dotnet-blazor/plugin.json	Declares two MCP servers for the `dotnet-blazor` plugin.
eng/allowed-external-deps.txt	Adds allowlist entries for the newly declared MCP servers.

github-actions · 2026-05-29T15:27:20Z

Skill Coverage Report

Status	Plugin	Skill	Covered	Coverage
❌	`dotnet-blazor`	`author-component`	0/1	0%
❌	`dotnet-blazor`	`collect-user-input`	0/4	0%
❌	`dotnet-blazor`	`configure-auth`	0/3	0%
❌	`dotnet-blazor`	`coordinate-components`	0/2	0%
❌	`dotnet-blazor`	`fetch-and-send-data`	1/5	20%
❌	`dotnet-blazor`	`plan-ui-change`	0/1	0%
❌	`dotnet-blazor`	`support-prerendering`	0/3	0%
❌	`dotnet-blazor`	`use-js-interop`	0/2	0%

Uncovered: dotnet-blazor/author-component

[CodePattern] [Parameter] (line 31)

Uncovered: dotnet-blazor/collect-user-input

[CodePattern] [CascadingParameter] (line 162)
[CodePattern] [Range] (line 135)
[CodePattern] [SupplyParameterFromForm] (line 31)
[CodePattern] [Required] (line 135)

Uncovered: dotnet-blazor/configure-auth

[CodePattern] [CascadingParameter] (line 51)
[CodePattern] [ExcludeFromInteractiveRouting] (line 152)
[CodePattern] [Authorize] (line 101)

Uncovered: dotnet-blazor/coordinate-components

[CodePattern] [CascadingParameter] (line 63)
[CodePattern] readonly (line 137)

Uncovered: dotnet-blazor/fetch-and-send-data

[CodePattern] [StreamRendering] (line 74)
[CodePattern] [Parameter] (line 159)
[CodePattern] [SupplyParameterFromQuery] (line 159)
[CodePattern] [PersistentState] (line 84)

Uncovered: dotnet-blazor/plan-ui-change

[CodePattern] [Parameter] (line 64)

Uncovered: dotnet-blazor/support-prerendering

[CodePattern] [CascadingParameter] (line 159)
[CodePattern] [ExcludeFromInteractiveRouting] (line 148)
[CodePattern] [PersistentState] (line 39)

Uncovered: dotnet-blazor/use-js-interop

[CodePattern] readonly (line 118)
[CodePattern] sealed (line 118)
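
The Coverage column in the report above appears to be the Covered fraction expressed as a percentage. The following sketch recomputes it from the Covered values in the table, under that assumption; the helper names are hypothetical and this is not the coverage tool's actual code.

```python
# Covered column values copied from the Skill Coverage Report table.
COVERED = {
    "author-component": "0/1",
    "collect-user-input": "0/4",
    "configure-auth": "0/3",
    "coordinate-components": "0/2",
    "fetch-and-send-data": "1/5",
    "plan-ui-change": "0/1",
    "support-prerendering": "0/3",
    "use-js-interop": "0/2",
}


def parse_covered(cell):
    """Split a 'hit/total' cell into two integers."""
    hit, total = (int(part) for part in cell.split("/"))
    return hit, total


def coverage_percent(cell):
    """Percentage of covered items, rounded to a whole number."""
    hit, total = parse_covered(cell)
    return round(100 * hit / total) if total else 0


def totals(cells):
    """Sum covered and total counts across all skills."""
    pairs = [parse_covered(cell) for cell in cells]
    return sum(hit for hit, _ in pairs), sum(total for _, total in pairs)


if __name__ == "__main__":
    for skill, cell in COVERED.items():
        print(f"{skill}: {cell} -> {coverage_percent(cell)}%")
    print("plugin totals (covered, total):", totals(COVERED.values()))
```

Summing the column this way gives a single covered/total pair for the plugin, which is not a figure the report itself states.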

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

AbhitejJohn · 2026-05-29T17:49:10Z

/evaluate

github-actions · 2026-05-29T18:19:59Z

Skill Validation Results

Skill	Scenario	Quality	Skills Loaded	Overfit	Verdict
mcp-csharp-create	Implement MCP tools with proper attributes and DI	4.0/5 → 5.0/5 🟢	✅ mcp-csharp-create; tools: skill	✅ 0.16	✅
mcp-csharp-create	Create an HTTP MCP server with tools and resources	4.0/5 → 5.0/5 🟢	✅ mcp-csharp-create; tools: skill	✅ 0.16	✅
mcp-csharp-create	Create an MCP server with tools, prompts, and proper logging	4.0/5 → 4.0/5	✅ mcp-csharp-create; tools: skill	✅ 0.16	✅
mcp-csharp-test	Write unit and integration tests for an MCP server	3.0/5 → 5.0/5 🟢	✅ mcp-csharp-test; tools: skill, report_intent, view / ✅ mcp-csharp-test; tools: report_intent, skill, view	🟡 0.23	✅
mcp-csharp-test	Test an HTTP MCP server with WebApplicationFactory	4.0/5 → 5.0/5 🟢	✅ mcp-csharp-test; tools: skill, report_intent, view	🟡 0.23	❌ [1]
mcp-csharp-test	Create evaluations for an MCP server	2.0/5 → 5.0/5 🟢	✅ mcp-csharp-test; tools: skill, view	🟡 0.23	✅
technology-selection	ML.NET classification on tabular data	3.0/5 → 4.0/5 🟢	✅ technology-selection; tools: skill, read_bash, stop_bash / ✅ technology-selection; tools: skill	🟡 0.34	✅
technology-selection	LLM integration with MEAI abstraction	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED	🟡 0.34	❌ [2]
technology-selection	Reject LLM for tabular classification	3.0/5 → 5.0/5 🟢	✅ technology-selection; tools: skill	🟡 0.34	✅
technology-selection	Agentic workflow with guardrails	3.0/5 → 3.0/5	✅ technology-selection; tools: skill / ✅ technology-selection; tools: skill, create	🟡 0.34	❌ [3]
technology-selection	Natural-language scenario decomposition — RAG chatbot	4.0/5 → 5.0/5 🟢	✅ technology-selection; tools: skill	🟡 0.34	✅ [4]
technology-selection	RAG pipeline with vector search	4.0/5 → 5.0/5 🟢	✅ technology-selection; tools: skill	🟡 0.34	✅
mcp-csharp-publish	Publish an MCP server as a NuGet tool package	3.0/5 → 4.0/5 🟢	✅ mcp-csharp-publish; tools: skill	✅ 0.19	✅
mcp-csharp-publish	Deploy an HTTP MCP server to Azure Container Apps	4.0/5 → 5.0/5 🟢	✅ mcp-csharp-publish; tools: skill, report_intent, view	✅ 0.19	✅
mcp-csharp-publish	Publish to the MCP Registry	1.0/5 → 3.0/5 🟢	✅ mcp-csharp-publish; tools: skill	✅ 0.19	✅
mcp-csharp-debug	Debug an MCP server with MCP Inspector	4.0/5 → 4.0/5	✅ mcp-csharp-debug; tools: skill	✅ 0.10	❌ [5]
mcp-csharp-debug	Configure VS Code to use an MCP server	4.0/5 → 4.0/5	✅ mcp-csharp-debug; tools: skill, report_intent, grep, glob / ✅ mcp-csharp-debug; tools: skill	✅ 0.10	✅
mcp-csharp-debug	Debug a failing MCP server tool	5.0/5 → 4.0/5 🔴	✅ mcp-csharp-debug; tools: report_intent, skill / ✅ mcp-csharp-debug; tools: skill	✅ 0.10	❌
template-authoring	Validate a template.json file	3.0/5 → 5.0/5 🟢	✅ template-authoring; tools: glob, skill / ⚠️ NOT ACTIVATED	✅ 0.18	✅
template-authoring	Create template from existing project	3.0/5 → 4.0/5 🟢	✅ template-authoring; tools: skill	✅ 0.18	✅
template-validation	Validate template with multiple errors	3.0/5 → 4.0/5 🟢	✅ template-validation; tools: skill	✅ 0.10	✅
template-validation	Validate correct template and suggest improvements	1.0/5 → 3.0/5 🟢	✅ template-validation; tools: glob, skill / ✅ template-validation; tools: skill	✅ 0.10	✅
template-discovery	Find template for web API project	2.0/5 → 4.0/5 🟢	✅ template-discovery; tools: report_intent, skill, bash	🟡 0.23	✅
template-discovery	Inspect template parameters and compare choices	3.0/5 → 4.0/5 🟢	✅ template-discovery; tools: skill	🟡 0.23	✅
template-discovery	Search NuGet for specialized template	4.0/5 → 4.0/5	✅ template-discovery; tools: skill	🟡 0.23	✅
template-discovery	Resolve ambiguous project intent to multiple candidates	4.0/5 → 4.0/5	✅ template-discovery; tools: skill	🟡 0.23	✅
template-discovery	Preview project creation with dry run	3.0/5 → 3.0/5	✅ template-discovery; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.23	✅
template-instantiation	Create a console application	4.0/5 → 5.0/5 🟢	✅ template-instantiation; tools: skill	✅ 0.19	✅

[1] (Isolated) Quality improved but weighted score is -8.2% due to: tokens (13230 → 45021), tool calls (0 → 3), time (17.2s → 22.2s)
[2] (Isolated) Quality unchanged but weighted score is -4.9% due to: tokens (85496 → 137447), time (42.5s → 64.1s), tool calls (13 → 16)
[3] (Plugin) Quality unchanged but weighted score is -2.0% due to: tokens (190806 → 1858992), tool calls (12 → 49), time (109.6s → 281.1s)
[4] (Plugin) Quality dropped but weighted score is +11.9% due to: efficiency metrics
[5] (Plugin) Quality unchanged but weighted score is -8.0% due to: tokens (12823 → 30428), tool calls (0 → 1), time (11.5s → 13.8s)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

To investigate failures, paste this to your AI coding agent:

For PR 703 in dotnet/skills, download eval artifacts with gh run download 26653053060 --repo dotnet/skills --pattern "skill-validator-results-*" --dir ./eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/579fcb7269f9d16cd40d4fe9521d1fb1f747fbc0/eng/skill-validator/src/docs/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

github-actions · 2026-06-03T23:01:04Z

✅ Evaluation passed for 579fcb7. cc @ViktorHofer @JanKrivanek @dotnet/aspnet — please review.

danroth27

I think we should move the MS Learn MCP server to the core dotnet plugin, but otherwise this looks fine to me.

Copilot · 2026-06-04T07:45:23Z

@javiercn I've opened a new pull request, #723, to work on those changes. Once the pull request is ready, I'll request review from you.

github-actions · 2026-06-04T07:58:17Z

👋 @javiercn — this PR has 2 unresolved review thread(s). When you're ready, please address the feedback and push an update; the triage bot will pick up the next state automatically. (Add the no-stale label to silence further pings.)

…ation (#723)
* Initial plan
* feat(dotnet-blazor): pin playwright MCP to 0.0.75 and add weekly update workflow
* fix(update-playwright-mcp-version): use env vars in node scripts and descending sort

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

AbhitejJohn · 2026-06-05T00:25:36Z

/evaluate

github-actions · 2026-06-05T00:38:30Z

Skill Validation Results

❌ Skill validation errors

assertion-quality: Eval scenario 'Identify self-referential assertions in identity and round-trip tests' prompt mentions target name 'assertion-quality' (skill or agent) — remove the target name from the prompt to avoid biasing baseline runs.
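
The rule behind this error can be pictured as a simple text check: a scenario prompt should not contain the name of the skill or agent it targets, since the baseline run would then be nudged toward it. A minimal sketch follows, assuming a case-insensitive whole-name match; the function name `prompt_mentions_target` and the sample prompts are hypothetical, and the validator's real matching rules are not shown here.

```python
import re


def prompt_mentions_target(prompt, target):
    """Return True if the prompt contains the target name as a standalone token.

    Hyphens and word characters count as part of a name, so
    'assertion-quality-extra' does not match 'assertion-quality'.
    """
    pattern = r"(?<![\w-])" + re.escape(target) + r"(?![\w-])"
    return re.search(pattern, prompt, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    biased = "Use assertion-quality to review these round-trip tests."
    neutral = "Review these round-trip tests for self-referential assertions."
    print(prompt_mentions_target(biased, "assertion-quality"))
    print(prompt_mentions_target(neutral, "assertion-quality"))
```

Rewording the prompt to describe the task rather than name the skill, as in the second sample, is the fix the error message asks for.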

Skill	Scenario	Quality	Skills Loaded	Overfit	Verdict
test-gap-analysis	Find boundary mutation gaps in tiered discount and shipping logic	4.7/5 → 5.0/5 🟢	✅ test-gap-analysis; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.08	✅ [1]
test-gap-analysis	Find logic and null-check mutation gaps in access control code	4.3/5 → 5.0/5 🟢	✅ test-gap-analysis; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.08	✅ [2]
test-gap-analysis	Acknowledge well-tested code with few surviving mutations	4.7/5 → 4.0/5 🔴	✅ test-gap-analysis; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.08	❌ [3]
test-gap-analysis	Decline request to write new tests from scratch	4.0/5 → 4.0/5	ℹ️ not activated (expected)	✅ 0.08	❌ [4]
code-testing-agent	Generate tests for ContosoUniversity ASP.NET Core MVC app	4.0/5 → 4.0/5	✅ code-testing-agent; tools: skill, task, read_agent / ✅ code-testing-agent; code-testing-extensions; tools: skill, task, read_agent	✅ 0.04	❌ [5]
writing-mstest-tests	Write unit tests for a service class	4.0/5 → 4.7/5 🟢	✅ writing-mstest-tests; tools: skill, glob, bash, edit / ✅ writing-mstest-tests; tools: skill, glob	🟡 0.34	✅ [6]
writing-mstest-tests	Write data-driven tests for a calculator	3.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill, report_intent, view, create / ✅ writing-mstest-tests; tools: view, skill, bash, edit, report_intent, create	🟡 0.34	✅
writing-mstest-tests	Write async tests with cancellation	3.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill / ✅ writing-mstest-tests; tools: report_intent, skill	🟡 0.34	✅
writing-mstest-tests	Fix swapped Assert.AreEqual arguments	5.0/5 → 5.0/5	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.34	❌ [7]
writing-mstest-tests	Modernize legacy test patterns	4.3/5 → 4.7/5 🟢	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.34	✅ [8]
writing-mstest-tests	Replace ExpectedException with Assert.Throws	3.0/5 → 3.7/5 🟢	✅ writing-mstest-tests; tools: report_intent, skill / ⚠️ NOT ACTIVATED	🟡 0.34	✅ [9]
writing-mstest-tests	Use proper collection assertions	3.0/5 → 2.7/5 🔴	⚠️ NOT ACTIVATED / ✅ writing-mstest-tests; tools: report_intent, skill	🟡 0.34	❌
writing-mstest-tests	Use proper type assertions instead of casts	4.0/5 → 4.0/5	⚠️ NOT ACTIVATED / ✅ writing-mstest-tests; tools: report_intent, skill	🟡 0.34	✅ [10]
writing-mstest-tests	Set up test lifecycle correctly	2.0/5 → 4.7/5 🟢	✅ writing-mstest-tests; tools: skill, report_intent / ✅ writing-mstest-tests; tools: report_intent, skill, view	🟡 0.34	✅
writing-mstest-tests	Use DynamicData with ValueTuples over object arrays	3.0/5 → 3.0/5	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.34	❌ [11]
writing-mstest-tests	Use string assertions for format validation	3.7/5 → 4.7/5 ⏰ 🟢	✅ writing-mstest-tests; tools: skill, bash, edit, view	🟡 0.34	❌ [12]
writing-mstest-tests	Use comparison assertions for boundary testing	3.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.34	✅
writing-mstest-tests	Write tests with collection, null, and reference assertions	4.0/5 → 4.7/5 🟢	✅ writing-mstest-tests; tools: skill, glob / ⚠️ NOT ACTIVATED	🟡 0.34	❌ [13]
writing-mstest-tests	Configure conditional execution, retry, and cleanup	3.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: report_intent, skill / ✅ writing-mstest-tests; tools: skill, report_intent	🟡 0.34	✅
writing-mstest-tests	Configure test parallelization and MSTest.Sdk project	3.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.34	✅
test-smell-detection	Detect multiple test smells in order processing test suite	3.0/5 → 5.0/5 🟢	✅ test-smell-detection; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.44	❌ [14]
test-smell-detection	Recognize well-written tests with no significant smells	4.3/5 → 5.0/5 🟢	✅ test-smell-detection; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.44	✅ [15]
test-smell-detection	Recognize integration tests and avoid false positives for external resources	5.0/5 → 5.0/5	✅ test-smell-detection; tools: skill / ✅ test-anti-patterns; test-smell-detection; tools: skill	🟡 0.44	❌ [16]
test-smell-detection	Decline request to write new tests from scratch	4.3/5 → 4.7/5 🟢	ℹ️ not activated (expected)	🟡 0.44	✅ [17]
test-tagging	Tag an untagged MSTest test suite	2.3/5 → 2.7/5 🟢	✅ test-tagging; tools: skill / ✅ test-tagging; tools: glob, skill	🟡 0.29	✅ [18]
test-tagging	Tag an untagged xUnit test suite	2.7/5 → 2.7/5	✅ test-tagging; tools: skill, bash / ⚠️ NOT ACTIVATED	🟡 0.29	❌ [19]
test-tagging	Tag an untagged NUnit test suite	2.7/5 → 2.3/5 🔴	✅ test-tagging; tools: glob, skill, bash / ⚠️ NOT ACTIVATED	🟡 0.29	❌ [20]
test-tagging	Audit test distribution without modifying files	5.0/5 → 5.0/5	✅ test-tagging; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.29	❌ [21]
test-tagging	Decline request to write new tests	4.0/5 → 3.3/5 🔴	ℹ️ not activated (expected)	🟡 0.29	❌ [22]
test-tagging	Tag a partially-tagged MSTest suite without duplicating existing traits	4.0/5 → 4.7/5 🟢	✅ test-tagging; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.29	❌ [23]
test-tagging	Accurately classify NUnit tests with misleading method names	4.0/5 → 5.0/5 🟢	✅ test-tagging; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.29	✅ [24]
test-tagging	Tag MSTest tests and verify the project still builds	5.0/5 → 4.7/5 🔴	✅ test-tagging; tools: skill	🟡 0.29	❌ [25]

[1] ⚠️ High run-to-run variance (CV=339%) — consider re-running with --runs 5
[2] ⚠️ High run-to-run variance (CV=116%) — consider re-running with --runs 5
[3] ⚠️ High run-to-run variance (CV=86%) — consider re-running with --runs 5
[4] ⚠️ High run-to-run variance (CV=162%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -21.4% due to: judgment, quality, time (44.6s → 56.7s), tool calls (5 → 6)
[5] ⚠️ High run-to-run variance (CV=363%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -20.4% due to: judgment, quality
[6] ⚠️ High run-to-run variance (CV=412%) — consider re-running with --runs 5
[7] ⚠️ High run-to-run variance (CV=88%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -20.3% due to: judgment, tokens (12870 → 30170), tool calls (0 → 1), time (16.5s → 21.5s)
[8] ⚠️ High run-to-run variance (CV=115%) — consider re-running with --runs 5
[9] ⚠️ High run-to-run variance (CV=435%) — consider re-running with --runs 5
[10] ⚠️ High run-to-run variance (CV=844%) — consider re-running with --runs 5
[11] (Isolated) Quality unchanged but weighted score is -8.6% due to: tokens (12899 → 29929), tool calls (0 → 1), time (7.2s → 10.4s)
[12] ⚠️ High run-to-run variance (CV=175%) — consider re-running with --runs 5. (Isolated) Quality improved but weighted score is -18.8% due to: judgment, tokens (112499 → 603106), tool calls (6 → 25), time (66.1s → 159.8s)
[13] ⚠️ High run-to-run variance (CV=150%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -2.6% due to: tokens (180148 → 246970)
[14] ⚠️ High run-to-run variance (CV=332%) — consider re-running with --runs 5. (Plugin) Quality improved but weighted score is -2.2% due to: tokens (41133 → 55233), time (23.8s → 28.9s)
[15] ⚠️ High run-to-run variance (CV=139%) — consider re-running with --runs 5
[16] (Plugin) Quality unchanged but weighted score is -8.3% due to: tokens (41106 → 110745), tool calls (4 → 7), time (35.8s → 55.8s)
[17] ⚠️ High run-to-run variance (CV=272%) — consider re-running with --runs 5
[18] ⚠️ High run-to-run variance (CV=117%) — consider re-running with --runs 5
[19] ⚠️ High run-to-run variance (CV=303%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -17.2% due to: judgment, quality, tokens (87033 → 129397)
[20] ⚠️ High run-to-run variance (CV=96%) — consider re-running with --runs 5
[21] ⚠️ High run-to-run variance (CV=65%) — consider re-running with --runs 5
[22] ⚠️ High run-to-run variance (CV=73%) — consider re-running with --runs 5
[23] ⚠️ High run-to-run variance (CV=69%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -4.4% due to: tokens (155131 → 282226), time (76.4s → 96.2s)
[24] ⚠️ High run-to-run variance (CV=239%) — consider re-running with --runs 5
[25] ⚠️ High run-to-run variance (CV=231%) — consider re-running with --runs 5
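
CV in the footnotes above presumably stands for coefficient of variation, conventionally the standard deviation divided by the mean. The sketch below uses that textbook definition with the sample standard deviation; which per-run quantity the validator feeds into it is not stated in this thread, so the input values are made up for illustration.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(samples) / abs(mean)


if __name__ == "__main__":
    # Hypothetical per-run scores: small, mixed-sign values around zero.
    noisy_runs = [0.30, -0.10, 0.05]
    # Hypothetical per-run scores that agree closely with each other.
    steady_runs = [0.30, 0.28, 0.32]
    print(f"noisy:  CV={coefficient_of_variation(noisy_runs):.0%}")
    print(f"steady: CV={coefficient_of_variation(steady_runs):.0%}")
```

Because the mean sits in the denominator, a handful of runs whose average is close to zero can produce a CV well above 100%, which is one plausible reading of the large values reported here and of the suggestion to re-run with more runs.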

⏰ timeout — run(s) hit the (120s, 180s) scenario timeout limit; scoring may be impacted by aborting model execution before it could produce its full output (increase via timeout in eval.yaml)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

To investigate failures, paste this to your AI coding agent:

For PR 703 in dotnet/skills, download eval artifacts with gh run download 26987777071 --repo dotnet/skills --pattern "skill-validator-results-*" --dir ./eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/8a00c7e65386eb3b4ea68dacd5f923a766682b70/eng/skill-validator/src/docs/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

github-actions · 2026-06-05T11:06:57Z

Skill Validation Results

Skill	Scenario	Quality	Skills Loaded	Overfit	Verdict
binlog-generation	Build project with /bl flag	1.0/5 → 2.0/5 🟢	⚠️ NOT ACTIVATED	🟡 0.35	✅
binlog-generation	Build with /bl in PowerShell	3.0/5 → 5.0/5 🟢	✅ binlog-generation; tools: skill, glob / ⚠️ NOT ACTIVATED	🟡 0.35	❌ [1]
binlog-generation	Build multiple configurations with unique binlogs	4.0/5 → 5.0/5 🟢	✅ binlog-generation; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.35	❌
build-perf-baseline	Establish build performance baseline and recommend optimizations	3.0/5 → 4.0/5 🟢	✅ build-perf-baseline; tools: skill, binlog-binlog_overview, binlog-binlog_diagnose, binlog-binlog_expensive_projects, binlog-binlog_expensive_tasks, binlog-binlog_expensive_analyzers, binlog-binlog_double_writes / ⚠️ NOT ACTIVATED	🟡 0.26	❌ [2]
eval-performance	Analyze MSBuild evaluation performance issues	5.0/5 → 5.0/5	✅ eval-performance; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.20	❌
including-generated-files	Diagnose generated file inclusion failure	3.0/5 → 5.0/5 🟢	✅ including-generated-files; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.23	✅
incremental-build	Analyze incremental build issues	3.0/5 → 4.0/5 🟢	⚠️ NOT ACTIVATED	✅ 0.13	❌ [3]
msbuild-modernization	Modernize legacy project to SDK-style	5.0/5 → 5.0/5	✅ msbuild-modernization; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.06	❌ [4]
msbuild-server	Recommend MSBuild Server for slow CLI incremental builds	3.0/5 → 5.0/5 🟢	✅ msbuild-server; tools: skill	✅ 0.15	✅
resolve-project-references	Explain misleading ResolveProjectReferences time	4.0/5 → 5.0/5 🟢	✅ resolve-project-references; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.16	❌ [5]
build-parallelism	Analyze build parallelism bottlenecks	4.0/5 → 4.0/5	✅ build-parallelism; tools: skill, binlog-binlog_overview, binlog-binlog_expensive_projects, binlog-binlog_projects, glob, binlog-binlog_search, binlog-binlog_expensive_targets, binlog-binlog_project_target_times / ⚠️ NOT ACTIVATED	✅ 0.20	❌ [6]
build-perf-diagnostics	Diagnose slow build for a small project	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	🟡 0.21	❌ [7]
check-bin-obj-clash	Diagnose bin/obj output path clashes	4.0/5 → 5.0/5 🟢	✅ check-bin-obj-clash; tools: skill, binlog-binlog_overview, binlog-binlog_double_writes, binlog-binlog_evaluations, binlog-binlog_properties, binlog-binlog_evaluation_properties / ✅ check-bin-obj-clash; tools: skill, binlog-binlog_overview, binlog-binlog_errors, binlog-binlog_evaluations, binlog-binlog_properties, binlog-binlog_evaluation_global_properties, binlog-binlog_evaluation_properties	✅ 0.16	❌ [8]
directory-build-organization	Organize build infrastructure for a multi-project repo	3.0/5 → 5.0/5 🟢	✅ directory-build-organization; tools: skill / ✅ directory-build-organization; tools: skill, create, edit, bash	✅ 0.15	✅
extension-points	Diagnose build extension point failures	3.0/5 → 5.0/5 🟢	✅ extension-points; tools: skill	✅ 0.08	✅
extension-points	Diagnose NuGet package and repo extension conflicts	3.0/5 → 3.0/5	✅ extension-points; tools: skill / ✅ extension-points; tools: skill, edit	✅ 0.08	❌ [9]
extension-points	Fix extension point anti-patterns	5.0/5 → 5.0/5	✅ extension-points; tools: skill	✅ 0.08	❌
item-management	Diagnose item group and batching issues	4.0/5 → 5.0/5 🟢	✅ item-management; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.24	✅
item-management	Diagnose cascading item and batching bugs in code generation pipeline	4.0/5 → 4.0/5	✅ item-management; tools: skill, edit, bash / ⚠️ NOT ACTIVATED	🟡 0.24	❌ [10]
item-management	Fix item management anti-patterns	4.0/5 → 4.0/5	✅ item-management; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.24	✅
binlog-failure-analysis	Diagnose build failures from binlog only (no source files)	4.0/5 → 4.0/5	⚠️ NOT ACTIVATED	✅ 0.09	✅
msbuild-antipatterns	Review MSBuild files for anti-patterns and style issues	5.0/5 → 5.0/5	✅ msbuild-antipatterns; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.09	❌ [11]
msbuild-antipatterns	Add a module to an F# project	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	✅ 0.09	❌ [12]
msbuild-antipatterns	Fix broken file order causing FS0039	4.0/5 → 4.0/5	⚠️ NOT ACTIVATED	✅ 0.09	❌ [13]
msbuild-antipatterns	Add a signature file to define public API	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	✅ 0.09	❌ [14]
property-patterns	Diagnose shared build property issues	5.0/5 → 5.0/5	✅ property-patterns; tools: skill	✅ 0.16	❌ [15]
property-patterns	Diagnose multi-level property hierarchy bugs	4.0/5 → 5.0/5 🟢	✅ property-patterns; tools: skill	✅ 0.16	✅
property-patterns	Fix shared property configuration	5.0/5 → 5.0/5	✅ property-patterns; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.16	❌ [16]
target-authoring	Diagnose custom target build regression	3.0/5 → 5.0/5 🟢	✅ target-authoring; tools: skill, bash / ✅ target-authoring; tools: skill	🟡 0.21	✅
target-authoring	Diagnose broken SDK target chain across files	3.0/5 → 3.0/5	✅ target-authoring; tools: skill	🟡 0.21	❌ [17]
target-authoring	Fix custom target anti-patterns	4.0/5 → 5.0/5 🟢	✅ target-authoring; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.21	✅

[1] (Plugin) Quality unchanged but weighted score is -3.7% due to: tokens (25703 → 43968)
[2] (Plugin) Quality unchanged but weighted score is -0.2% due to: tokens (138107 → 527176), quality, time (64.8s → 138.7s), tool calls (18 → 29)
[3] (Plugin) Quality unchanged but weighted score is -4.2% due to: tokens (26372 → 44627), time (16.9s → 22.1s)
[4] (Plugin) Quality unchanged but weighted score is -2.9% due to: tokens (71862 → 117654)
[5] (Plugin) Quality unchanged but weighted score is -7.1% due to: quality, tokens (54412 → 90374)
[6] (Plugin) Quality unchanged but weighted score is -4.7% due to: tokens (85882 → 264225), time (43.5s → 72.9s), tool calls (10 → 15)
[7] (Isolated) Quality unchanged but weighted score is -5.5% due to: tokens (27429 → 56800), time (22.2s → 26.6s)
[8] (Isolated) Quality improved but weighted score is -6.9% due to: quality, tokens (124272 → 186252), tool calls (11 → 18)
[9] (Plugin) Quality unchanged but weighted score is -0.3% due to: tokens (57731 → 217025), time (65.7s → 112.6s), tool calls (10 → 16)
[10] (Plugin) Quality unchanged but weighted score is -8.7% due to: tokens (42945 → 199607), tool calls (5 → 17), time (56.0s → 82.0s)
[11] (Plugin) Quality unchanged but weighted score is -11.4% due to: tokens (59956 → 190372), quality, time (44.2s → 111.0s), tool calls (15 → 19)
[12] (Plugin) Quality unchanged but weighted score is -3.4% due to: tokens (98559 → 162241)
[13] (Plugin) Quality unchanged but weighted score is -3.8% due to: tokens (68233 → 113622)
[14] (Plugin) Quality unchanged but weighted score is -4.2% due to: tokens (67584 → 114194)
[15] (Plugin) Quality unchanged but weighted score is -4.9% due to: tokens (158177 → 355582)
[16] (Plugin) Quality unchanged but weighted score is -4.0% due to: tokens (132842 → 252543)
[17] (Isolated) Quality unchanged but weighted score is -7.2% due to: tokens (72813 → 137912), time (37.3s → 111.3s)
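
Most of the footnotes above attribute the lower weighted score to token growth. One way to compare them is to extract the before and after token counts and compute a growth factor per footnote; the sketch below does this for three footnote lines copied from the list above. The regular expression and the helper name `token_growth` are illustrative assumptions, not part of the validator.

```python
import re

# Three footnote lines copied verbatim from the results above.
FOOTNOTES = """\
[2] (Plugin) Quality unchanged but weighted score is -0.2% due to: tokens (138107 → 527176), quality, time (64.8s → 138.7s), tool calls (18 → 29)
[10] (Plugin) Quality unchanged but weighted score is -8.7% due to: tokens (42945 → 199607), tool calls (5 → 17), time (56.0s → 82.0s)
[15] (Plugin) Quality unchanged but weighted score is -4.9% due to: tokens (158177 → 355582)
"""

# Footnote number, then the first "tokens (before → after)" pair on the line.
TOKENS = re.compile(r"^\[(\d+)\].*?tokens \((\d+) → (\d+)\)")


def token_growth(text):
    """Map footnote number to after/before token ratio."""
    growth = {}
    for line in text.splitlines():
        match = TOKENS.search(line)
        if match:
            number, before, after = (int(group) for group in match.groups())
            growth[number] = after / before
    return growth


if __name__ == "__main__":
    for number, factor in sorted(token_growth(FOOTNOTES).items()):
        print(f"[{number}] tokens grew {factor:.2f}x")
```

Sorting the factors across all footnotes would show which scenarios account for the largest increases and are worth inspecting first.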

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

To investigate failures, paste this to your AI coding agent:

For PR 703 in dotnet/skills, download eval artifacts with gh run download 27010466421 --repo dotnet/skills --pattern "skill-validator-results-*" --dir ./eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/a8bb6fc544b5456f92c984c96db75c91dbd35437/eng/skill-validator/src/docs/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

▶ Sessions Visualisation -- interactive replay of all evaluation sessions
📊 Session Analytics (preview) -- aggregated metrics across evaluation sessions

AbhitejJohn · 2026-06-08T16:20:18Z

@javiercn: Based on the evals, it looks like token consumption increased without much change in quality. Would you mind taking a deeper look, please?

Add MCP servers to dotnet-blazor plugin

58c330f

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 29, 2026 15:19

javiercn requested review from JanKrivanek and ViktorHofer as code owners May 29, 2026 15:19

Copilot started reviewing on behalf of javiercn May 29, 2026 15:22 View session

Copilot AI reviewed May 29, 2026

Comment thread plugins/dotnet-blazor/plugin.json

Comment thread plugins/dotnet-blazor/plugin.json Outdated

Comment thread plugins/dotnet-blazor/plugin.json

Comment thread plugins/dotnet-blazor/plugin.json Outdated

javiercn requested review from danroth27 and lewing May 29, 2026 15:23

javiercn and others added 2 commits May 29, 2026 17:27

Handle non-stdio MCP servers in skill-validator

c12dfff

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use stdio for Playwright MCP server

df9625d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 29, 2026 15:51

Copilot started reviewing on behalf of javiercn May 29, 2026 15:52 View session

Copilot AI reviewed May 29, 2026

Comment thread plugins/dotnet-blazor/plugin.json

Potential fix for pull request finding

a0be15f

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 29, 2026 16:02

Copilot started reviewing on behalf of javiercn May 29, 2026 16:02 View session

Copilot AI reviewed May 29, 2026

Comment thread plugins/dotnet-blazor/plugin.json Outdated

Potential fix for pull request finding

463bb05

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 29, 2026 16:12

Copilot started reviewing on behalf of javiercn May 29, 2026 16:12 View session

javiercn commented May 29, 2026

Comment thread plugins/dotnet-blazor/plugin.json Outdated

Copilot AI reviewed May 29, 2026

Comment thread plugins/dotnet-blazor/plugin.json Outdated

Copilot started work on behalf of javiercn May 29, 2026 16:15 View session

fix: correct dotnet-blazor playwright MCP manifest format

579fcb7

Copilot finished work on behalf of javiercn May 29, 2026 16:18

github-actions Bot added a commit that referenced this pull request May 29, 2026

Update PR token usage data (PR #703)

3997b05

github-actions Bot added a commit that referenced this pull request May 29, 2026

Update session data (PR #703)

8bb1cc8

github-actions Bot added the waiting-on-review label (PR state label) Jun 3, 2026

danroth27 approved these changes Jun 3, 2026

AbhitejJohn approved these changes Jun 4, 2026

Comment thread plugins/dotnet-blazor/plugin.json Outdated

Comment thread plugins/dotnet-blazor/plugin.json

Copilot AI mentioned this pull request Jun 4, 2026

[WIP] Address feedback on MCP servers for dotnet-blazor plugin integration #723

Merged

github-actions Bot added the waiting-on-author label (PR state label) and removed the waiting-on-review label (PR state label) Jun 4, 2026

Copilot AI review requested due to automatic review settings June 4, 2026 08:01

Copilot started reviewing on behalf of javiercn June 4, 2026 08:01 View session

Copilot AI reviewed Jun 4, 2026

Merge branch 'main' into javiercn/add-mcp-servers-to-blazor-plugin

a8bb6fc

github-actions Bot added a commit that referenced this pull request Jun 5, 2026

Update PR token usage data (PR #703)

d9a7ba0

github-actions Bot added the pr-state/ready-for-eval label (PR is mergeable and awaiting evaluation) and removed the waiting-on-author label (PR state label) Jun 5, 2026

JanKrivanek added the evaluate-now label ("Trigger evaluation.yml for current PR head (transient)") Jun 5, 2026

github-actions Bot removed the evaluate-now label ("Trigger evaluation.yml for current PR head (transient)") Jun 5, 2026

JanKrivanek mentioned this pull request Jun 5, 2026

Fix the auto-evaluation triggering in pr review agentic workflows #728

Merged

github-actions Bot added a commit that referenced this pull request Jun 5, 2026

Update PR token usage data (PR #703)

eaed871


6 participants
