Add JSON output for skill-validator check #601

Open
aaronpowell wants to merge 4 commits into dotnet:main from aaronpowell:validator-output-as-json

Conversation

@aaronpowell commented Apr 30, 2026

Summary

  • refactor skill-validator check to build a shared structured report and render either console output or JSON
  • add a slim machine-readable JSON contract with structured warnings/errors for plugins, skills, and agents
  • update README guidance and extend check-command tests for JSON output and early failures

Fixes #600

Here's an example of the JSON output from running against awesome-copilot's skills list (truncated).
{
  "counts": {
    "pluginCount": 0,
    "skillCount": 325,
    "agentCount": 0
  },
  "plugins": [],
  "skills": [
    {
      "name": "acquire-codebase-knowledge",
      "path": "..\\awesome-copilot\\skills\\acquire-codebase-knowledge",
      "skillMdPath": "..\\awesome-copilot\\skills\\acquire-codebase-knowledge\\SKILL.md",
      "errors": [
        "File reference \u0027assets/templates/STACK.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/STRUCTURE.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/ARCHITECTURE.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/CONVENTIONS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/INTEGRATIONS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/TESTING.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/CONCERNS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/STACK.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/STRUCTURE.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/ARCHITECTURE.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/CONVENTIONS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/INTEGRATIONS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/TESTING.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md.",
        "File reference \u0027assets/templates/CONCERNS.md\u0027 is 2 directories deep \u2014 maximum is 1 level from SKILL.md."
      ],
      "warnings": [],
      "profile": {
        "name": "acquire-codebase-knowledge",
        "chars4TokenCount": 2270,
        "bpeTokenCount": 2169,
        "complexityTier": "detailed",
        "sectionCount": 12,
        "codeBlockCount": 3,
        "numberedStepCount": 23,
        "bulletCount": 15,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "acquire-codebase-knowledge: 2,169 BPE tokens [chars/4: 2,270] (detailed \u2713), 12 sections, 3 code blocks"
    },
    {
      "name": "add-educational-comments",
      "path": "..\\awesome-copilot\\skills\\add-educational-comments",
      "skillMdPath": "..\\awesome-copilot\\skills\\add-educational-comments\\SKILL.md",
      "errors": [],
      "warnings": [],
      "profile": {
        "name": "add-educational-comments",
        "chars4TokenCount": 1556,
        "bpeTokenCount": 1346,
        "complexityTier": "detailed",
        "sectionCount": 17,
        "codeBlockCount": 2,
        "numberedStepCount": 9,
        "bulletCount": 49,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "add-educational-comments: 1,346 BPE tokens [chars/4: 1,556] (detailed \u2713), 17 sections, 2 code blocks"
    },
    {
      "name": "adobe-illustrator-scripting",
      "path": "..\\awesome-copilot\\skills\\adobe-illustrator-scripting",
      "skillMdPath": "..\\awesome-copilot\\skills\\adobe-illustrator-scripting\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "Skill is 5,832 BPE tokens (chars/4 estimate: 5,794) \u2014 \u0022comprehensive\u0022 skills hurt performance by 2.9pp on average. Consider splitting into 2\u20133 focused skills."
        },
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "adobe-illustrator-scripting",
        "chars4TokenCount": 5794,
        "bpeTokenCount": 5832,
        "complexityTier": "comprehensive",
        "sectionCount": 64,
        "codeBlockCount": 26,
        "numberedStepCount": 0,
        "bulletCount": 43,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "adobe-illustrator-scripting: 5,832 BPE tokens [chars/4: 5,794] (comprehensive \u2717), 64 sections, 26 code blocks"
    },
    {
      "name": "agent-governance",
      "path": "..\\awesome-copilot\\skills\\agent-governance",
      "skillMdPath": "..\\awesome-copilot\\skills\\agent-governance\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "Skill is 4,203 BPE tokens (chars/4 estimate: 4,626) \u2014 approaching \u0022comprehensive\u0022 range where gains diminish."
        },
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "agent-governance",
        "chars4TokenCount": 4626,
        "bpeTokenCount": 4203,
        "complexityTier": "standard",
        "sectionCount": 33,
        "codeBlockCount": 14,
        "numberedStepCount": 0,
        "bulletCount": 19,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "agent-governance: 4,203 BPE tokens [chars/4: 4,626] (standard ~), 33 sections, 14 code blocks"
    },
    {
      "name": "agent-owasp-compliance",
      "path": "..\\awesome-copilot\\skills\\agent-owasp-compliance",
      "skillMdPath": "..\\awesome-copilot\\skills\\agent-owasp-compliance\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "Skill is 2,754 BPE tokens (chars/4 estimate: 2,970) \u2014 approaching \u0022comprehensive\u0022 range where gains diminish."
        }
      ],
      "profile": {
        "name": "agent-owasp-compliance",
        "chars4TokenCount": 2970,
        "bpeTokenCount": 2754,
        "complexityTier": "standard",
        "sectionCount": 22,
        "codeBlockCount": 7,
        "numberedStepCount": 10,
        "bulletCount": 43,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "agent-owasp-compliance: 2,754 BPE tokens [chars/4: 2,970] (standard ~), 22 sections, 7 code blocks"
    },
    {
      "name": "agent-supply-chain",
      "path": "..\\awesome-copilot\\skills\\agent-supply-chain",
      "skillMdPath": "..\\awesome-copilot\\skills\\agent-supply-chain\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "Skill is 2,515 BPE tokens (chars/4 estimate: 2,677) \u2014 approaching \u0022comprehensive\u0022 range where gains diminish."
        },
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "agent-supply-chain",
        "chars4TokenCount": 2677,
        "bpeTokenCount": 2515,
        "complexityTier": "standard",
        "sectionCount": 13,
        "codeBlockCount": 8,
        "numberedStepCount": 0,
        "bulletCount": 10,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "agent-supply-chain: 2,515 BPE tokens [chars/4: 2,677] (standard ~), 13 sections, 8 code blocks"
    },
    {
      "name": "agentic-eval",
      "path": "..\\awesome-copilot\\skills\\agentic-eval",
      "skillMdPath": "..\\awesome-copilot\\skills\\agentic-eval\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "agentic-eval",
        "chars4TokenCount": 1466,
        "bpeTokenCount": 1358,
        "complexityTier": "detailed",
        "sectionCount": 16,
        "codeBlockCount": 8,
        "numberedStepCount": 0,
        "bulletCount": 13,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "agentic-eval: 1,358 BPE tokens [chars/4: 1,466] (detailed \u2713), 16 sections, 8 code blocks"
    },
    {
      "name": "ai-prompt-engineering-safety-review",
      "path": "..\\awesome-copilot\\skills\\ai-prompt-engineering-safety-review",
      "skillMdPath": "..\\awesome-copilot\\skills\\ai-prompt-engineering-safety-review\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        }
      ],
      "profile": {
        "name": "ai-prompt-engineering-safety-review",
        "chars4TokenCount": 2540,
        "bpeTokenCount": 2185,
        "complexityTier": "detailed",
        "sectionCount": 19,
        "codeBlockCount": 0,
        "numberedStepCount": 20,
        "bulletCount": 104,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "ai-prompt-engineering-safety-review: 2,185 BPE tokens [chars/4: 2,540] (detailed \u2713), 19 sections, 0 code blocks"
    },
    {
      "name": "ai-ready",
      "path": "..\\awesome-copilot\\skills\\ai-ready",
      "skillMdPath": "..\\awesome-copilot\\skills\\ai-ready\\SKILL.md",
      "errors": [],
      "warnings": [],
      "profile": {
        "name": "ai-ready",
        "chars4TokenCount": 516,
        "bpeTokenCount": 521,
        "complexityTier": "detailed",
        "sectionCount": 2,
        "codeBlockCount": 3,
        "numberedStepCount": 4,
        "bulletCount": 0,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "ai-ready: 521 BPE tokens [chars/4: 516] (detailed \u2713), 2 sections, 3 code blocks"
    },
    {
      "name": "ai-team-orchestration",
      "path": "..\\awesome-copilot\\skills\\ai-team-orchestration",
      "skillMdPath": "..\\awesome-copilot\\skills\\ai-team-orchestration\\SKILL.md",
      "errors": [],
      "warnings": [],
      "profile": {
        "name": "ai-team-orchestration",
        "chars4TokenCount": 1400,
        "bpeTokenCount": 1436,
        "complexityTier": "detailed",
        "sectionCount": 13,
        "codeBlockCount": 5,
        "numberedStepCount": 17,
        "bulletCount": 12,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "ai-team-orchestration: 1,436 BPE tokens [chars/4: 1,400] (detailed \u2713), 13 sections, 5 code blocks"
    },
    {
      "name": "appinsights-instrumentation",
      "path": "..\\awesome-copilot\\skills\\appinsights-instrumentation",
      "skillMdPath": "..\\awesome-copilot\\skills\\appinsights-instrumentation\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        },
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "appinsights-instrumentation",
        "chars4TokenCount": 616,
        "bpeTokenCount": 547,
        "complexityTier": "detailed",
        "sectionCount": 9,
        "codeBlockCount": 0,
        "numberedStepCount": 0,
        "bulletCount": 7,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "appinsights-instrumentation: 547 BPE tokens [chars/4: 616] (detailed \u2713), 9 sections, 0 code blocks"
    },
    {
      "name": "apple-appstore-reviewer",
      "path": "..\\awesome-copilot\\skills\\apple-appstore-reviewer",
      "skillMdPath": "..\\awesome-copilot\\skills\\apple-appstore-reviewer\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        }
      ],
      "profile": {
        "name": "apple-appstore-reviewer",
        "chars4TokenCount": 2336,
        "bpeTokenCount": 2047,
        "complexityTier": "detailed",
        "sectionCount": 36,
        "codeBlockCount": 0,
        "numberedStepCount": 8,
        "bulletCount": 127,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "apple-appstore-reviewer: 2,047 BPE tokens [chars/4: 2,336] (detailed \u2713), 36 sections, 0 code blocks"
    },
    {
      "name": "arch-linux-triage",
      "path": "..\\awesome-copilot\\skills\\arch-linux-triage",
      "skillMdPath": "..\\awesome-copilot\\skills\\arch-linux-triage\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        }
      ],
      "profile": {
        "name": "arch-linux-triage",
        "chars4TokenCount": 232,
        "bpeTokenCount": 212,
        "complexityTier": "compact",
        "sectionCount": 4,
        "codeBlockCount": 0,
        "numberedStepCount": 6,
        "bulletCount": 8,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "arch-linux-triage: 212 BPE tokens [chars/4: 232] (compact \u2713), 4 sections, 0 code blocks"
    },
    {
      "name": "architecture-blueprint-generator",
      "path": "..\\awesome-copilot\\skills\\architecture-blueprint-generator",
      "skillMdPath": "..\\awesome-copilot\\skills\\architecture-blueprint-generator\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "Skill is 2,519 BPE tokens (chars/4 estimate: 3,340) \u2014 approaching \u0022comprehensive\u0022 range where gains diminish."
        },
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        },
        {
          "kind": "profile",
          "message": "No numbered workflow steps \u2014 agents follow sequenced procedures more reliably."
        }
      ],
      "profile": {
        "name": "architecture-blueprint-generator",
        "chars4TokenCount": 3340,
        "bpeTokenCount": 2519,
        "complexityTier": "standard",
        "sectionCount": 18,
        "codeBlockCount": 0,
        "numberedStepCount": 0,
        "bulletCount": 104,
        "hasFrontmatter": true,
        "hasWhenToUse": false,
        "hasWhenNotToUse": false
      },
      "profileLine": "architecture-blueprint-generator: 2,519 BPE tokens [chars/4: 3,340] (standard ~), 18 sections, 0 code blocks"
    },
    {
      "name": "arduino-azure-iot-edge-integration",
      "path": "..\\awesome-copilot\\skills\\arduino-azure-iot-edge-integration",
      "skillMdPath": "..\\awesome-copilot\\skills\\arduino-azure-iot-edge-integration\\SKILL.md",
      "errors": [],
      "warnings": [
        {
          "kind": "profile",
          "message": "No code blocks \u2014 agents perform better with concrete snippets and commands."
        }
      ],
      "profile": {
        "name": "arduino-azure-iot-edge-integration",
        "chars4TokenCount": 1171,
        "bpeTokenCount": 929,
        "complexityTier": "detailed",
        "sectionCount": 18,
        "codeBlockCount": 0,
        "numberedStepCount": 10,
        "bulletCount": 44,
        "hasFrontmatter": true,
        "hasWhenToUse": true,
        "hasWhenNotToUse": false
      },
      "profileLine": "arduino-azure-iot-edge-integration: 929 BPE tokens [chars/4: 1,171] (detailed \u2713), 18 sections, 0 code blocks"
    }
  ],
  "agents": []
}
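
A downstream tool could read this payload with System.Text.Json. The following is a minimal sketch, not part of the PR: the sample document is a hand-trimmed stand-in that reuses the property names from the output above, and the summary format is an assumption. It needs C# 11 or later for the raw string literal.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hand-written sample shaped like the output above (not real validator output).
const string json = """
{
  "counts": { "pluginCount": 0, "skillCount": 2, "agentCount": 0 },
  "plugins": [],
  "skills": [
    { "name": "acquire-codebase-knowledge", "errors": ["example error"], "warnings": [] },
    { "name": "agentic-eval", "errors": [], "warnings": [ { "kind": "profile", "message": "example warning" } ] }
  ],
  "agents": []
}
""";

using var doc = JsonDocument.Parse(json);
var skills = doc.RootElement.GetProperty("skills");

// Print one summary line per skill.
foreach (var skill in skills.EnumerateArray())
{
    var name = skill.GetProperty("name").GetString();
    var errorCount = skill.GetProperty("errors").GetArrayLength();
    var warningCount = skill.GetProperty("warnings").GetArrayLength();
    Console.WriteLine($"{name}: {errorCount} error(s), {warningCount} warning(s)");
}

// Count skills that reported at least one error.
var failing = skills.EnumerateArray().Count(s => s.GetProperty("errors").GetArrayLength() > 0);
Console.WriteLine($"Skills with errors: {failing}");
```

Note that `errors` entries are plain strings while `warnings` entries are objects with `kind` and `message`, so a consumer has to handle the two arrays differently.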

Fixes dotnet#600

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 30, 2026 04:39
@github-actions

Contributor

Note

This PR is from a fork and modifies infrastructure files (eng/ or .github/).

Changes to infrastructure typically need to be submitted from a branch in dotnet/skills (not a fork) so that CI workflows run with the correct permissions and secrets.

Please consider recreating this PR from an upstream branch. If you don't have push access to dotnet/skills, ask a maintainer to push your branch for you.

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a --json output mode to skill-validator check by refactoring the command to build a structured in-memory report first, then render either human-readable console output or a slim JSON payload suitable for downstream tooling.

Changes:

  • Refactor check to generate a shared CheckReport and render it to console or JSON (--json).
  • Introduce new check result/report models (CheckReport, CheckJsonOutput, per-domain result types) and JSON source-gen support.
  • Update README usage guidance and extend check-command tests to cover JSON output and early failure scenarios.
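
The "build one report, then pick a renderer" shape described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `SketchReport`, `SketchSkill`, and `Render` are hypothetical names, and the real `CheckReport` carries more data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Build the report once, independent of how it will be displayed.
var report = new SketchReport(new List<SketchSkill>
{
    new("agentic-eval", new List<string>(), new List<string> { "No numbered workflow steps" }),
});

// Hypothetical renderer: the same report feeds both output modes.
static string Render(SketchReport report, bool asJson)
{
    if (asJson)
    {
        var options = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };
        return JsonSerializer.Serialize(report, options);
    }

    var lines = new List<string>();
    foreach (var skill in report.Skills)
        lines.Add($"{skill.Name}: {skill.Errors.Count} error(s), {skill.Warnings.Count} warning(s)");
    return string.Join(Environment.NewLine, lines);
}

Console.WriteLine(Render(report, asJson: false));
Console.WriteLine(Render(report, asJson: true));

// Hypothetical model types for the sketch.
record SketchSkill(string Name, List<string> Errors, List<string> Warnings);
record SketchReport(List<SketchSkill> Skills);
```

Keeping validation separate from rendering means both modes report the same findings, and only the presentation differs.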
Show a summary per file

| File | Description |
| --- | --- |
| eng/skill-validator/tests/Check/CheckCommandTests.cs | Adds JSON-output tests and console capture utilities; introduces an xUnit collection to prevent parallel console mutation. |
| eng/skill-validator/src/SkillValidatorJsonContext.cs | Registers CheckJsonOutput for source-generated JSON serialization and enables string-enum conversion. |
| eng/skill-validator/src/README.md | Documents the new --json flag and clarifies stdout behavior. |
| eng/skill-validator/src/Check/PluginProfiler.cs | Updates plugin validation to return the new PluginCheckResult model. |
| eng/skill-validator/src/Check/Models.cs | Adds new report/result models and the CheckOutputMode enum used by check. |
| eng/skill-validator/src/Check/CheckCommand.cs | Implements report building + rendering pipeline and adds --json flag handling. |

Copilot's findings

Comments suppressed due to low confidence (1)

eng/skill-validator/src/Check/CheckCommand.cs:329

  • When no agents are discovered, this method records an error but returns Result = 0. In the combined skills+agents flow, this can allow a successful exit code even though the report contains an error, which is especially problematic for --json consumers. Returning a failure result (when agent paths were explicitly provided) or making ExitCode reflect GeneralErrors would avoid this inconsistency.
        if (allAgents.Count == 0)
        {
            if (agentPaths.Count > 0)
            {
                var searched = string.Join(", ", agentPaths.Select(p => $"\"{Path.GetFullPath(p)}\""));
                builder.AddPlainError($"No agents found in the specified paths: {searched}");
            }

            return ([], [], 0);
        }
  • Files reviewed: 6/6 changed files
  • Comments generated: 2
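
One way to address the suppressed comment above is to return a non-zero result whenever explicitly provided paths yield nothing. The sketch below is an assumption about how that could look, not the change that landed: `ResultForEmptyDiscovery` is a hypothetical helper, and the error list stands in for the report builder.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: decides the result when discovery finds no agents.
static int ResultForEmptyDiscovery(IReadOnlyList<string> explicitPaths, List<string> generalErrors)
{
    // Nothing was requested explicitly, so an empty result is not a failure.
    if (explicitPaths.Count == 0)
        return 0;

    // Explicit paths that discover nothing are reported and fail the run.
    var searched = string.Join(", ", explicitPaths.Select(p => $"\"{Path.GetFullPath(p)}\""));
    generalErrors.Add($"No agents found in the specified paths: {searched}");
    return 1;
}

var errors = new List<string>();
var withPaths = ResultForEmptyDiscovery(new[] { "agents" }, errors);
var withoutPaths = ResultForEmptyDiscovery(Array.Empty<string>(), errors);
Console.WriteLine($"explicit paths: {withPaths}, no paths: {withoutPaths}, errors recorded: {errors.Count}");
```

With this shape, the exit code and the recorded error cannot disagree, which is what a `--json` consumer relies on.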

Comment thread eng/skill-validator/src/Check/CheckCommand.cs
Comment thread eng/skill-validator/tests/Check/CheckCommandTests.cs
@ViktorHofer

Member

@aaronpowell this looks great. Happy to take it after you've resolved the Copilot comments.

Address PR feedback by failing when explicit skill or agent paths discover nothing, including the combined check path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@aaronpowell

Author

@ViktorHofer done

@AbhitejJohn

Contributor

Tagging @ViktorHofer for a re-review.

@ViktorHofer

Member

@copilot resolve the merge conflicts in this pull request

Copilot AI review requested due to automatic review settings May 28, 2026 02:23
@aaronpowell

Author

@ViktorHofer - done

Copilot AI left a comment

Contributor

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 5

Comment on lines +219 to +222
var message = $"Plugin '{pluginName}' aggregate description size is {totalChars:N0} characters — maximum is {SkillProfiler.MaxAggregateDescriptionLength:N0}.";
if (builder.Plugins.FirstOrDefault(p => string.Equals(p.Name, pluginName, StringComparison.Ordinal)) is { } pluginResult)
    pluginResult.Errors.Add(message);
return 1;
Comment on lines +725 to +734
if (string.Equals(fullCandidatePath, fullContainerPath, StringComparison.OrdinalIgnoreCase))
    return true;

if (File.Exists(fullContainerPath))
    fullContainerPath = Path.GetDirectoryName(fullContainerPath)!;

var normalizedContainer = fullContainerPath.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
    + Path.DirectorySeparatorChar;

return fullCandidatePath.StartsWith(normalizedContainer, StringComparison.OrdinalIgnoreCase);
Comment on lines 11 to +15
      WriteIndented = true,
      PropertyNamingPolicy = JsonKnownNamingPolicy.CamelCase,
      PropertyNameCaseInsensitive = true,
-     DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull)]
+     DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
+     UseStringEnumConverter = true)]
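
To show what string-enum conversion changes in the serialized output, here is a small sketch. It uses the reflection-based serializer with `JsonStringEnumConverter` rather than the source-generated context configured above, and `SampleTier` and `Sample` are hypothetical types invented for the illustration.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Baseline options: enums serialize as their underlying numbers.
var numeric = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };

// Same options plus a converter that writes enum member names instead.
var withStrings = new JsonSerializerOptions(numeric);
withStrings.Converters.Add(new JsonStringEnumConverter());

var sample = new Sample(SampleTier.Detailed);
var numericJson = JsonSerializer.Serialize(sample, numeric);
var stringJson = JsonSerializer.Serialize(sample, withStrings);

Console.WriteLine(numericJson); // {"tier":1}
Console.WriteLine(stringJson);  // {"tier":"Detailed"}

// Hypothetical types used only for this sketch.
enum SampleTier { Compact, Detailed }
record Sample(SampleTier Tier);
```

String values tend to be easier for JSON consumers to read and stay stable if enum members are reordered, which is likely why the option matters for a machine-readable contract.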
Comment on lines +525 to +542
foreach (var dependency in report.ExternalDependencies)
{
    switch (dependency.Kind)
    {
        case "plugin":
            if (plugins.FirstOrDefault(plugin => string.Equals(plugin.Name, dependency.Name, StringComparison.Ordinal)) is { } plugin)
                plugin.Warnings.Add(new CheckJsonWarning("externalDependency", dependency.Message));
            break;
        case "skill":
            if (skills.FirstOrDefault(skill => string.Equals(skill.Name, dependency.Name, StringComparison.Ordinal)) is { } skill)
                skill.Warnings.Add(new CheckJsonWarning("externalDependency", dependency.Message));
            break;
        case "agent":
            if (agents.FirstOrDefault(agent => string.Equals(agent.Name, dependency.Name, StringComparison.Ordinal)) is { } agent)
                agent.Warnings.Add(new CheckJsonWarning("externalDependency", dependency.Message));
            break;
    }
}
Comment on lines +751 to +753
public void AddPlainError(string text) => GeneralErrors.Add(text);

public void AddGeneralError(string text) => GeneralErrors.Add(text);
@github-actions

github-actions Bot commented Jun 4, 2026

Contributor

✅ Automated diff scan completed for 7bbeb7f1 — no security concerns flagged.

This is an automated static analysis of the PR diff.

Note

🔒 Integrity filter blocked 51 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #601 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #601 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #95 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #92 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #91 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #89 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #88 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #87 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #86 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #85 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #80 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #79 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #73 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #53 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #51 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #34 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • ... and 35 more items

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Malicious Code Scan · ● 3.1M ·

@github-actions

github-actions Bot commented Jun 4, 2026

Contributor

✅ Automated diff scan completed for 7bbeb7f — no security concerns flagged.

This is an automated static analysis of the PR diff.

Note

🔒 Integrity filter blocked 7 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #601 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #601 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #694 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #686 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #598 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • Add JSON output for skill-validator check #601 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • 7bbeb7f list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Malicious Code Scan · ● 1.1M ·

@github-actions

github-actions Bot commented Jun 4, 2026

Contributor

✅ Automated diff scan completed for 7bbeb7f — no security concerns flagged.

This is an automated static analysis of the PR diff.

Note

🔒 Integrity filter blocked 68 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #601 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #601 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #598 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #694 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #686 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #237 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #486 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #329 search_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • Add JSON output for skill-validator check #601 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #694 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #686 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #682 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #609 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #607 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #601 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #598 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • ... and 52 more items

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Malicious Code Scan · ● 2.1M ·

@ViktorHofer

Member

/evaluate

@ViktorHofer

ViktorHofer commented Jun 9, 2026

Member

@aaronpowell do you want to resolve the remaining Copilot feedback as well?

github-actions Bot added a commit that referenced this pull request Jun 9, 2026
@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
| --- | --- | --- | --- | --- | --- |
| convert-blazor-server-to-webapp | Blazor Server app with CascadingAuthenticationState | 4.0/5 → 5.0/5 🟢 | ✅ convert-blazor-server-to-webapp; tools: report_intent, skill / ✅ convert-blazor-server-to-webapp; tools: skill | ✅ 0.04 | |
| configuring-opentelemetry-dotnet | Set up OpenTelemetry tracing and metrics with custom spans in ASP.NET Core | 4.0/5 → 5.0/5 🟢 | ✅ configuring-opentelemetry-dotnet; tools: report_intent, skill / ✅ configuring-opentelemetry-dotnet; tools: skill | ✅ 0.15 | |
| configuring-opentelemetry-dotnet | Configure all three OpenTelemetry signals with correct OTLP export | 4.0/5 → 5.0/5 🟢 | ✅ configuring-opentelemetry-dotnet; tools: skill / ✅ configuring-opentelemetry-dotnet; tools: report_intent, skill | ✅ 0.15 | |
| configuring-opentelemetry-dotnet | Propagate trace context across a message queue | 5.0/5 → 5.0/5 | ✅ configuring-opentelemetry-dotnet; tools: skill | ✅ 0.15 | [1] |
| minimal-api-file-upload | Implement secure file upload in ASP.NET Core 8 minimal API | 3.0/5 → 5.0/5 🟢 | ✅ minimal-api-file-upload; tools: skill | ✅ 0.09 | |
| minimal-api-file-upload | Upload multiple files with metadata in minimal API | 3.0/5 → 5.0/5 🟢 | ✅ minimal-api-file-upload; tools: skill | ✅ 0.09 | |
| minimal-api-file-upload | Stream very large file uploads without buffering | 5.0/5 → 5.0/5 | ✅ minimal-api-file-upload; tools: skill | ✅ 0.09 | [2] |
| dotnet-webapi | Create a CRUD Web API with minimal APIs, OpenAPI, and proper HTTP semantics | 4.0/5 → 5.0/5 🟢 | ✅ dotnet-webapi; tools: skill | 🟡 0.29 | |
| dotnet-webapi | Add error handling with ProblemDetails and IExceptionHandler | 3.0/5 → 4.0/5 🟢 | ✅ dotnet-webapi; tools: skill | 🟡 0.29 | |
| dotnet-webapi | Add a new API endpoint to an existing controller-based project | 3.0/5 → 4.0/5 🟢 | ✅ dotnet-webapi; tools: skill, view | 🟡 0.29 | |
| binlog-generation | Build project with /bl flag | 2.0/5 → 5.0/5 🟢 | ✅ binlog-generation; tools: skill, glob / ⚠️ NOT ACTIVATED | 🟡 0.36 | |
| binlog-generation | Build with /bl in PowerShell | 4.0/5 → 5.0/5 🟢 | ✅ binlog-generation; tools: skill / ✅ binlog-generation; tools: skill, glob | 🟡 0.36 | |
| binlog-generation | Build multiple configurations with unique binlogs | 2.0/5 → 5.0/5 🟢 | ✅ binlog-generation; tools: skill / ⚠️ NOT ACTIVATED | 🟡 0.36 | [3] |
| build-perf-baseline | Establish build performance baseline and recommend optimizations | 3.0/5 → 4.0/5 🟢 | ✅ build-perf-baseline; tools: skill, binlog-binlog_overview, binlog-binlog_diagnose, binlog-binlog_build_graph, binlog-binlog_expensive_tasks, binlog-binlog_incremental_analysis, binlog-binlog_expensive_analyzers, binlog-binlog_target_reasons, binlog-binlog_expensive_projects, binlog-binlog_double_writes, binlog-binlog_warnings / ⚠️ NOT ACTIVATED | 🟡 0.42 | |
| eval-performance | Analyze MSBuild evaluation performance issues | 5.0/5 → 5.0/5 | ⚠️ NOT ACTIVATED / ✅ eval-performance; tools: skill | ✅ 0.18 | |
| including-generated-files | Diagnose generated file inclusion failure | 3.0/5 → 4.0/5 🟢 | ⚠️ NOT ACTIVATED / ✅ including-generated-files; tools: skill | 🟡 0.22 | |
| incremental-build | Analyze incremental build issues | 3.0/5 → 3.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.14 | [4] |
| msbuild-modernization | Modernize legacy project to SDK-style | 5.0/5 → 5.0/5 | ✅ msbuild-modernization; tools: skill | ✅ 0.08 | [5] |
| msbuild-server | Recommend MSBuild Server for slow CLI incremental builds | 3.0/5 → 5.0/5 🟢 | ✅ msbuild-server; tools: skill | ✅ 0.20 | |
| resolve-project-references | Explain misleading ResolveProjectReferences time | 3.0/5 → 5.0/5 🟢 | ✅ resolve-project-references; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.14 | |
| build-parallelism | Analyze build parallelism bottlenecks | 4.0/5 → 4.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.20 | [6] |
| build-perf-diagnostics | Diagnose slow build for a small project | 5.0/5 → 4.0/5 🔴 | ⚠️ NOT ACTIVATED | 🟡 0.28 | |
| check-bin-obj-clash | Diagnose bin/obj output path clashes | 4.0/5 → 5.0/5 🟢 | ✅ check-bin-obj-clash; tools: skill, binlog-binlog_overview, binlog-binlog_double_writes, binlog-binlog_evaluations, binlog-binlog_properties, binlog-binlog_evaluation_global_properties, binlog-binlog_evaluation_properties / ✅ check-bin-obj-clash; tools: binlog-binlog_double_writes, skill, binlog-binlog_evaluations, binlog-binlog_properties, binlog-binlog_evaluation_properties | ✅ 0.14 | |
| directory-build-organization | Organize build infrastructure for a multi-project repo | 4.0/5 → 5.0/5 🟢 | ✅ directory-build-organization; tools: skill / ⚠️ NOT ACTIVATED | 🟡 0.20 | |
| extension-points | Diagnose build extension point failures | 3.0/5 → 5.0/5 🟢 | ✅ extension-points; tools: skill | ✅ 0.09 | |
| extension-points | Diagnose NuGet package and repo extension conflicts | 3.0/5 → 3.0/5 | ✅ extension-points; tools: skill / ✅ extension-points; tools: skill, edit | ✅ 0.09 | [7] |
| extension-points | Fix extension point anti-patterns | 4.0/5 → 4.0/5 | ✅ extension-points; tools: skill, glob / ✅ extension-points; tools: skill | ✅ 0.09 | [8] |
| item-management | Diagnose item group and batching issues | 5.0/5 → 5.0/5 | ✅ item-management; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.15 | [9] |
| item-management | Diagnose cascading item and batching bugs in code generation pipeline | 4.0/5 → 4.0/5 | ✅ item-management; tools: skill, edit, bash / ⚠️ NOT ACTIVATED | ✅ 0.15 | [10] |
| item-management | Fix item management anti-patterns | 4.0/5 → 5.0/5 🟢 | ✅ item-management; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.15 | |
| binlog-failure-analysis | Diagnose build failures from binlog only (no source files) | 4.0/5 → 4.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.12 | |
| msbuild-antipatterns | Review MSBuild files for anti-patterns and style issues | 5.0/5 → 5.0/5 | ✅ msbuild-antipatterns; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.09 | |
| msbuild-antipatterns | Add a module to an F# project | 5.0/5 → 5.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.09 | |
| msbuild-antipatterns | Fix broken file order causing FS0039 | 4.0/5 → 4.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.09 | [11] |
| msbuild-antipatterns | Add a signature file to define public API | 5.0/5 → 5.0/5 | ⚠️ NOT ACTIVATED | ✅ 0.09 | [12] |
| property-patterns | Diagnose shared build property issues | 5.0/5 → 5.0/5 | ✅ property-patterns; tools: glob, skill / ⚠️ NOT ACTIVATED | 🟡 0.20 | [13] |
| property-patterns | Diagnose multi-level property hierarchy bugs | 4.0/5 → 4.0/5 | ✅ property-patterns; tools: skill / ✅ property-patterns; tools: skill, bash | 🟡 0.20 | |
| property-patterns | Fix shared property configuration | 5.0/5 → 5.0/5 | ✅ property-patterns; tools: skill / ⚠️ NOT ACTIVATED | 🟡 0.20 | |
| target-authoring | Diagnose custom target build regression | 3.0/5 → 5.0/5 🟢 | ✅ target-authoring; tools: skill, bash / ⚠️ NOT ACTIVATED | 🟡 0.25 | |
| target-authoring | Diagnose broken SDK target chain across files | 3.0/5 → 3.0/5 | ✅ target-authoring; tools: skill | 🟡 0.25 | [14] |
| target-authoring | Fix custom target anti-patterns | 4.0/5 → 4.0/5 | ✅ target-authoring; tools: glob, skill / ⚠️ NOT ACTIVATED | 🟡 0.25 | [15] |

[1] (Plugin) Quality unchanged but weighted score is -7.8% due to: tokens (13217 → 30098), tool calls (0 → 1)
[2] (Isolated) Quality unchanged but weighted score is -14.4% due to: judgment, tokens (13482 → 28916), tool calls (0 → 1)
[3] (Plugin) Quality unchanged but weighted score is -1.0% due to: tokens (38833 → 46490)
[4] (Plugin) Quality unchanged but weighted score is -5.6% due to: tokens (26482 → 45581), quality
[5] (Plugin) Quality unchanged but weighted score is -4.3% due to: tokens (73573 → 188493), time (29.9s → 48.1s), tool calls (10 → 14)
[6] (Plugin) Quality unchanged but weighted score is -0.2% due to: tokens (251708 → 296527)
[7] (Plugin) Quality unchanged but weighted score is -17.2% due to: quality, tokens (55825 → 216912), time (40.7s → 95.1s), tool calls (10 → 17)
[8] (Plugin) Quality unchanged but weighted score is -7.7% due to: tokens (87036 → 176376), time (44.1s → 90.5s)
[9] (Plugin) Quality unchanged but weighted score is -1.9% due to: tokens (26562 → 45891), time (22.1s → 35.2s)
[10] (Plugin) Quality unchanged but weighted score is -20.1% due to: quality, tokens (42193 → 120995), tool calls (5 → 10), time (48.5s → 59.9s)
[11] (Plugin) Quality unchanged but weighted score is -4.3% due to: tokens (67664 → 116144)
[12] (Plugin) Quality unchanged but weighted score is -6.6% due to: tokens (79795 → 137811), quality
[13] (Plugin) Quality unchanged but weighted score is -3.4% due to: tokens (110104 → 177717)
[14] (Plugin) Quality unchanged but weighted score is -0.1% due to: tokens (56631 → 156528), time (36.3s → 96.9s), tool calls (8 → 10)
[15] (Isolated) Quality unchanged but weighted score is -25.1% due to: judgment, tokens (70384 → 134422), quality, tool calls (6 → 13), time (33.0s → 59.9s)
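
The footnotes above list raw token counts but not the relative growth. The following is a minimal sketch, assuming footnote lines keep the `tokens (A → B)` shape shown above, that extracts the pair and computes the percentage change. The helper name `token_growth` is hypothetical and is not part of skill-validator.

```python
import re
from typing import Optional

# Matches the "tokens (13217 → 30098)" fragment as it appears in the footnotes.
TOKENS_RE = re.compile(r"tokens \((\d+) → (\d+)\)")


def token_growth(footnote: str) -> Optional[float]:
    """Return the percentage change in tokens for one footnote line.

    Returns None when the line has no "tokens (A → B)" fragment.
    Assumes the "before" count is non-zero, as in every footnote above.
    """
    match = TOKENS_RE.search(footnote)
    if match is None:
        return None
    before, after = int(match.group(1)), int(match.group(2))
    return (after - before) / before * 100.0


# Footnote [1], copied from the results above.
line = (
    "[1] (Plugin) Quality unchanged but weighted score is -7.8% due to: "
    "tokens (13217 → 30098), tool calls (0 → 1)"
)
print(f"{token_growth(line):.1f}%")
```

For footnote [1] this reports roughly a 128% increase in tokens, which helps put the -7.8% weighted score in context.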

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

To investigate failures, paste this to your AI coding agent:

For PR 601 in dotnet/skills, download eval artifacts with gh run download 27213893938 --repo dotnet/skills --pattern "skill-validator-results-*" --dir ./eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/e5dae86172ba4e5b8590cc4f685758457c7c7e6f/eng/skill-validator/src/docs/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

▶ Sessions Visualisation -- interactive replay of all evaluation sessions
📊 Session Analytics (preview) -- aggregated metrics across evaluation sessions

Development

Successfully merging this pull request may close these issues.

Support JSON output for skills validator

4 participants