code-testing-agent: mention find-untested-sources for C# discovery #734
Open
Evangelink wants to merge 1 commit into
Adds a conditional pointer (gated on 'when available') to the find-untested-sources skill in two places:

- SKILL.md Step 3 (Research Phase): high-level note for C# / .NET multi-file scopes, preferring the helper over manual find/grep/glob walks.
- code-testing-researcher.agent.md Section 7 (Discover Preexisting Tests): directive instruction telling the researcher to invoke the helper before manually pairing source <-> test files, and to use its source_to_tests / untested output to fill the research document.

Both callouts are phrased as 'when available', so installations without the find-untested-sources skill continue to work via manual discovery. Adds no behavior for non-C# repos.

Context: in a 5x136-instance internal experiment on the msbench .NET test bench, adding equivalent pointers to the routed code-testing-agent yielded a 15.67% input-token reduction at neutral pass rate: the model trusted the documented pairing heuristics and skipped its own discovery walk. The helper itself was not invoked in those runs (the Copilot CLI router did not auto-load the sibling skill), so the measured win comes from the doc text causing the model to short-circuit its manual exploration, not from the helper executing.

Depends on #733 (which adds the find-untested-sources skill itself).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Skill Coverage Report
Uncovered:
Contributor
Pull request overview
Updates code-testing-agent documentation to recommend the find-untested-sources helper (when installed) for C#/.NET multi-file scopes, so the researcher can build source↔test pairing more efficiently without manual repo-wide discovery walks.
Changes:
- Add a C#/.NET-specific note in `SKILL.md` Step 3 (Research Phase) pointing researchers to `find-untested-sources` when available.
- Add a directive in `code-testing-researcher.agent.md` Step 7 (Discover Preexisting Tests) to invoke `find-untested-sources` first and use its `untested`/`source_to_tests` output to populate the research doc.
Show a summary per file
| File | Description |
|---|---|
| plugins/dotnet-test/skills/code-testing-agent/SKILL.md | Adds a gated C#/.NET pointer to prefer find-untested-sources for pairing over manual find/grep/glob. |
| plugins/dotnet-test/agents/code-testing-researcher.agent.md | Instructs the researcher to invoke find-untested-sources (when available) before manual source↔test pairing and to use its JSON outputs in research. |
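The "when available" gating described in the table can be pictured with a minimal sketch. Everything here is an assumption for illustration: the function name `pair_sources_with_tests` is hypothetical, the helper output is assumed to be an already-parsed JSON object whose `source_to_tests` key maps a source path to a list of test paths and whose `untested` key is a list of source paths, and the manual fallback uses a simple `FooTests.cs` / `FooTest.cs` naming convention that may not match the heuristics the skill actually documents.

```python
from pathlib import PurePosixPath
from typing import Optional


def pair_sources_with_tests(sources, tests, helper_output: Optional[dict] = None):
    """Return (source_to_tests, untested) for a set of C# files.

    helper_output is the parsed output of the helper if it was available;
    its shape is an assumption made for this sketch.
    """
    if helper_output is not None:
        # Helper available: use its pairing instead of walking the repo.
        mapping = {s: list(t) for s, t in helper_output.get("source_to_tests", {}).items()}
        untested = list(helper_output.get("untested", []))
        return mapping, untested

    # Helper not installed: fall back to manual, name-based discovery.
    mapping, untested = {}, []
    for src in sources:
        stem = PurePosixPath(src).stem  # "src/Foo.cs" -> "Foo"
        wanted = {f"{stem}Tests.cs", f"{stem}Test.cs"}
        matches = [t for t in tests if PurePosixPath(t).name in wanted]
        if matches:
            mapping[src] = matches
        else:
            untested.append(src)
    return mapping, untested
```

Either branch yields the same two structures, so the research document can be filled the same way whether or not the helper is present.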
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 0
Summary
Adds a conditional pointer to the `find-untested-sources` skill in two places inside `code-testing-agent`:

- `SKILL.md` Step 3 (Research Phase): high-level note for C# / .NET multi-file scopes, preferring the helper over manual `find`/`grep`/`glob` walks.
- `code-testing-researcher.agent.md` Section 7 (Discover Preexisting Tests): directive instruction telling the researcher to invoke the helper before manually pairing source ↔ test files, and to use its `source_to_tests` / `untested` output to fill the research document.

Both callouts are gated on "when available in the workspace": installations without `find-untested-sources` continue working via manual discovery. Adds no behavior change for non-C# repos.

Diff
+4 lines, -0 lines across 2 files. No structural changes, no removed content.
Measurement — honest restatement
I originally claimed "−15.67 % input tokens at neutral pass rate" from a 5×136-instance internal msbench experiment. After re-doing the analysis correctly (per-task mean rather than volume-weighted aggregate, with a non-.NET control bucket), the picture is more modest:
Differential (.NET minus non-.NET): −8.60 pp, Welch's t ≈ −2.02 (right at p ≈ 0.05). Per-task standard deviation is ~21 % in both buckets, so 5 runs per task is not enough to pin down per-task effects precisely. The across-task pattern is nonetheless clean: non-.NET tasks are unchanged (the no-effect result my hypothesis predicts for the control bucket), while .NET tasks shift modestly negative.
Pass rate was neutral (within noise) on both buckets.
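For readers who want to reproduce this kind of comparison, here is a minimal sketch of Welch's t statistic using only the standard library. The function name `welch_t` and the sample numbers are placeholders for illustration, not the experiment's data; each input list is assumed to hold one per-task percentage change in input tokens.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    a and b are two independent samples (e.g. per-task % change for the
    .NET bucket and the non-.NET control bucket); variances may differ.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder values only, NOT the measured per-task results.
dotnet_pct = [-30.0, -12.0, 4.0, -15.0, -2.0, 9.0]
control_pct = [3.0, -5.0, 1.0, 6.0, -4.0, 0.0]
t_stat, dof = welch_t(dotnet_pct, control_pct)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Welch's variant is used here because it does not assume the two buckets share a variance or a sample size.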
Important caveats
- The −15 % total was dominated by a handful of very-high-token .NET tasks (e.g. `ocelot-core-gen-detailed` went 19M → 11.6M); the per-task picture is the right unit of analysis.
- The measured effect comes from the added text in the `code-testing-agent` `SKILL.md`, not from the helper executing. The runtime value of the helper is still unmeasured.

The honest claim is therefore: a modest, marginally significant reduction on .NET tasks, isolated to .NET (control bucket flat), at the cost of higher variance.
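The gap between a volume-weighted aggregate and a per-task mean can be illustrated with a short sketch. The 19M → 11.6M pair is the task quoted above; the two 1M tasks are hypothetical fillers added only to make the arithmetic visible, and both function names are made up for this example.

```python
from statistics import mean


def volume_weighted_change(before, after):
    """Percent change of total tokens; large tasks dominate the result."""
    return (sum(after) - sum(before)) / sum(before) * 100


def per_task_mean_change(before, after):
    """Mean of each task's own percent change; every task counts equally."""
    return mean((a - b) / b * 100 for b, a in zip(before, after))


# First pair is the task quoted in the caveat; the rest are hypothetical.
before = [19_000_000, 1_000_000, 1_000_000]
after = [11_600_000, 1_000_000, 1_000_000]

print(f"volume-weighted: {volume_weighted_change(before, after):.1f} %")
print(f"per-task mean:   {per_task_mean_change(before, after):.1f} %")
```

With one very large task and two unchanged small ones, the volume-weighted figure tracks the large task almost exactly, while the per-task mean is diluted by the tasks that did not move.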
Dependency
Depends on #733 (which adds the `find-untested-sources` skill itself). Safe to merge before #733 lands: the "when available" gating means the pointer is a no-op until the helper is installed.

Test plan
Doc-only change. No code paths affected. Verified diff is the two intended additions only.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>