test(uipath-troubleshoot): scrub diagnosis hints from legacy scenario manifests #1783

Open
Stefan-Virgil wants to merge 2 commits into main from test/scrub-legacy-manifest-hints

Stefan-Virgil commented Jul 1, 2026

What

Prevents and cleans up diagnosis-hint leaks in agent-visible scenario fixtures.

1. Scrub the two legacy manifests

Removes answer-leaking text from expected_calls[].description in the pre-existing null-reference-exception and argument-null-exception scenarios. The sandbox stages fixtures/, so a thorough agent could read these and skip the diagnosis.

  • null-reference-exception: "read the NullReferenceException message and the stack frame pointing at ERN.xaml/CopyFile"
  • argument-null-exception: "read the ArgumentNullException stack and InputArguments={}", "confirm the fault at CopyFile", "rule out caller-supplied In arguments (H3)"

Replaced with neutral procedural text (e.g. "Agent fetches the job's execution details."). Only description fields change — rules are untouched, so mock dispatch is identical and no re-validation is needed.
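
The claim that only description fields change can be checked mechanically. The following is a minimal sketch, assuming the manifest is a JSON object with an expected_calls list whose entries carry description, pattern, and min keys, plus a separate rules list. That layout, the sample values, and the helper names are illustrative assumptions, not taken from the repository.

```python
import copy


def strip_descriptions(manifest: dict) -> dict:
    """Return a deep copy of the manifest without expected_calls[].description."""
    stripped = copy.deepcopy(manifest)
    for call in stripped.get("expected_calls", []):
        call.pop("description", None)
    return stripped


def only_descriptions_changed(before: dict, after: dict) -> bool:
    """True when the two manifests are identical once descriptions are ignored."""
    return strip_descriptions(before) == strip_descriptions(after)


# Hypothetical manifests; the keys and values are assumptions for illustration.
before = {
    "expected_calls": [
        {"pattern": "jobs get", "min": 1,
         "description": "confirm the fault at CopyFile"},
    ],
    "rules": [{"match": "jobs get", "response": "job.json"}],
}
after = copy.deepcopy(before)
after["expected_calls"][0]["description"] = (
    "Agent fetches the job's execution details."
)

print(only_descriptions_changed(before, after))
```

Running the same comparison against the parsed pre-change and post-change manifest.json files would confirm that pattern, min, and rules are byte-for-byte equivalent after parsing, which is the property the "no re-validation" argument relies on.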

2. Guard against recurrence in the generator

generate_scenario.py now flags the rule in three places so it's caught at generation time, not review:

  • module docstring (a dedicated "NO DIAGNOSIS HINTS IN AGENT-VISIBLE FIXTURES" section)
  • the emitted manifest _doc (carried into every new scenario)
  • the post-apply next-steps output
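
The three reminders above are documentation rather than enforcement. As a possible complement, a small lint could scan the agent-visible text fields for the kinds of hints this PR removes. This is a sketch only: the patterns, the function name find_hint_leaks, and the assumed manifest keys (_doc, expected_calls[].description) are hypothetical and not part of generate_scenario.py.

```python
import re

# Heuristic patterns for text that tends to leak a diagnosis.
# These are illustrative assumptions, not rules defined by the repository.
HINT_PATTERNS = [
    re.compile(r"\b\w+Exception\b"),                  # exception type names
    re.compile(r"\b[\w-]+\.xaml\b", re.IGNORECASE),   # workflow file names
    re.compile(r"\bH\d+\b"),                          # hypothesis labels such as H3
]


def find_hint_leaks(manifest: dict) -> list[str]:
    """Return 'field: matched text' entries for every suspicious match."""
    fields = [("_doc", manifest.get("_doc", ""))]
    for index, call in enumerate(manifest.get("expected_calls", [])):
        fields.append(
            (f"expected_calls[{index}].description", call.get("description", ""))
        )

    leaks = []
    for name, text in fields:
        for pattern in HINT_PATTERNS:
            for match in pattern.finditer(text):
                leaks.append(f"{name}: {match.group(0)!r}")
    return leaks


leaky = {
    "_doc": "Scenario fixture manifest.",
    "expected_calls": [
        {"description": "read the NullReferenceException message and the "
                        "stack frame pointing at ERN.xaml/CopyFile"},
        {"description": "rule out caller-supplied In arguments (H3)"},
    ],
}
clean = {
    "_doc": "Scenario fixture manifest.",
    "expected_calls": [
        {"description": "Agent fetches the job's execution details."},
    ],
}

for leak in find_hint_leaks(leaky):
    print(leak)
```

A regex pass like this cannot catch every hint (a bare activity name such as CopyFile slips through), so it would supplement review rather than replace it.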

Scope / safety

🤖 Generated with Claude Code

… manifests

Neutralize expected_calls descriptions in the null-reference-exception and
argument-null-exception manifests so the agent-visible fixtures no longer name
the exception type, fault location (ERN.xaml/CopyFile), evidence specifics
(InputArguments={}), or hypothesis labels. Mock dispatch is unaffected (rules
unchanged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jul 1, 2026

Claude finished @Stefan-Virgil's task in 1m 37s.


PR Review: Scrub diagnosis hints from legacy scenario manifests

  • Read review criteria and project conventions
  • Read full PR diff
  • Read changed files in full context
  • Post review

Summary

Scrubs answer-leaking text from expected_calls[].description fields in two legacy manifest.json fixtures (null-reference-exception, argument-null-exception). Descriptions that named exception types, fault locations, hypothesis labels, and evidence specifics are replaced with neutral procedural text. This follows the same pattern applied to the Assign and If scenarios in #1780 and #1782.

Change-by-Change Review

1. tests/tasks/uipath-troubleshoot/runtime-exceptions/argument-null-exception/fixtures/mocks/responses/manifest.json

Severity: OK

Six description changes, all correct:

  • "read the ArgumentNullException stack and InputArguments={}" → "Agent fetches the job's execution details." (removes exception name and empty-args hint)
  • "confirm the fault at CopyFile" → "Agent reads error-level job logs for additional execution detail." (removes fault location)
  • "resolve the Shared folder key" → "Agent enumerates folders to locate the folder that owns the job under investigation." (removes folder name hint)
  • "rule out caller-supplied In arguments (H3)" → "agent may inspect the process/release configuration." (removes hypothesis label)
  • Traces description neutralized similarly

All pattern, min, and rules fields are untouched, so mock dispatch behavior is identical. The _doc field changed from the escaped Unicode form (\u2014) to the literal character; both are valid JSON representing the same em-dash character (cosmetic, likely from re-serialization).
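
The encoding point can be verified with Python's standard json module. The sketch below uses a made-up _doc string; only the json.loads and json.dumps behavior is the point. It also shows one plausible source of the diff: json.dumps escapes non-ASCII by default and writes it literally with ensure_ascii=False.

```python
import json

# The same JSON document written two ways. In the first string the text
# contains the six characters \u2014; in the second, Python has already
# turned the escape into the actual U+2014 character.
escaped_text = '{"_doc": "fixtures \\u2014 agent-visible"}'
literal_text = '{"_doc": "fixtures \u2014 agent-visible"}'

escaped_obj = json.loads(escaped_text)
literal_obj = json.loads(literal_text)

# Both parse to the same Python object.
print(escaped_obj == literal_obj)

# Re-serializing with different settings reproduces each form.
default_dump = json.dumps(literal_obj)                      # escapes non-ASCII
literal_dump = json.dumps(literal_obj, ensure_ascii=False)  # keeps it literal
print("\\u2014" in default_dump, "\u2014" in literal_dump)
```

Whether the manifests were actually rewritten this way is an assumption; the review only notes that the change is cosmetic.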

2. tests/tasks/uipath-troubleshoot/runtime-exceptions/null-reference-exception/fixtures/mocks/responses/manifest.json

Severity: OK

Four description changes, all correct:

  • "locate the Shared folder (folder-key 1965a46b-…)" → "locate the folder that owns the job under investigation." (removes folder key hint)
  • "read the NullReferenceException message and the stack frame pointing at ERN.xaml/CopyFile" → "Agent fetches the job's execution details." (removes exception type, XAML name, and fault method)
  • "corroborate the stack frame, but the job-get payload already contains the full Info string. min:0 means tracked-but-not-required." → "Agent reads error-level job logs for additional execution detail." (removes stack-frame hint and implementation commentary)

Same _doc em-dash encoding normalization as above. All pattern, min, and rules untouched.

What's Missing

Nothing identified. The PR is narrowly scoped to its stated goal and the changes are consistent with the prior scrub PRs (#1780, #1782).

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | N/A | No skill frontmatter changed |
| E2E Tests | OK | Fixture-only change; mock dispatch behavior identical |
| Skill Body | N/A | No skill body changed |
| References & Assets | OK | Descriptions are neutral and still convey coverage intent |
| Repo Hygiene | OK | Changes scoped to two files, no secrets, no cross-skill refs |

Issues for Manual Review

None found.

Conclusion

Clean PR. The replacement descriptions are neutral and procedural — they preserve the coverage-analysis purpose of expected_calls without leaking diagnosis answers to the agent. The rules and min fields are untouched so mock dispatch is identical. No issues found; this is ready to merge.

…manifests

generate_scenario.py now flags — in the module docstring, the emitted manifest
_doc, and the post-apply next-steps output — that manifest.json is agent-visible,
so _doc and author-added expected_calls[].description must stay procedural (no
exception type, fault location, or root-cause hints). The diagnosis belongs only
in the jobs get/logs payloads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>