test(troubleshoot): normalize task tags to sanctioned dimensions #1838

Open
MarinRzv wants to merge 1 commit into main from chore/troubleshoot-tag-cleanup

@MarinRzv MarinRzv commented Jul 3, 2026

Collaborator

Strip package/leaf tags that duplicate the directory path (word-activities, export-pdf, job-killed, exit-code-*, get-asset, ...) and non-vocab markers (faithful-replay, faulted-jobs, pending-job, healing-agent, licensing) from all troubleshoot scenarios. Keep only skill + tier + mode + lifecycle + product/domain tags from the closed vocabulary.

Reconcile the orchestrator tag: keep it only where the failure is genuinely Orchestrator-scoped (products/orchestrator, cross-system); remove it from activity-package/runtime scenarios that merely investigate via 'uip or'. Broaden the orchestrator tag definition in CLAUDE.md to match actual usage.
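
The stripping rule above can be sketched as a filter against a closed vocabulary. This is a minimal illustration, not the script used for this PR: `SANCTIONED` is an assumption assembled only from tags named in this thread (the authoritative list lives in the repository's tag taxonomy), and `normalize_tags` is a hypothetical helper.

```python
# Hypothetical closed vocabulary, assembled only from tags named in this thread.
# The real list is defined by the repository's tag taxonomy and may be larger.
SANCTIONED = {
    "uipath-troubleshoot",  # skill
    "e2e",                  # tier
    "mode:diagnose",        # mode
    "lifecycle:discover",   # lifecycle
    "rpa", "orchestrator", "integration-service", "maestro", "bpmn",  # product/domain
}


def normalize_tags(tags):
    """Split tags into (kept, removed), preserving the original order."""
    kept = [t for t in tags if t in SANCTIONED]
    removed = [t for t in tags if t not in SANCTIONED]
    return kept, removed


# Illustrative tag list; the leaf tags are ones this PR describes as stripped.
kept, removed = normalize_tags(
    ["uipath-troubleshoot", "rpa", "e2e", "word-activities", "export-pdf", "mode:diagnose"]
)
print(kept)     # sanctioned tags, in their original order
print(removed)  # package/leaf tags that would be stripped
```

Filtering by membership rather than by a deny-list means any tag not yet in the vocabulary is dropped by default, which matches the "closed vocabulary" wording.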


github-actions Bot commented Jul 3, 2026

Contributor

Claude finished @MarinRzv's task in 3m 58s


Coder-eval task lint (advisory)

  • Read lint rubric
  • Identify changed task YAMLs
  • Check PR body for evidence of passing run
  • Evaluate changed files against rubric
  • Post results

182 task YAMLs modified (0 added); all changes are tags-only (single line per file). Verdict counts: 0 Critical, 0 High, 0 Medium, 0 Low, 182 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

High — PR body does not claim the changed tasks have been run and passed. Since this PR only modifies the tags: line (no changes to prompts, criteria, sandbox, or structure), the risk is low — tag-only edits cannot break a passing run. Still, please consider editing the PR description to add a line like:

"Tag-only changes; tasks pass unchanged (no behavioral delta)."

Per-task lint

All 182 changed files modify only the tags: line — no changes to initial_prompt, success_criteria, sandbox, run_limits, or any other structural field. None of the seven rubric axes (Self-report anti-pattern, Prompt over-specification, Meaningful coverage, Could pass for the wrong reason, Near-duplicate, Validate-only flow tests, CLI verb reachability) are affected by tag edits. All redundant sandbox and run-limit checks also pass.

Verdict: OK for all 182 files.

Spot-checked groups

| Group | Files | Tags removed | Tags kept | Verdict |
| --- | --- | --- | --- | --- |
| activity-packages/classic-activities/ | 2 | faithful-replay, classic-activities, orchestrator | uipath-troubleshoot, rpa, e2e, mode:diagnose | OK |
| activity-packages/database-activities/ | 18 | faulted-jobs, orchestrator, database-activities, connect-to-database, execute-non-query, execute-query, excel | kept lifecycle:discover where present | OK |
| activity-packages/excel-activities/ | 44 | package/leaf tags (excel-activities, append-range, delete-range, excel-application-card, etc.) | core vocab only | OK |
| activity-packages/mail-activities/ | 12 | faithful-replay, mail-activities, leaf tags | core vocab only | OK |
| activity-packages/o365-activities/ | 13 | faithful-replay, o365-activities, leaf tags | core vocab only | OK |
| activity-packages/gsuite-activities/ | 7 | faithful-replay, gsuite-activities, leaf tags | core vocab only | OK |
| activity-packages/system-activities/ | 9 | orchestrator, get-asset, leaf tags | core vocab only | OK |
| activity-packages/ui-automation/ | 12 | healing-agent, licensing, leaf tags | core vocab only | OK |
| activity-packages/web-activities/ | 6 | faithful-replay, web-activities, leaf tags | core vocab only | OK |
| activity-packages/word-activities/ | 30 | word-activities, export-pdf, leaf tags | core vocab only | OK |
| activity-packages/python-activities/ | 5 | faulted-jobs, orchestrator, python-activities, python-scope | kept lifecycle:discover | OK |
| activity-packages/cv-activities/ | 1 | faithful-replay | core vocab only | OK |
| products/orchestrator/ | 9 | faulted-jobs, faithful-replay, pending-job, no-host, robot-credentials, logon-failure, credentials, foreground-already-running, concurrency, misconfigured-foreground, job-killed, exit-code-0x40010004 | orchestrator kept ✅; rpa kept on foreground tasks ✅ | OK |
| products/integration-service/ | 6 | faithful-replay, orchestrator (from non-Orch tasks) | integration-service, rpa kept | OK |
| products/maestro/ | 1 | faithful-replay | maestro, bpmn kept | OK |
| runtime-exceptions/ | 2 | faithful-replay, runtime-exception | core vocab only | OK |
| cross-system/ | 2 | faulted-jobs, faithful-replay, is | orchestrator + integration-service + rpa kept on rpa-preflight-failure | OK |

Within-PR duplicates

Not applicable — no new tasks added (all 182 are modifications).

Note: tag consistency with updated CLAUDE.md

The updated CLAUDE.md (line 208) now states:

Add rpa too when an RPA process's execution is directly involved (e.g. foreground-slot, job-killed).

The rpa-foreground-* tasks correctly carry rpa, but the three rpa-job-killed-* tasks do not:

| Task | Current tags | Expected per CLAUDE.md |
| --- | --- | --- |
| products/orchestrator/rpa-job-killed-by-autocancel-trigger | orchestrator, e2e, mode:diagnose | + rpa |
| products/orchestrator/rpa-job-killed-by-operator-ui | orchestrator, e2e, mode:diagnose | + rpa |
| products/orchestrator/rpa-job-killed-by-watchdog-account | orchestrator, e2e, mode:diagnose | + rpa |

These tasks' names start with rpa- and they describe the killing of RPA jobs, which is the scenario that CLAUDE.md explicitly cites. Consider adding rpa for consistency.
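
A check along these lines could surface the same gap automatically. It is a sketch under stated assumptions: the `rpa-` name prefix is used here as a stand-in for "an RPA process's execution is directly involved", which is a heuristic and not the CLAUDE.md rule itself; `missing_rpa` is a hypothetical helper; and the tag list shown for the foreground task is illustrative.

```python
def missing_rpa(tasks):
    """Return task paths whose leaf name starts with 'rpa-' but whose tags lack 'rpa'.

    The name-prefix test is an assumed heuristic for RPA-execution involvement.
    """
    flagged = []
    for path, tags in tasks.items():
        leaf = path.rsplit("/", 1)[-1]  # last path component, e.g. "rpa-job-killed-..."
        if leaf.startswith("rpa-") and "rpa" not in tags:
            flagged.append(path)
    return flagged


tasks = {
    # Tag lists for the three job-killed tasks as reported in the table above.
    "products/orchestrator/rpa-job-killed-by-autocancel-trigger": ["orchestrator", "e2e", "mode:diagnose"],
    "products/orchestrator/rpa-job-killed-by-operator-ui": ["orchestrator", "e2e", "mode:diagnose"],
    "products/orchestrator/rpa-job-killed-by-watchdog-account": ["orchestrator", "e2e", "mode:diagnose"],
    # Illustrative tag list for a task that already carries rpa.
    "products/orchestrator/rpa-foreground-already-running": ["orchestrator", "rpa", "e2e", "mode:diagnose"],
}

for path in missing_rpa(tasks):
    print(path)
```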

Conclusion

✅ All 182 changed tasks pass the rubric (tag-only edits, no test-design impact). One consistency note above (3 rpa-job-killed-* tasks missing rpa per the updated CLAUDE.md guidance). Evidence of passing run not claimed, but given tag-only scope, risk is minimal.


github-actions Bot commented Jul 3, 2026

Contributor

Claude finished @MarinRzv's task in 3m 8s


PR Review: Normalize troubleshoot task tags

  • Read review criteria and project conventions
  • Analyze full diff (182 YAML files + 1 CLAUDE.md)
  • Validate tag changes against sanctioned taxonomy
  • Check CLAUDE.md orchestrator tag definition change
  • Verify consistency across all modified files
  • Post review findings

Summary

Strips non-vocabulary tags (faithful-replay, faulted-jobs, healing-agent, licensing, runtime-exception, is, and package-leaf markers like database-activities, excel-activities, export-pdf, get-asset, exit-code-*, etc.) from all 182 troubleshoot task YAMLs, leaving only the sanctioned dimensions: skill + tier + mode:* + lifecycle:* + product/domain tags. Broadens the orchestrator tag definition in CLAUDE.md to match its actual scope.

Change-by-Change Review

1. tests/tasks/uipath-troubleshoot/CLAUDE.md (orchestrator tag definition)

Severity: OK

The old definition ("Orchestrator-only failures with no workflow execution involved") was too narrow — it excluded job-killed/foreground-slot/logon scenarios that are clearly Orchestrator control-plane issues. The new definition is specific and includes guidance on when to co-tag with rpa. Well written.

2. Activity-package task YAMLs (119 files → [uipath-troubleshoot, rpa, e2e, mode:diagnose])

Severity: OK

All non-vocab tags are stripped correctly. orchestrator is removed from activity-package scenarios where the failure is in the activity itself (database, excel, word, system, etc.), which is correct per the updated definition and the "Tag ↔ group agreement" rule (activity-packages/* → rpa, not orchestrator). The lifecycle:discover tag is preserved in all 43 files that already had it.
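
The "Tag ↔ group agreement" rule for activity packages can be expressed as a small check. This sketch covers only the one mapping quoted above (activity-packages/* carries rpa and not orchestrator); `check_group_agreement` is a hypothetical helper, the example path is made up for illustration, and any rules for other groups are left out because this thread does not spell them out.

```python
def check_group_agreement(path, tags):
    """Return a list of problems for a task path and its tags.

    Only the activity-packages rule quoted in this review is encoded;
    paths in other groups are not checked by this sketch.
    """
    problems = []
    if path.startswith("activity-packages/"):
        if "rpa" not in tags:
            problems.append("missing rpa")
        if "orchestrator" in tags:
            problems.append("unexpected orchestrator")
    return problems


# Hypothetical task path, used only to exercise the check.
example_path = "activity-packages/excel-activities/example-task"
print(check_group_agreement(example_path, ["uipath-troubleshoot", "rpa", "e2e", "mode:diagnose"]))
print(check_group_agreement(example_path, ["uipath-troubleshoot", "orchestrator", "e2e", "mode:diagnose"]))
```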

3. products/orchestrator/* task YAMLs (9 files)

Severity: OK

orchestrator correctly retained. rpa correctly added to rpa-foreground-already-running and rpa-foreground-misconfigured per the new definition ("Add rpa too when an RPA process's execution is directly involved"). Non-vocab tags (faulted-jobs, job-killed, exit-code-0x40010004, pending-job, no-host, robot-credentials, foreground-already-running, logon-failure, credentials, concurrency, misconfigured-foreground) all correctly stripped.

4. products/integration-service/* task YAMLs (6 files)

Severity: OK

orchestrator correctly removed from connector-general-no-access (an IS permissions issue, not an Orchestrator control-plane failure).

5. cross-system/* task YAMLs (2 files)

Severity: OK

rpa-preflight-failure correctly retains orchestrator + integration-service + rpa (root cause spans multiple systems). faulted_excel_o365 correctly retains integration-service + rpa.

6. Tag ordering inconsistency — 2 files

Severity: Low

Two activity-package files place integration-service (a domain tag) after the tier/mode tags instead of alongside rpa:

  • activity-packages/gsuite-activities/gsuite-connection-invalid/task.yaml:15 → [uipath-troubleshoot, rpa, e2e, integration-service, mode:diagnose]
  • activity-packages/o365-activities/o365-trigger-connection-503/task.yaml:13 → [uipath-troubleshoot, rpa, e2e, mode:diagnose, integration-service]

The other 8 files with integration-service consistently use [uipath-troubleshoot, integration-service, rpa, ...] or [uipath-troubleshoot, rpa, ..., integration-service, ...] with the domain tag before e2e. The convention from tests/README.md examples places domain tags before tier.
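
The ordering convention (domain tags before the tier tag) can be checked mechanically. In this sketch, `TIER_TAGS` and `DOMAIN_TAGS` are assumptions limited to tags named in this thread, and `domain_tags_after_tier` is a hypothetical helper; the two flagged tag lists are copied from the bullets above.

```python
# Assumed tag classes, limited to tags named in this thread.
TIER_TAGS = {"e2e"}
DOMAIN_TAGS = {"rpa", "orchestrator", "integration-service", "maestro", "bpmn"}


def domain_tags_after_tier(tags):
    """Return the domain tags that appear after the first tier tag."""
    tier_positions = [i for i, t in enumerate(tags) if t in TIER_TAGS]
    if not tier_positions:
        return []  # no tier tag, so there is nothing to be out of order against
    first_tier = tier_positions[0]
    return [t for t in tags[first_tier + 1:] if t in DOMAIN_TAGS]


gsuite = ["uipath-troubleshoot", "rpa", "e2e", "integration-service", "mode:diagnose"]
o365 = ["uipath-troubleshoot", "rpa", "e2e", "mode:diagnose", "integration-service"]
ordered = ["uipath-troubleshoot", "integration-service", "rpa", "e2e", "mode:diagnose"]

for name, tags in [("gsuite", gsuite), ("o365", o365), ("ordered", ordered)]:
    print(name, domain_tags_after_tier(tags))
```

Reporting the offending tags instead of a boolean makes the fix obvious: move each reported tag in front of the tier tag.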

What's Missing

  • lifecycle:* tag on 119 tasks: The tag taxonomy (tests/README.md) marks lifecycle:X as required, but 119 troubleshoot tasks lack it. This is pre-existing (all 43 that had lifecycle:discover before still have it), so not introduced by this PR. A follow-up to add lifecycle:discover to the remaining tasks would bring full compliance. Not blocking.
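
For the follow-up, a presence check for the required lifecycle dimension is short. `has_lifecycle` is a hypothetical helper, and it assumes that any tag with the `lifecycle:` prefix satisfies the requirement.

```python
def has_lifecycle(tags):
    """Return True if any tag uses the 'lifecycle:' prefix."""
    return any(t.startswith("lifecycle:") for t in tags)


without = ["uipath-troubleshoot", "rpa", "e2e", "mode:diagnose"]
with_tag = ["uipath-troubleshoot", "rpa", "e2e", "mode:diagnose", "lifecycle:discover"]

print(has_lifecycle(without))   # False
print(has_lifecycle(with_tag))  # True
```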

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | All changes are to tags: lines only; no frontmatter breakage |
| E2E Tests | OK | No test structure changes; only tag normalization |
| Skill Body | N/A | No skill SKILL.md changes |
| References & Assets | OK | CLAUDE.md tag definition update is correct and well-scoped |
| Repo Hygiene | OK | Changes scoped to troubleshoot test suite only; no secrets, no cross-skill impact |

Issues for Manual Review

  • The orchestrator tag removal from activity-package tasks assumes those scenarios don't genuinely involve Orchestrator control-plane failures. The PR description and updated CLAUDE.md definition make a clear case, but domain experts may want to spot-check a few edge cases (e.g., system-activities/getasset-* tasks that fail on Orchestrator asset/folder resolution — are those activity-package or Orchestrator-scoped?).

Conclusion

Clean tag normalization PR. The non-vocab tag removal is consistent and correct across all 182 files. The CLAUDE.md orchestrator definition update is well-reasoned and matches the actual tag application. One minor ordering inconsistency in 2 files (Low). Recommend approving with the optional tag-ordering fix.
