docs(uipath-agents): document eval set schema key and filename convention (UV-14439) #1801

Open
mjnovice wants to merge 1 commit into main from docs/uv-14439-eval-set-schema-notes

mjnovice commented Jul 1, 2026

Contributor

Why

UV-14439 — session retrospective reported doc gaps in low-code eval analysis. Most items in the ticket are already addressed by the lowcode/evaluations/ reference tree that landed since the ticket was filed:

  • uip agent eval run start/status/results/list/compare with --set (required) and --verbose → running-evaluations.md
  • SKILL.md task-navigation rows for low-code evals + Orchestrator offline evals
  • Cloud-only run results → evaluate.md Prerequisites
  • Pre-flight tenant check → authentication.md

What remains (this PR)

Two schema notes in evaluation-sets.md the retrospective flagged as wasted-introspection causes:

  • Test-case array is evaluations[], not testCases[]
  • Studio Web filename convention is evaluation-set-<unix-ms>.json; match sets by name/id, not by filename pattern
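
The second note can be illustrated with a minimal sketch of a lookup that ignores the filename and matches on the file's contents instead. The directory layout, the helper name `find_eval_set`, and the assumption that `name` and `id` are top-level string fields are all hypothetical; only the `evaluations` key and the `evaluation-set-<unix-ms>.json` pattern come from this PR.

```python
import json
from pathlib import Path
from typing import Optional


def find_eval_set(directory: Path, name_or_id: str) -> Optional[dict]:
    """Return the first eval set whose name or id matches, ignoring filenames.

    Assumes (not confirmed by the PR) that "name" and "id" are top-level
    string fields of each eval set file.
    """
    for path in sorted(directory.glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or non-JSON files
        if not isinstance(data, dict):
            continue  # skip JSON files that are not objects
        if name_or_id in (data.get("name"), data.get("id")):
            return data
    return None
```

Because the lookup never parses the filename, it would keep working if the Studio Web naming pattern changed.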

🤖 Generated with Claude Code

…tion

UV-14439 session retrospective: agent wasted introspection calls guessing
testCases[] instead of evaluations[], and inferred meaning from the Studio
Web unix-ms filename pattern. Add schema notes to evaluation-sets.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jul 1, 2026

Contributor

Claude finished @mjnovice's task in 54s


PR Review: docs(uipath-agents) — eval set schema notes

  • Read review criteria and project conventions
  • Read full diff and changed file in context
  • Check links, cross-skill references, secrets
  • Post review

Summary

Adds two schema notes to evaluation-sets.md clarifying that the test-case array key is evaluations[] (not testCases[]) and that Studio Web uses timestamp-based filenames for eval sets. Both address documented wasted-introspection causes from a session retrospective (UV-14439).

Change-by-Change Review

1. skills/uipath-agents/references/lowcode/evaluations/evaluation-sets.md (+5/−0)

Severity: OK

Two bullet points inserted at lines 146–149, between the JSON example block and the existing source field paragraph. Placement is correct: the notes annotate the JSON schema shown directly above.

Content accuracy:

  • The evaluations[] key matches the JSON example at line 120. Good that it calls out the likely false guess (testCases[]).
  • The filename convention note (evaluation-set-<unix-ms>.json) is actionable — tells the agent to match by name/id rather than filename pattern.
  • The inline CLI command (uip agent eval list --set ... --output json) is consistent with the same command documented at line 66 and follows the --output json convention.
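
As a rough illustration of the first point, a consumer of an eval set file could read the test cases through a small accessor that fails loudly on the wrong key. This is a sketch, not part of the PR: the helper name `get_test_cases` and the error messages are hypothetical, and the check for `testCases` only guards against the false guess described above.

```python
def get_test_cases(eval_set: dict) -> list:
    """Return the test-case array from a parsed eval set.

    The array key is "evaluations". A "testCases" key is treated as a
    mistake and reported explicitly (hypothetical guard, for illustration).
    """
    if "evaluations" in eval_set:
        return eval_set["evaluations"]
    if "testCases" in eval_set:
        raise KeyError("found 'testCases'; eval sets use 'evaluations'")
    raise KeyError("eval set has no 'evaluations' array")
```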

Tone and compression comply with token-optimization rules — prescriptive, no filler, fragments where appropriate.

No cross-skill references. No secrets or personal paths. No broken links.

What's Missing

Nothing identified.

Area Ratings

| Area | Status | Notes |
|---|---|---|
| Frontmatter | OK | Not touched; no changes needed |
| E2E Tests | OK | Doc-only change to existing skill; no new capabilities requiring test coverage |
| Skill Body | OK | Not touched |
| References & Assets | OK | Two well-placed schema notes, consistent with surrounding content |
| Repo Hygiene | OK | CODEOWNERS already covers skills/uipath-agents/; scoped change, no secrets |

Issues for Manual Review

  • The evaluation-set-<unix-ms>.json filename convention is domain knowledge from Studio Web behavior — reviewer should confirm this is the current/stable naming pattern and not subject to change.

Conclusion

Clean, well-scoped documentation addition. No issues found. Approve.

