fix(uipath-admin): realistic audit prompts + whole-day export / LTS-schema CSV guidance by yadvender-uipath · Pull Request #1831 · UiPath/skills

yadvender-uipath · 2026-07-02T20:37:09Z

What & why

Reworks the uipath-admin audit skill docs and their coder-eval tasks so the tests measure what the skill teaches, not a hand-scripted prompt. The audit task prompts were heavily scripted ("Important: do NOT retry, do NOT login, run each command exactly once…"), which masked whether the skill itself guides the agent correctly. This replaces them with minimal, customer-style asks and tightens the supporting guidance — especially around export semantics and the LTS-schema CSV.

Changes

Skill docs (SKILL.md, references/audit-commands.md, references/audit-workflow-guide.md)

Whole-day export bounds — audit <scope> export bounds are whole UTC days, inclusive on both ends, distinct from the events --to-date "next-day" trick (added as an explicit anti-pattern rule). A next-day --to-date on export over-exports a full day.
LTS-lag caveat — the long-term store trails live events by ~24–48h; windows including today/yesterday can return empty trailing days.
LTS-schema CSV — document the CSV column ordering and disambiguate org/tenant audit export from uip or audit-logs (uipath-platform).
Non-interactive login — never launch the interactive uip login browser flow in an automated session; env-var sessions are already authenticated.
Relative dates — resolve "yesterday" / "last 7 days" against the real UTC clock and echo the resolved window; don't re-confirm mid-sequence when the user already asked for the follow-on steps.
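
To make the difference between the two date rules concrete, here is a minimal Python sketch of how a relative phrase could be resolved against the UTC clock and then turned into bounds for each command. The helper names (`resolve_window`, `export_bounds`, `events_bounds`) are hypothetical and not part of the skill or the CLI, and treating "last 7 days" as the seven complete UTC days ending yesterday is an assumption; only the inclusive-export versus next-day-events distinction comes from the PR description.

```python
from datetime import date, datetime, timedelta, timezone


def resolve_window(phrase: str, now: datetime) -> tuple[date, date]:
    """Resolve a relative phrase to (first_day, last_day) in whole UTC days.

    `now` must be timezone-aware; it is converted to UTC before the calendar
    day is taken, so a local evening can already be "tomorrow" in UTC.
    """
    today = now.astimezone(timezone.utc).date()
    yesterday = today - timedelta(days=1)
    if phrase == "yesterday":
        return yesterday, yesterday
    if phrase == "last 7 days":
        # Assumption: seven complete days ending yesterday (today excluded).
        return yesterday - timedelta(days=6), yesterday
    raise ValueError(f"unsupported phrase: {phrase!r}")


def export_bounds(first: date, last: date) -> tuple[str, str]:
    """Export: both ends are inclusive whole days, so pass the last day itself."""
    return first.isoformat(), last.isoformat()


def events_bounds(first: date, last: date) -> tuple[str, str]:
    """Events: the 'next-day' trick, where --to-date is the day after the last day."""
    return first.isoformat(), (last + timedelta(days=1)).isoformat()
```

For a January window, `export_bounds(date(2026, 1, 1), date(2026, 1, 31))` gives a `--to-date` of `2026-01-31`, while `events_bounds` on the same days gives `2026-02-01`. Passing the latter to export is the over-export described above.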

Tasks — 16 audit_*.yaml prompts trimmed to minimal customer-style asks; criteria and date windows aligned. (Net −220/+80.)

Note: the stale UiPath/cli#2585 blocker referenced in the export tasks has shipped — @uipath/cli 1.197.0 creates the uniquely-named audit_<from>_<to>_<generatedAt> output as expected.

Test evidence

All 16 uipath-admin audit tasks pass locally via coder-eval 0.8.0, experiments/default.yaml (tempdir), against a live tenant:

| Tier | Tasks | Result |
| --- | --- | --- |
| smoke | 10 | ✅ 10/10 (score 1.0) |
| e2e real-artifact | `audit_export_verify_e2e`, `audit_export_csv_verify_e2e`, `audit_org_export_verify_e2e` | ✅ 3/3 (produce + verify real export files; CSV header carries LTS-schema columns) |
| e2e command-pattern | `audit_export_e2e`, `audit_login_history_e2e`, `audit_who_did_x_e2e` | ✅ 3/3 |

Notes for reviewers reproducing locally on an authenticated dev box (Windows):

A few runs hit the claude-agent-sdk initialize handshake 60s timeout during sandbox spin-up (Defender scanning the fresh venv). Set CLAUDE_CODE_STREAM_CLOSE_TIMEOUT=300000 to avoid it. Purely a local-env timing issue.
audit_export_e2e is designed for CI's unauthenticated env (commands fail fast → command-surface graded). On an authenticated box the real 7-day LTS export exceeds its turn_timeout: 600; run it with -D run_limits.turn_timeout=1800 -D run_limits.task_timeout=3600 to let the live export complete. No change to the committed task is needed — it passes as-is in CI.

🤖 Generated with Claude Code

…chema CSV guidance

Rework the uipath-admin audit skill docs and their coder-eval tasks so the tests measure the skill's guidance rather than a hand-scripted prompt.

Skill docs (SKILL.md, references/audit-commands.md, audit-workflow-guide.md):

- export bounds are whole UTC days, inclusive on both ends, distinct from the events `--to-date` next-day trick (new anti-pattern rule); a next-day `--to-date` on export over-exports a full day.
- add the LTS-lag caveat (long-term store trails live events by ~24-48h; recent trailing days may come back empty).
- document LTS-schema CSV column ordering and distinguish org/tenant audit export from `uip or audit-logs`.
- non-interactive `uip login` handling (never launch the browser flow in an automated session; env-var sessions are already authenticated).
- resolve relative windows ("yesterday", "last 7 days") against the real UTC clock and echo the resolved bounds; don't re-confirm mid-sequence when the user already asked for the follow-on steps.

Tasks (16 audit_*.yaml): replace scripted "Important: do NOT ..." prompts with minimal customer-style asks; align criteria and date windows accordingly.

Test: all 16 uipath-admin audit tasks pass locally via coder-eval 0.8.0 (experiments/default.yaml, tempdir) against a live tenant: 10 smoke + 3 real-artifact e2e + 3 command-pattern e2e, each score 1.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-07-02T20:37:25Z

Claude finished @yadvender-uipath's task in 2m 45s

PR Review: fix(uipath-admin): realistic audit prompts + whole-day export / LTS-schema CSV guidance

Read review criteria and project conventions
Analyze full diff
Read changed skill files in full
Read changed test files
Check for cross-skill references, broken links, secrets
Post review findings

Summary

Replaces 16 heavily-scripted audit test prompts with minimal, customer-style asks so the tests measure what the skill teaches rather than what the prompt hand-holds. Simultaneously tightens the skill docs around whole-day export bounds vs. the events next-day trick, LTS lag caveats, LTS-schema CSV columns, non-interactive login safety, and relative date resolution.

Change-by-Change Review

1. `skills/uipath-admin/SKILL.md` — Critical Rules 2, 27, 30, Anti-pattern 14

Severity: OK

Rule 2: non-interactive login guidance is well-placed. Correctly tells the agent to stop and ask rather than launching uip login itself.
Rule 27: extended with the events-only next-day caveat and relative-date resolution (date -u). Clean addition without bloating the rule.
Rule 30: rewritten to clearly state whole-day inclusive semantics and distinguish from events. The confirmation-path logic ("if the user named a folder, pass it verbatim; confirm only when you picked a default") is a good UX refinement.
Anti-pattern 14: correctly cross-references Rule 30 and prevents the most likely mistake (carrying the events trick to export).

2. `skills/uipath-admin/references/audit-commands.md` — export section

Severity: OK

Examples updated from --to-date 2026-02-01 to --to-date 2026-01-31 consistently across all three code blocks and both output examples.
New "single day" example with date -u -d 'yesterday' is useful and includes the macOS/BSD variant.
LTS-schema disambiguation callout (blockquote at line 143) correctly delineates this surface from uip or audit-logs list --export — this is routing guidance, not a cross-skill dependency.
Flags table now explicitly documents whole-day semantics for --from-date and --to-date.
LTS lag caveat at line 217 is well-placed as the last implementation note.

3. `skills/uipath-admin/references/audit-workflow-guide.md` — export workflow + gotchas

Severity: OK

Relative-date resolution paragraph (lines 183–184) added before Step 2 — good placement.
Confirmation-path logic for --output-path (line 185) matches the SKILL.md Rule 30 wording.
Export examples updated consistently (2026-01-31).
Folder layout updated (line 227).
Three new gotcha bullets: export whole-day semantics, LTS lag, and relative date resolution — all accurate and consistent with the reference doc.
Output Etiquette item 6 (line 324): the "finish the full requested sequence" clause is a good refinement — read-only queries shouldn't pause mid-chain to re-confirm.

4. Test task YAMLs — 16 audit prompts simplified

Severity: OK

All 16 test files follow the same pattern: the initial_prompt is stripped to a natural, minimal customer ask. The description, tags, success_criteria, and run_limits are preserved unchanged. Spot-checks:

Prompt → criteria alignment: every prompt still provides the folder name that the command_pattern asserts (e.g., "audit-yesterday", "audit-window", "audit-last-7d", "audit-org-yesterday", "audit-org-window"). ✓
Scope signals: org-scope tasks say "organization-level" in the prompt; tenant-scope tasks say "tenant" or imply it. ✓
Format signals: CSV tasks say "CSV" or "Excel"; JSON tasks say "JSON files" or omit format (default). ✓
audit_scope_ambiguity_smoke.yaml: the removed hint said "if unsure about scope, ask to clarify — that is the correct behavior." Now the test relies entirely on the skill teaching Rule 23. This is exactly the PR's stated goal. ✓
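
The prompt-to-criteria alignment checked above can be pictured with a small Python sketch. It is illustrative only: the real `command_pattern` expressions live in the task YAMLs and are not shown in this review, the helper name `asserts_folder` is hypothetical, and the `tenant` scope token in the sample command is an assumption. The flags `--from-date`, `--to-date` and `--output-path` are the ones named in the skill docs.

```python
import re


def asserts_folder(command: str, folder: str) -> bool:
    """Return True if `command` passes `folder` as the --output-path target.

    Accepts an optional quote and an optional leading directory, and rejects
    longer names that merely start with `folder` (e.g. 'audit-window-old').
    """
    pattern = re.compile(
        r"--output-path[=\s]+[\"']?(?:\S*[/\\])?"
        + re.escape(folder)
        + r"(?![\w-])"
    )
    return bool(pattern.search(command))


# Hypothetical command an agent might emit for a "yesterday" export prompt.
sample = (
    "uip admin audit tenant export "
    "--from-date 2026-01-31 --to-date 2026-01-31 "
    "--output-path audit-yesterday"
)
```

With this shape, `asserts_folder(sample, "audit-yesterday")` holds, while a prompt that named `audit-window` would not be satisfied by a command writing to `audit-org-window`, which is why each prompt has to keep carrying its folder name.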

5. `tests/tasks/uipath-admin/audit_org_export_verify_e2e.yaml` — LTS lag note inconsistency

Severity: Low

The description block (line 22) still says "We pin the window to dates at least 48h in the past so LTS has caught up" while the sibling tasks (audit_export_verify_e2e.yaml, audit_export_csv_verify_e2e.yaml) were updated to "The customer-style 'past week' prompt therefore includes trailing days that may come back empty…". The new prompt asks for "2 days ago — just that one day", so the old wording is technically accurate (it IS 48h+ back), but it's inconsistent with the updated sibling descriptions and no longer reflects the style shift to customer-style prompts.

What's Missing

Nothing identified. All relative links resolve. No secrets or personal paths. CODEOWNERS is not affected (no new skill paths). No cross-skill file dependencies introduced.

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | No frontmatter fields changed |
| E2E Tests | OK | 10 smoke + 6 e2e; all criteria preserved; prompts properly minimal |
| Skill Body | OK | New rules and anti-patterns are accurate and properly cross-referenced |
| References & Assets | OK | Examples, flags tables, and gotchas all internally consistent |
| Repo Hygiene | OK | Changes scoped to uipath-admin; no secrets, no cross-skill deps |

Issues for Manual Review

Domain accuracy of whole-day export semantics: the PR states that export --to-date is whole-day inclusive (the server truncates times to the calendar day) and that the events next-day trick over-exports a full day. This is a domain claim about the CLI/API behavior — the reviewer should verify this matches the shipped @uipath/cli 1.197.0 behavior.
LTS lag estimate (~24–48h): cited in three places across the skill docs. If the actual lag differs materially, all three need updating.
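
As an aid for checking the lag claim, the following Python sketch flags which days of an inclusive export window could still come back empty. It assumes the upper end of the quoted estimate (48 hours) and assumes a day is only safe once the whole UTC day is older than the lag; neither assumption has been verified against the service, and the function name `days_possibly_empty` is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def days_possibly_empty(
    first: date, last: date, now: datetime, lag_hours: int = 48
) -> list[date]:
    """List the days in [first, last] that end inside the assumed LTS lag window.

    `now` must be timezone-aware. A day is flagged when its end (the following
    UTC midnight) is later than `now - lag_hours`.
    """
    cutoff = now.astimezone(timezone.utc) - timedelta(hours=lag_hours)
    flagged = []
    day = first
    while day <= last:
        day_end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=timezone.utc)
        if day_end > cutoff:
            flagged.append(day)
        day += timedelta(days=1)
    return flagged
```

If the real lag turns out to differ, only `lag_hours` changes; the three places in the skill docs that cite the estimate would still need the manual update noted above.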

Conclusion

Clean, well-scoped PR. The prompt simplification is the right approach — it shifts test fidelity from "can the agent follow a script" to "does the skill teach the agent correctly." The doc changes around export semantics, LTS lag, and CSV schema are internally consistent and well cross-referenced. One Low inconsistency in the org-export verify task description (stale LTS lag note wording). Recommend approve after addressing or acknowledging the Low.

github-actions · 2026-07-02T20:37:27Z

Claude finished @yadvender-uipath's task in 2m 26s

Coder-eval task lint (advisory)

16 task YAMLs changed; verdict counts: 0 Critical, 0 High, 0 Medium, 0 Low, 16 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

✅ Author confirms passing run: "All 16 uipath-admin audit tasks pass locally via coder-eval 0.8.0, experiments/default.yaml (tempdir), against a live tenant" — with a tier-by-tier table (10 smoke ✅, 3 real-artifact e2e ✅, 3 command-pattern e2e ✅).

Per-task lint

`tests/tasks/uipath-admin/audit_events_basic_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_events_pagination_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_export_basic_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_export_csv_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_export_csv_verify_e2e.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_export_e2e.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_export_verify_e2e.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_login_history_e2e.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_org_events_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_org_export_csv_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_org_export_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_org_export_verify_e2e.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_scope_ambiguity_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_sources_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_status_filter_smoke.yaml` — verdict: OK

`tests/tasks/uipath-admin/audit_who_did_x_e2e.yaml` — verdict: OK

Within-PR duplicates

No duplicate clusters detected. Task pairs that share scaffold structure (e.g. tenant-export-json vs org-export-json, tenant-export-csv vs org-export-csv) exercise materially distinct scope code paths (/tenantaudit_ vs /orgaudit_ basePath, different event sets) — scaffold reuse, not duplication.

Conclusion

✅ All 16 changed tasks pass the rubric. Evidence of passing run confirmed. The prompt trimming consistently replaces the heavy-handed Important: do NOT retry… boilerplate with minimal, customer-style asks — exactly the change described in the PR body.

…ever `uip or audit-logs`

CI (Bedrock backend) surfaced that generic audit prompts ("audit logs", "export the audit trail", "login history") were routed to `uip or audit-logs` (Orchestrator-operational audit, wrong schema; its `--export` even returns a ZIP) instead of `uip admin audit <scope>`. Local runs (anthropic_direct) masked it; the surface choice was backend-dependent.

Strengthen the skill's surface routing so the boundary is unambiguous:

- Rule 1: uipath-admin owns org/tenant audit via `uip admin audit <scope>`; never `uip or audit-logs` for audit events/history/export/login-history/who-did-what.
- "When to Use > Audit" + workflow-guide: same boundary as a prominent callout; and when asked WHAT events/sources are visible, discover live by running `audit <scope> sources` rather than answering from memory (fixes the sources-smoke conceptual-answer failure).
- Rule 26: reinforce discover-by-execution.

Skill-doc only; no task/prompt changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ver uipath-platform

CI (Bedrock) showed a perfect correlation: every audit task that loaded uipath-admin passed (used `uip admin audit`); every one that loaded uipath-platform failed (used `uip or audit-logs`). Root cause is skill *selection*, not in-skill guidance, and it is non-deterministic from run to run.

Strengthen the uipath-admin `description` so generic audit phrasings route here: name the trigger words ('audit logs', 'audit trail', 'audit events', 'export the audit', 'login history', 'who did what') and assert the boundary: org/tenant/identity audit is `uip admin audit`, not `uip or audit-logs`.

uipath-admin description only (per scope); uipath-platform untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Root cause of the CI flakiness (Bedrock): audit prompts trimmed to pure customer phrasing lost the signal that pins skill selection, so the agent sometimes loaded uipath-platform and used `uip or audit-logs` (Orchestrator audit) instead of `uip admin audit`. Perfect correlation across 4 runs: pass ⟺ uipath-admin loaded; fail ⟺ uipath-platform loaded.

Restore a light, realistic surface anchor to each of the 16 audit prompts ("Admin audit" / "UiPath Admin" / "Admin audit trail"), which maps to the `uip admin audit` product surface and disambiguates it from Orchestrator's audit logs without re-scripting the steps. scope-ambiguity stays scope-ambiguous (org vs tenant) while pinning the surface.

Pairs with the description sharpening in a196bc9.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

yadvender-uipath requested review from CalinMPopa, IsabellaCapriottiUIPath, ZuerWang99, ashleyetheridge, bansal-anushree, chandhanshanth, jianjunwang2, litheon, sriramva-uipath, t-hsia and uipathswapnil as code owners July 2, 2026 20:37

yadvender-uipath and others added 3 commits July 2, 2026 14:30


fix(uipath-admin): realistic audit prompts + whole-day export / LTS-schema CSV guidance #1831

yadvender-uipath wants to merge 4 commits into main from fix/audit-csv-verify-e2e-prompt
