fix(ixp-skill): drop document image viewing from configure-model step by cezara98t · Pull Request #1828 · UiPath/skills

cezara98t · 2026-07-02T15:05:40Z

Problem

The e2e full-lifecycle test was flaky (~2–3 of every 8 runs failed). Root cause: Step 2 of the project-setup guide told the agent to download 2–3 sample document images and Read them before configure-model. Each image is 50K–170K chars of base64, driving the context to 700–760K tokens before the lifecycle even got going. The model then emitted a text-only turn, the SDK reported end_turn, and the run ended mid-lifecycle — typically right at or after configure-model/get-predictions, well before confirm, update-prompts, and publish.
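The context arithmetic behind this failure can be sketched as a simple budget estimate. This is a minimal illustration, not code from the skill or the test harness: the function names, the tokens-per-character ratio, and the numbers in the usage lines are all assumptions chosen for the example. The only standard fact used is that padded base64 emits 4 characters for every 3 input bytes.

```python
import math


def base64_chars(raw_bytes: int) -> int:
    """Characters in the padded base64 encoding of `raw_bytes` bytes (4 per 3-byte group)."""
    return 4 * math.ceil(raw_bytes / 3)


def context_after_images(current_tokens: int, image_chars: list[int], tokens_per_char: float) -> int:
    """Estimate context size after each image is read as base64 text.

    `tokens_per_char` is an assumed ratio; real tokenizers vary.
    """
    return current_tokens + sum(math.ceil(c * tokens_per_char) for c in image_chars)


# Hypothetical numbers for illustration only (not measured from the failing runs).
estimate = context_after_images(
    current_tokens=100_000,
    image_chars=[50_000, 120_000, 170_000],
    tokens_per_char=1.0,
)
print(estimate)
```

Because the image cost is added once per image, the estimate grows linearly with the number of samples read, which is why skipping the reads in Step 2 removes the whole term rather than shrinking it.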

This fix was originally made on fix/ixp-e2e-drop-f1-direction-gate (commit 110f1ff07) but was never shipped — PR #1814 merged without it, so main still carried the image-dumping Step 2.

Fix

Step 2 now applies the model configuration directly and explicitly instructs the agent not to download or Read document images to decide. gemini_2_5_flash + table_mini is the default (correct for invoices and most structured docs); a compact table covers the override cases. Image reading remains where it's genuinely needed — per-document labelling and the improve-prompts phase — both of which the passing runs already tolerate.
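The "default first, override only when needed" pattern described above can be sketched as follows. This is an assumption-laden illustration rather than the guide's actual logic: the key names `model` and `table_mode` and the function `resolve_config` are hypothetical, and the conditions that trigger each override are deliberately left out because they live in the guide's table. The default values come from this description, and the allowed table modes come from the `<none|table_mini|table>` placeholder mentioned in the review.

```python
# Default values are taken from the PR description; the key names are hypothetical.
DEFAULT_CONFIG = {"model": "gemini_2_5_flash", "table_mode": "table_mini"}

# Table modes as listed in the old placeholder <none|table_mini|table>.
TABLE_MODES = {"none", "table_mini", "table"}


def resolve_config(overrides=None):
    """Return the default configuration with any explicit overrides applied."""
    config = dict(DEFAULT_CONFIG)  # copy so the default is never mutated
    config.update(overrides or {})
    if config["table_mode"] not in TABLE_MODES:
        raise ValueError(f"unknown table mode: {config['table_mode']!r}")
    return config


print(resolve_config())                         # happy path: defaults only
print(resolve_config({"table_mode": "table"}))  # an explicit override case
```

The point of the design is that the happy path needs no inputs at all, so nothing has to be inspected before the configuration is applied.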

Evidence

Guide state	e2e result
Before (image dump in Step 2)	6/8 passing — 2 runs died at ~700–760K-token context
After (this change)	8/8 passing, every run 13/13, score 1.000
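As a rough way to read these numbers, the sketch below computes how often a batch of 8 runs would all pass if the old per-run pass rate still applied. It assumes the runs are independent and that the observed 6/8 is the true old pass rate; both are simplifying assumptions, and the function name is hypothetical.

```python
def prob_all_pass(pass_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass at a fixed per-run pass rate."""
    return pass_rate ** runs


# Old guide: 6 of 8 runs passed, taken here as the per-run pass rate.
p = prob_all_pass(6 / 8, 8)
print(f"{p:.3f}")  # prints 0.100
```

Under these assumptions the old guide would produce a clean 8/8 batch about one time in ten, which is one way to gauge how much the 8/8 result says on its own.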

🤖 Generated with Claude Code

The guide told the agent to download 2-3 sample document images and Read them before configuring the model. Each image injects ~300k base64 tokens into context; with 3 invoices the context overflows and the agent emits an end_turn text response mid-lifecycle, scoring 0 on everything past the confirm step. The default recommendation (gemini_2_5_flash + table_mini) is already correct for invoices, so no visual inspection is needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-07-02T15:05:58Z

Claude finished @cezara98t's task in 1m 28s

PR Review: fix(ixp-skill): drop document image viewing from configure-model step

Read review criteria and project conventions
Analyze the full diff
Read the full changed file for context
Check for cross-skill references, broken links, secrets
Post review findings

Summary

This PR removes the instruction for agents to download and Read 2–3 sample document images before running configure-model, which was bloating the context window (700–760K tokens) and causing flaky e2e failures. Instead, the step now applies gemini_2_5_flash + table_mini directly as the default and provides a compact override table.

Change-by-Change Review

1. `skills/uipath-ixp/references/project-setup-guide.md`

Severity: OK

The change is clean and well-motivated:

Lines 51–52: The new instruction is prescriptive ("Apply the configuration directly — do NOT download or Read document images to decide") — good LLM usability; the negative instruction prevents the agent from reverting to the old behavior.
Lines 53–58: The configure-model command is now shown first with concrete default values (gemini_2_5_flash, table_mini) instead of a placeholder <none|table_mini|table>. This is an improvement — the agent gets a copy-paste-ready command for the happy path.
Lines 60–67: The override table keeps its 4 rows but compresses the descriptions ("simple or multiple tables", "nested/merged/multi-page tables"). The table still covers all cases.
The removed documents list / documents download commands are still documented in the CLI reference and used elsewhere (labelling, improve-prompts), so they're not lost — just removed from this step where they caused harm.
All relative links (cli-reference.md, label-documents-guide.md) resolve correctly.
No cross-skill references, no secrets, no hardcoded paths.

What's Missing

Nothing identified. The change is tightly scoped to the root cause (image downloads in Step 2) and the e2e test (tests/tasks/uipath-ixp/e2e/full_lifecycle.yaml) does not assert on the removed commands, so no test updates are needed.

Area Ratings

Area	Status	Notes
Frontmatter	OK	Not touched
E2E Tests	OK	Existing e2e test unaffected; PR description shows 8/8 passing
Skill Body	OK	Not touched
References & Assets	OK	Clean simplification, all links valid
Repo Hygiene	OK	Scoped change, no secrets, no cross-skill deps

Issues for Manual Review

None found.

Conclusion

Clean, well-scoped fix with strong empirical evidence (8/8 passing vs. 6/8 before). The change improves both reliability and LLM usability by giving the agent a direct default command instead of requiring expensive image inspection. Approve.

cezara98t requested review from andrei-uipath, misupantea-uipath and paul-ciobanu as code owners July 2, 2026 15:05

cezara98t wants to merge 1 commit into main from fix/ixp-guide-drop-image-download
