Skip to content

Pi: emit "input": ["text", "image"] for vision-capable models#2128

Open
davanstrien wants to merge 4 commits into
mainfrom
pi-snippet-vision-input
Open

Pi: emit "input": ["text", "image"] for vision-capable models#2128
davanstrien wants to merge 4 commits into
mainfrom
pi-snippet-vision-input

Conversation

@davanstrien
Copy link
Copy Markdown
Member

@davanstrien davanstrien commented Apr 27, 2026

Summary

  • For models with pipeline_tag === "image-text-to-text", the Pi "Use this model" snippet now writes "input": ["text", "image"] into the generated ~/.pi/agent/models.json. Without this field, Pi can't pass images to the model — so users had to discover the field manually from the docs to get vision working.
  • Behavior for text-only models is unchanged (existing test extended; new pi - vision test added).
  • Cross-ref: Add vision support note to Pi + llama.cpp guide hub-docs#2408 — the documentation PR that explains the same field. @gary149's review there suggested surfacing it directly in the snippet.

Test plan

  • pnpm test in packages/tasks (17 tests pass, including new pi - vision)
  • pnpm check, pnpm lint:check, pnpm format:check all clean
  • Verified generated JSON output via local script: text models produce identical output to before; vision models gain the input field correctly nested

🤖 Generated with Claude Code


Note

Low Risk
Low risk: only adjusts Pi snippet JSON generation for models flagged as vision-capable and adds a focused unit test; no auth, persistence, or runtime execution paths change beyond emitted config text.

Overview
Updates the Pi “Use this model” snippet to include "input": ["text", "image"] in the generated models.json entry when the selected model has pipeline_tag === "image-text-to-text", enabling image inputs for vision-capable models.

Adds a new pi - vision test to assert the input field is emitted (and preserves existing behavior for text-only and MLX-backed Pi snippets).

Reviewed by Cursor Bugbot for commit bf73c01. Bugbot is set up for automated code reviews on this repo. Configure here.

When the model has pipeline_tag === "image-text-to-text", the Pi
"Use this model" snippet now writes the required input field into
the generated ~/.pi/agent/models.json, so users get a working
config without having to read the docs to discover the flag.

Pre-existing behavior is unchanged for text-only models.

Cross-ref: huggingface/hub-docs#2408 (and @gary149's review
suggestion there to surface this in the snippet directly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/tasks/src/local-apps.spec.ts
Per Cursor Bugbot review: a regression dropping "text" from
["text", "image"] would have gone undetected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@julien-c julien-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

};

// Step 2: Pi config — port and provider name differ
const modelEntry: Record<string, unknown> = { id: isMLX ? model.id : modelName };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing --jinja flag for vision models in server command

High Severity

The test at line 152 asserts that the llama-server command for vision models includes --jinja, but the implementation on line 481 of local-apps.ts never conditionally appends this flag. The isVision variable is computed but only used for the modelEntry.input field — it's never used to modify the server command. This means either the test will fail, or vision models won't get the --jinja flag they need for proper Jinja template handling with llama.cpp.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 671b44f. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cursoragent IIUC jinja is default in llamacpp now?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to authenticate your request. Please make sure to connect your GitHub account to Cursor. Go to Cursor

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit bf73c01. Configure here.

};
const snippet = snippetFunc(model);

expect(snippet[0].content).toContain(`llama-server -hf unsloth/Qwen3.6-35B-A3B-GGUF:{{QUANT_TAG}} --jinja`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test expects --jinja flag that implementation never produces

High Severity

The new vision test asserts that snippet[0].content contains --jinja in the llama-server command, but the snippetPi implementation at line 481 builds the content as `llama-server -hf ${model.id}${getQuantTag(filepath)}` with no --jinja flag appended — and no other code path adds it. The --jinja string appears nowhere in local-apps.ts. This test will fail at runtime. Given the PR discussion confirming jinja is now default in llama.cpp, the test expectation is likely stale and the assertion needs to drop --jinja.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit bf73c01. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants