Consolidate Keboola App skills into dataapp-development by davidesner · Pull Request #76 · keboola/ai-kit

davidesner · 2026-05-18T08:33:39Z

Summary

Consolidates two legacy skills (dataapp-dev for Streamlit, dataapp-deployment for Python/JS) into a single new dataapp-development skill at plugins/dataapp-developer/skills/dataapp-development/. The new skill is a router (SKILL.md ≈ 100 lines + 13 references + 5 runnable templates) that covers the full app lifecycle across both app types and all three client paths.

What's covered

App types: Streamlit, single-Node + static (the dashboarding default), combined Python+Node
Client paths: MCP-only (Claude Desktop / web), Claude Code with filesystem + MCP, kbagent CLI — with sequential detection so an agent always offers every available path instead of silently picking
Storage access: read-only workspace + Query Service SDKs (keboola-query-service / @keboola/query-service), DuckDB caching as default for read-only apps, RW Storage Access for writes, input mapping discouraged. BigQuery's workspace_query path documented separately
Local dev: proactive .env.local pre-fill from get_project_info, ask user only for KBC_TOKEN, with explicit guidance that narrow-scoped tokens don't work with Query Service
Styling: default Keboola palette across all three stacks, "Powered by Keboola" footer with brand-override removal instructions, heavier React+Vite+shadcn option for complex UIs
Other: authentication, dashboard patterns, optional Kai chat integration, troubleshooting (10+ live-test-driven entries), dev workflow

Templates

templates/streamlit/ — keboola-query-service SDK + Plotly + footer
templates/nodejs-app/ — Express + Tailwind CDN + Chart.js + footer (dashboarding default)
templates/python-app/ — Flask + uv
templates/python-node-app/ — FastAPI backend + Express frontend
templates/duckdb-cache/ — Python and Node DuckDB cache helpers

Hard rules

9 SKILL.md hard rules that emerged from live test sessions:

Never commit secrets
RO workspace before input mapping
Apps must handle POST /
No pip install (PEP 668 — use uv)
No [program:nginx] declaration
Validate data first, code second (semantic layer check)
Pick one Keboola path per session (sequential detection, list all candidates)
MCP-only flows: compose source directly into the tool call — don't pre-write a local copy
For local-dev credentials: pre-fill what you can, ask only for what's missing, never grep the filesystem

Validated against

Three end-to-end test sessions against a real Keboola project. Each surfaced a specific footgun that's now codified in the skill (Path A doubled emit, kbagent omission from path detection, legacy workspace-query 404, narrow-scoped token auth error, agent scanning filesystem for tokens).

Removed

plugins/dataapp-developer/skills/dataapp-dev/
plugins/dataapp-developer/skills/dataapp-deployment/

Test plan

Spot-check SKILL.md decision tree routes to the right reference for each task type
Verify the three templates that ship a "Powered by Keboola" footer render correctly (nodejs-app, python-node-app, streamlit)
One more end-to-end run from a fresh Claude Code session against a real project: build a new Streamlit app, deploy, debug
One more end-to-end run for a Python/JS app via the kbagent path
Plugin version + marketplace version bumped and root README feature list updated

Merges dataapp-dev (Streamlit) and dataapp-deployment (Python/JS) into a single skill covering both app types, three client paths (MCP-only, Claude Code, kbagent CLI), storage access patterns, authentication, DuckDB caching, styling defaults, and Kai integration placeholder.

…section - Remove standalone managed-git-mcp.md reference; the placeholder now lives as a subsection in python-js-apps.md alongside the customer-git story. - Move the row-level user filtering pattern out of authentication.md (it's a storage-access concern, already covered there). - Reference count: 13 -> 12.

Permission scoping by user was sourced from one customer app's CLAUDE.md ("X-Kbc-User-Email injected by Keboola/Nginx") and not corroborated by official docs. Replace with an explicit data-access-management placeholder in storage-access.md noting that JS/Python and legacy Streamlit patterns differ and will be documented once verified.

- Expand single-bullet mention into a full subsection in python-js-apps.md covering nginx dual-location, supervisord per-process programs, parallel setup.sh, local-dev with frontend-proxy-to-backend, and the pre-built frontend convention from profitline-js-app. - Add templates/python-node-app/ as a fifth template (FastAPI backend + Vite/React frontend with full keboola-config wiring). - Bump template count 4 -> 5 in directory layout and acceptance criteria.

- Reframe Python+Node combined as "when you need it" (Python backend / ML / existing codebase), not "most common". - Promote single Node.js + static frontend (kai-pricing-calculator-app nodejs-pricing-simulator branch) as the preferred dashboarding shape: one process, no bundler, Chart.js & Tailwind via CDN. - choosing-app-type.md gets a 3-level decision hierarchy (Streamlit -> single Node -> Python+Node). - styling-guide.md leads with the lightweight CDN stack; React/Vite/shadcn is positioned as the heavier-framework alternative. - templates/nodejs-app/ expanded from hello-world to a real dashboarding starter (server.js + api/ + public/ + keboola-config/) modeled on kai-pricing.

kai-client is more mature than the placeholder framing suggested: it ships Python lib with async + SSE + tool approval, Streamlit and JS examples in-repo, and dedicated kai-dataapp plugin with kai-js / kai-streamlit skills. The reference now documents: - Service discovery via Storage API /v2/storage services list - Auth via x-storageapi-token (no separate Kai token) - Streamlit embed pattern based on examples/streamlit_app.py - JS embed pattern based on examples/js-dataapp/server.js (SSE proxy) - DIY alternative via Anthropic SDK directly (FI app pattern) - Pointer to kai-client's own plugins for deeper integration work Open-questions entry updated accordingly.

…L.md

…yment skills Their content has been consolidated into the new dataapp-development skill (SKILL.md + 12 references + 5 templates).

…p-development - plugin.json + marketplace.json: 1.1.0 -> 1.2.0, updated descriptions. - Plugin README: rewritten to describe the single dataapp-development skill, 12 references, and 5 templates. - Root README: refreshed Data App Developer Plugin feature list.

The skill should distill generic patterns, not cite specific repos the agent or user has no access to. Removes "Reference app: X" lines, GitHub/help.keboola.com/pypi/npm URLs, and repo-specific attributions ("Modeled on Y", "Adapted from Z"). Keeps: - CDN URLs in functional template code (Tailwind, Chart.js) - localhost / 127.0.0.1 URLs in dev instructions - Example KBC_URL values in secrets templates - Library/package names without URLs (kai-client, keboola-query-service, etc.)

…dation

Some Keboola projects have a semantic layer (metrics, datasets, glossary, relationships). When it exists and matches the user intent, the app's query should use those definitions verbatim, not reinvent the calculation.

dev-workflow.md Validate phase gains a "Semantic layer check (when available)" sub-step that runs BEFORE the standard schema/data validation:

1. search_semantic_context to find relevant metrics/datasets.
2. get_semantic_context to read the metric SQL and dataset FQNs.
3. Use the definitions verbatim.
4. validate_semantic_query before embedding.
5. Only then query_data to verify.

If no semantic model matches, say so explicitly and proceed with the standard validation path on raw tables.

SKILL.md Hard rule 6 gets a tail line pointing at the semantic-layer tools so agents see it from the router without loading dev-workflow.

…latform logs Streamlit silently swallows uncaught exceptions into its UI without writing them to stdout/stderr. The MCP get_data_apps log tail and the Terminal Log tab therefore show nothing for errors that are clearly visible to the user. Remote debugging fails. streamlit-apps.md gains a new section "Capturing errors for platform logs" with the @log_exceptions decorator pattern: catch, log to stderr with full traceback, then re-raise so Streamlit still shows the error in the UI. Python/JS frameworks (Flask/FastAPI/Express) don't need this — their default behavior already logs to stderr. troubleshooting.md "Reading logs" section is reworked: - MCP get_data_apps tail is now positioned as the preferred remote debugging path (agent-friendly, no UI navigation needed). - A new "Streamlit-specific footgun" subsection points back at the decorator pattern in streamlit-apps.md for agents who reach troubleshooting first when the log tail is mysteriously empty.
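The catch, log, re-raise pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the decorator shipped in streamlit-apps.md: the logger name "dataapp", the log format, and the `main` entry point are assumptions made for the example.

```python
import functools
import logging
import sys

# Send log records to stderr so the platform log tail can pick them up.
# The logger name "dataapp" is an assumption for this sketch.
logger = logging.getLogger("dataapp")
if not logger.handlers:
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
logger.setLevel(logging.ERROR)


def log_exceptions(func):
    """Log any uncaught exception with its full traceback, then re-raise it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception appends the current traceback to the record.
            logger.exception("Unhandled error in %s", func.__name__)
            raise  # re-raise so the framework still shows the error in the UI

    return wrapper


@log_exceptions
def main():
    # Hypothetical entry point; a real app would build its Streamlit UI here.
    raise ValueError("demo failure")
```

Wrapping only the top-level entry point keeps the traceback intact and avoids logging the same exception once per nested call.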

…ation Cross-referenced from a parallel PR's production-deployment lessons. KBC_WORKSPACE_ID section: split guidance by app type. - Read-only data apps: reuse the MCP session's workspace via the workspace_id field returned by mcp__keboola__get_project_info. No need to provision a new workspace for local dev. - Read-write data apps: must create a dedicated local workspace (UI or kbagent) with grants matching the direct-grant output mapping. The platform's ephemeral production workspace doesn't exist locally. Storage Access section: add "Bucket stage doesn't restrict writes." The destination can be in any stage (out., in., otherwise) as long as the workspace has write privileges — the out. examples are convention, not a constraint. Confirmed by independent testing in the parallel PR.

Cross-referenced from a parallel PR's production-deployment lessons. Replaces the previous inline SDK examples with: 1. A Storage wrapper module pattern (Python class, TS module) that concentrates env-var reads and Client construction in one place. Module-level singleton fails fast on missing env vars; route handlers call select(sql) / execute(sql) without touching env or the raw Client. 2. A validation module pattern (validation.py, validation.ts) with ValidationError, type-coerced parsers, allowlist enforcement, and text escaping. The rest of the app routes user input through these parsers before SQL interpolation. 3. Five rules of thumb for SQL values (numeric / date / categorical / free-text / generated IDs) — concise checklist applicable in any language. The planned SQL.literal() / SQL.ident() / sql.format() SDK helpers note stays — when those ship, they replace the manual sanitization pattern.
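A validation module along these lines might look like the sketch below. Only the name ValidationError comes from the commit message; the parser names, the ID pattern, the length limit, and the escaping rules (written for Snowflake-style single-quoted literals) are assumptions for illustration, not the contents of the skill's validation.py.

```python
import re
from datetime import date


class ValidationError(ValueError):
    """Raised when user input cannot be safely interpolated into SQL."""


def parse_int(raw, minimum=None, maximum=None):
    """Numeric values: coerce to int, then range-check."""
    try:
        value = int(str(raw).strip())
    except ValueError:
        raise ValidationError(f"not an integer: {raw!r}") from None
    if minimum is not None and value < minimum:
        raise ValidationError(f"{value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        raise ValidationError(f"{value} is above the maximum {maximum}")
    return value


def parse_date(raw):
    """Dates: accept ISO format (YYYY-MM-DD) only."""
    try:
        return date.fromisoformat(str(raw).strip())
    except ValueError:
        raise ValidationError(f"not an ISO date: {raw!r}") from None


def parse_choice(raw, allowed):
    """Categorical values: enforce an explicit allowlist."""
    if raw not in allowed:
        raise ValidationError(f"{raw!r} is not one of {sorted(allowed)}")
    return raw


# Assumed ID shape for this sketch: short, alphanumeric plus "_" and "-".
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def parse_generated_id(raw):
    """Generated IDs: match a strict pattern instead of escaping."""
    if not _ID_PATTERN.fullmatch(str(raw)):
        raise ValidationError(f"malformed id: {raw!r}")
    return str(raw)


def escape_text(raw, max_length=200):
    """Free text: bound the length, then escape backslashes and quotes."""
    text = str(raw)
    if len(text) > max_length or "\x00" in text:
        raise ValidationError("text is too long or contains a NUL byte")
    # Backslashes first, so the quote doubling cannot be undone by a "\".
    return text.replace("\\", "\\\\").replace("'", "''")
```

A route handler would then build its statement only from parsed values, for example `f"... WHERE region = '{escape_text(region)}' AND year = {parse_int(year, 2000, 2100)}"`, and turn ValidationError into a 400 response.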

…errors streamlit-apps.md §Storage access from Streamlit: add a "Cache the Storage client across reruns" subsection. Streamlit reruns the script top-to-bottom on every interaction; without @st.cache_resource the SDK client is reconstructed each time, re-reading env vars and the workspace manifest, and opening a new HTTP client. Pair with @st.cache_data(ttl=60) for cached read results within a session. troubleshooting.md: add two new Storage-Access-specific entries: 1. KeyError: 'BRANCH_ID' (or other Storage Access env var) on app start — Storage Access not enabled on the config, or local .env missing the variable. Fix: toggle on + direct-grant output mapping in production; add to .env locally. 2. Insufficient privileges / write blocked by the Query Service — destination table missing from direct-grant output mapping, or local workspace missing grants. Includes the bucket-stage-doesn't-matter clarification.

…nt IS the artifact First test run in Claude Desktop surfaced that the skill's "Local development" sections led the agent into wasted work: drafting a .py file in the sandbox FS, then re-emitting the same content as the modify_data_app source_code argument. The local file was never the deployment artifact — the tool argument was — so the redraft doubled output tokens for no benefit. The fix: distinguish "tool-argument-is-the-artifact" (Path A) from "local-file-is-the-artifact" (Paths B and C). deployment-paths.md Path A: new "Don't write the source to a local file first" subsection. The default is compose-in-tool. The one legitimate exception is using a sandbox file as a scratchpad for cheap iterative str_replace edits before a single expensive emit — but only when the iteration savings beat the redundant emit. For small apps (<100 lines), compose-in-tool always wins. deployment-paths.md "How to choose" table: Path A row now warns agents not to drift into local-dev mode even when a sandbox FS is available. streamlit-apps.md §Local development: opener qualifies the section as Path B/C only. Path A agents should skip it.

Second test run surfaced a real risk: in a Claude Code session with both a project-local MCP and kbagent (and possibly a global MCP too), the agent has multiple ways to talk to "the project" — but they may resolve to different branches or even different projects. Mixing them in one session produces silent inconsistencies (validate via MCP on branch X, deploy via kbagent on branch Y, get confusing errors). deployment-paths.md: new top-level "Pick one path per session — don't mix" section. Covers: - Detection: scan tool surface for mcp__*Keboola* (not just .mcp.json presence — the MCP config could come from user-level or org-level settings too); check kbagent CLI availability with `kbagent project list`. - When multiple paths are present, ask the user upfront which one to use, with concrete phrasing. - Trade-off table: MCP-only iteration loop (modify_data_app + deploy + container spin-up + log check) vs kbagent + filesystem + local iteration (streamlit run + .env.local). For non-trivial apps the local loop is editor-speed; the MCP loop pays the platform spin-up cost on every cycle. - Once chosen, commit — don't use the other path even for unrelated operations like data validation. SKILL.md: new Hard rule 7 carrying the one-line version + pointer to the full guidance in deployment-paths.md. Surfaces the discipline from the router so agents see it before loading any reference.

… the only channel to Keboola The previous Path A intro claimed "no local files, no git, no shell." That's outdated — Claude Desktop now has a sandbox filesystem, Python runner, Bash tool. Those tools exist; they just don't connect to Keboola. The real constraint is **MCP is the only channel to your Keboola project**. The sandbox FS is isolated — anything written there doesn't reach Keboola, it just sits in the agent's workspace. Source code that should end up in the data app has to go through the modify_data_app source_code argument; writing to /home/claude/foo.py first and then re-emitting the same content doubles output tokens for no benefit. This pairs with the existing "Don't write the source to a local file first" footgun guidance (c7e355f) which was added after the first real-world test surfaced exactly this mistake. The reframed opener makes the underlying principle visible instead of relying on a no-longer-true "no filesystem" framing.

…y flows" to hard rule Three test runs now confirmed: when filesystem + MCP are both available, the agent's default impulse is "write file → read file → pass to modify_data_app." The doubled-emit footgun. The Path A intro already calls this out (commit 2ad97ed) and there's a dedicated "Don't write the source to a local file first" subsection (commit c7e355f). But both live in deployment-paths.md, which the agent only loads if it specifically reads that reference. Tests 2 and 3 both hit the footgun because the agent never loaded the reference. Promote the rule to SKILL.md hard rule 8 so it's seen from the router every time, regardless of which references the agent loads.

… then MCP

Live test: when an agent had multiple MCP servers plus kbagent available, it asked "Which Keboola MCP should I use?", limiting the options to MCP and silently dropping kbagent. The previous wording of hard rule 7 listed both detection signals but didn't enforce the order or mandate enumeration before asking.

Rewrite the rule and the deployment-paths.md detection section to:

1. Step 1: run `which kbagent` (Bash). If it exists, run `kbagent project list` and capture every alias as a candidate path. Explicit "don't skip this just because you've already noticed MCP tools": kbagent is a separate path the user may prefer.
2. Step 2: scan tool surface for mcp__*[Kk]eboola* prefixes. Each distinct prefix is a candidate.
3. If the combined list has more than one item, ask the user and list ALL of them. Don't omit kbagent just because MCP was found first.

Detection is now a numbered, ordered sequence. Both files updated so the SKILL.md hard rule and the reference are consistent.
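The ordered detection can be sketched as a small pure function. The `kbagent:<alias>` labels, the function names, and the assumption that tool names follow an `mcp__<server>__<tool>` shape are illustrative; parsing the output of `kbagent project list` is left out because its format is not described here, so aliases are passed in as a list.

```python
import re
import shutil


def kbagent_installed():
    """Step 1 precondition: the equivalent of `which kbagent`."""
    return shutil.which("kbagent") is not None


def mcp_prefixes(tool_names):
    """Step 2: collect each distinct mcp__*[Kk]eboola* prefix, in order seen."""
    prefixes = []
    for name in tool_names:
        parts = name.split("__", 2)
        # Assumed tool-name shape: mcp__<server>__<tool>.
        if len(parts) == 3 and parts[0] == "mcp" and re.search(r"[Kk]eboola", parts[1]):
            prefix = f"mcp__{parts[1]}"
            if prefix not in prefixes:
                prefixes.append(prefix)
    return prefixes


def candidate_paths(tool_names, kbagent_aliases):
    """Combine both steps; kbagent aliases come first so they are never dropped."""
    candidates = [f"kbagent:{alias}" for alias in kbagent_aliases]
    candidates.extend(mcp_prefixes(tool_names))
    return candidates


def must_ask_user(candidates):
    """Step 3: more than one candidate means the user chooses, from the full list."""
    return len(candidates) > 1
```

Keeping the kbagent step first in `candidate_paths` mirrors the rule's intent: finding MCP tools early must not short-circuit the enumeration.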

…ever scan Live test: agents on the kbagent/local-iteration path were grepping the filesystem and probing env vars to "discover" Storage tokens. Auto-mode caught some of these attempts, but not all. The behaviour is wrong twice over: it is a security smell (scanning for secrets), AND it is unlikely to succeed (the user has to provide tokens regardless). SKILL.md: new hard rule 9: for local-dev credentials, ASK the user, NEVER scan. Lists the workflow: state the required env vars, point the user at storage-access.md for where to find each, ask them to populate .env/.env.local, wait for confirmation before running. storage-access.md §Getting the env vars for local development: add a prominent "Agent: don't try to discover credentials on your own" block at the top of the section with a four-step ask-the-user procedure. Marks the rule as non-negotiable across all paths.

User feedback during live testing: agents tend to pick kbagent because the skill positions it as "often faster for non-trivial apps," but the CLI workflow has dangers for non-developer users (analysts, data scientists, business users) — local env management, shell command failures, token-shaped errors, debugging CLI output. For users who just want a working dashboard, MCP is safer. deployment-paths.md: - §When more than one is present: the user-question phrasing template now annotates MCP-only as "recommended for most users" and kbagent + local iteration as "developer-oriented, best for users comfortable with the CLI." Audience note follows the example, spelling out that non-developers should default to MCP. - §Path C — CLI agent (kbagent): new opening paragraph marking it as developer-oriented and not recommended for non-developers. Added "you can handle CLI errors" as an explicit prerequisite. SKILL.md hard rule 7: adds a clause to the same effect — kbagent is developer-oriented; prefer MCP for non-developers; surface this in the user-question so the choice is informed.

…-offs, don't gatekeep Walking back the "prefer MCP for non-developer users" steering that the previous commit (c378c90) introduced. The right behaviour isn't "hide kbagent from non-developers" — it's "always offer both when both are available, label their costs honestly, let the user decide." The user knows their own context better than the agent does. deployment-paths.md: - §When more than one is present: drop the "(recommended for most users)" and "Not recommended for non-developers" labels. Keep the factual description of what each path costs the user. Replace the audience-note paragraph telling the agent to "pre-pick MCP" with the opposite directive: surface the trade-off, don't steer. - §Path C: open with "Heads-up: this path expects CLI comfort" — factual, not gatekeeping. Removed "Not recommended for non-developer users" wording. SKILL.md hard rule 7: same de-escalation. Removed the "For non-developers, prefer the MCP path" clause. Added explicit "Don't pre-pick MCP 'to be safe' — the user knows their own context."

…e API workspace endpoint to Query Service

Live test session 6b856018 surfaced the actual failure: the streamlit template's data_loader.py was POST-ing to {KBC_URL}/v2/storage/branch/<b>/workspaces/<w>/query and getting back 404 workspace.workspaceNotFound on a Snowflake project. That endpoint survives only for BigQuery projects today; Snowflake projects must use the Query Service (https://query.<stack>.keboola.com/api/v1/...).

Templates:
- templates/streamlit/utils/data_loader.py: rewrite to use the official keboola-query-service Python SDK. Module-level Client cached with @st.cache_resource. Derives QUERY_SERVICE_URL from KBC_URL by swapping connection. -> query. when not set. Reads workspace ID from KBC_WORKSPACE_MANIFEST_PATH (preferred) or env fallback. Explicit errors when BRANCH_ID is missing; the Query Service rejects "default" so the value MUST be numeric.
- templates/streamlit/pyproject.toml: replace `requests` dep with `keboola-query-service>=0.2.0`.
- templates/streamlit/.streamlit/secrets.toml.example: add BRANCH_ID (numeric) and optional QUERY_SERVICE_URL; reference the skill's env-vars section for where to find each.
- templates/nodejs-app/api/keboola-client.js: rewrite to use @keboola/query-service SDK. Same env-var resolution pattern, same workspace-prefix normalization, but the actual call now goes through Client.executeQuery against query.<stack>.keboola.com.
- templates/nodejs-app/package.json: add @keboola/query-service dep.

References:
- references/storage-access.md §Direct RO workspace queries: replace the "Direct API call shape" section (which showed a raw POST to the legacy endpoint) with the Query Service SDK call shape for both Python and JS. Adds an explicit "Do NOT post to /v2/storage/.../workspaces/<id>/query" warning. Clarifies that the legacy endpoint is only used by BigQuery projects today.
- references/troubleshooting.md: new entry "workspace.workspaceNotFound 404 from /v2/storage/.../workspaces/<id>/query" mirroring the exact failure from the live session. Existing "WORKSPACE_<id> prefix" entry updated to mention Query Service instead of Storage API.
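The env-var resolution described for data_loader.py can be sketched without the SDK. The function names and error messages below are hypothetical; only the connection. -> query. swap and the numeric BRANCH_ID requirement come from the commit message.

```python
from urllib.parse import urlsplit


def resolve_query_service_url(env):
    """Use QUERY_SERVICE_URL when set; otherwise derive it from KBC_URL."""
    explicit = env.get("QUERY_SERVICE_URL")
    if explicit:
        return explicit.rstrip("/")
    parts = urlsplit(env["KBC_URL"])
    host = parts.netloc
    if not host.startswith("connection."):
        raise ValueError(f"cannot derive Query Service URL from {env['KBC_URL']!r}")
    # Swap the leading "connection." label for "query." and drop any path.
    return f"{parts.scheme}://query.{host[len('connection.'):]}"


def resolve_branch_id(env):
    """The Query Service rejects "default", so require a numeric branch ID."""
    raw = env.get("BRANCH_ID", "").strip()
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"BRANCH_ID must be numeric, got {raw!r}")
    return raw
```

Passing the environment in as a mapping (for example `os.environ`) keeps both helpers easy to exercise locally with a plain dict.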

…full code Previous edit told agents "don't post to /v2/storage/.../query — use Query Service" but left BigQuery users with no concrete alternative. That endpoint IS the right path for BigQuery; Query Service just doesn't support BQ yet. Adds two subsections after the Snowflake / Query Service SDK examples: 1. "How to know which backend you're on" — call get_project_info, read sql_dialect. Snowflake → Query Service. BigQuery → Storage API workspace-query. 2. "BigQuery path — Storage API workspace-query endpoint" — concrete Python and JS code. Documents the differences from Query Service: - Rows arrive as objects keyed by column name (not arrays + cols). - Cell values are native types (no string coercion needed). - Synchronous single response; no submit/poll/paginate. - BRANCH_ID accepts the string "default" here. - Templates are wired for Snowflake; swap data_loader / keboola-client and remove the keboola-query-service dep when on BQ.
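The backend decision can be reduced to a small lookup on the sql_dialect value. In this sketch the returned labels and the case-insensitive comparison are assumptions; the exact casing that get_project_info returns is not stated here.

```python
def query_path_for(sql_dialect):
    """Map a project's sql_dialect to the query path the app should use."""
    dialect = sql_dialect.strip().lower()
    if dialect == "snowflake":
        return "query-service"  # hypothetical label: Query Service SDK
    if dialect == "bigquery":
        return "storage-api-workspace-query"  # hypothetical label: legacy endpoint
    raise ValueError(f"unsupported sql_dialect: {sql_dialect!r}")


def branch_id_is_valid(branch_id, path):
    """BRANCH_ID may be "default" on the BigQuery path, numeric otherwise."""
    if path == "storage-api-workspace-query" and branch_id == "default":
        return True
    return branch_id.isascii() and branch_id.isdigit()
```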

…idate the skill Companion to 2026-05-13-dataapp-development-design.md and 2026-05-13-dataapp-development.md (the plan). Lists, by category: 1. Original brief (Obsidian note + Linear AI-3147) 2. Plugin's prior skills that were merged in and deleted 3. Keboola code repos read directly (mcp-server, data-app-python-js, both Query Service SDKs, kai-client, kai-pricing-calculator-app, profitline-js-app, FI app, agent-usage-data-app, keboola_agent_cli) with the specific files inspected for each 4. Connection documentation pages read locally 5. External-team contribution (PR #71) — what was adopted and what was deliberately rejected 6. Companion skill keboola-js-data-app — per-item adopt/reject log 7. Live verifications (MCP get_project_info live call + three test sessions in data_app_testing) — each test mapped to the commit it drove 8. Anthropic / Claude Code platform sources Not loaded by the skill at runtime — provenance / audit only. Next iteration uses this to know which sources were authoritative and which were superseded.

…bute keboola-js-data-app to Fisa, drop private FS paths

…7 comments

…ask only for token, offer to run JS-app test feedback: when an agent is wiring up local dev, the ideal UX is "I created .env.local with everything I could resolve; the only thing missing is your Storage API token — please paste it here and I'll start the app for you." The previous rule wording ("ask the user to populate .env.local") put the entire job on the user when MCP can resolve KBC_URL, BRANCH_ID, KBC_WORKSPACE_ID, and QUERY_SERVICE_URL automatically via get_project_info. SKILL.md hard rule 9: rewrite as a four-step proactive flow. - Step 1: pre-create the file with every required key, MCP-resolved values pre-filled. - Step 2: check whether KBC_TOKEN is already set (named-lookup is fine; indiscriminate scanning is not). - Step 3: ask only for the missing token, with a UI-navigation pointer. - Step 4: offer to run the app once complete. storage-access.md §Agent: don't try to discover credentials: same four-step flow. Adds an explicit "the goal is: user provides the one secret value, the agent does everything else" closing line.
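Steps 1 to 3 of the proactive flow can be sketched as a helper that renders the file and reports what is still missing. The function name is hypothetical, and the workspace key is written as WORKSPACE_ID to match the later rename in this PR. Note that the token is only checked by name and is never copied into the file by this sketch.

```python
import os

REQUIRED_KEYS = ("KBC_URL", "BRANCH_ID", "WORKSPACE_ID", "QUERY_SERVICE_URL", "KBC_TOKEN")


def build_env_local(resolved, environ=None):
    """Return (.env.local text, keys the user still has to provide)."""
    environ = os.environ if environ is None else environ
    lines = []
    missing = []
    for key in REQUIRED_KEYS:
        # The secret is never pre-filled; every other key uses the resolved value.
        value = "" if key == "KBC_TOKEN" else resolved.get(key, "")
        lines.append(f"{key}={value}")
        # Named lookup of a known variable, not an indiscriminate scan.
        if not value and not environ.get(key):
            missing.append(key)
    return "\n".join(lines) + "\n", missing
```

With every MCP-resolvable value supplied, `missing` comes back as `["KBC_TOKEN"]`, which is the one thing the agent then asks for.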

… with Query Service Live test surfaced the wrong guidance: skill told users to "scope it minimally" with read access only to the needed buckets/tables. But the Query Service evaluates access at the workspace level: narrow-scoped tokens get rejected with auth errors regardless of whether the SQL touches data inside the scope. storage-access.md §KBC_TOKEN: rewrite the token-creation guidance. The token MUST be project-wide. Two options now spelled out: 1. User's master token, refreshed via UI or grabbed via the Keboola Dev Tools Chrome extension. 2. Dedicated Storage API token with Full Access to all buckets and components. templates/streamlit/.streamlit/secrets.toml.example: comment header updated to call out the token-scope requirement and link both acquisition options. Placeholder renamed from "your-storage-api-token" to "your-project-wide-storage-api-token" so the constraint is visible at the value site. troubleshooting.md: new entry "Query Service auth error with a narrow-scoped Storage API token" mirroring the live failure mode. Fix points at the two acquisition options.

… templates Ship a small, low-contrast attribution footer with the Keboola wordmark in every template (Streamlit, single-Node + static, combined Python+Node). Streamlit embeds the SVG as a base64 data URI to avoid needing the static-serving config flag; the JS templates reference `/keboola-logo.svg` served from each template's `public/` directory. Document the pattern in references/styling-guide.md so brand overrides have a clear extension point.

… Keboola footer Palette overrides don't touch the footer markup — without an explicit instruction, an agent applying customer branding would ship the app with both brands stacked. Spell out the removal steps for HTML and Streamlit templates.

…atch platform-injected env var The platform injects the workspace ID as `WORKSPACE_ID` (no prefix), per the official Storage Access docs and the manifest fallback contract. The skill was telling agents to use `KBC_WORKSPACE_ID`, which the runtime never sets — production lookups would silently fall back to the manifest file or fail. - Rename every occurrence across SKILL.md, references, and templates. - Drop the redundant `KBC_WORKSPACE_ID || WORKSPACE_ID` fallback chains in data_loader.py and keboola-client.js — there's only one name now. - Add a naming-note callout in storage-access.md §WORKSPACE_ID so anyone migrating from older code or earlier drafts of the skill sees the correction explicitly. - Tighten the troubleshooting entry for `WORKSPACE_<id>` values so the "env var name vs value prefix" distinction is unambiguous.

The note only existed to justify the previous KBC_WORKSPACE_ID mistake. A fresh reader doesn't need that context — the section heading and the rest of the prose already name the env var.

…t practices Cut redundancy across SKILL.md and references so each file pulls less context when the skill triggers. Net: -362 lines across the corpus, larger per-task savings (e.g. troubleshooting.md reads 118 lines instead of 262). - SKILL.md: hard rules #7–9 collapsed to one-line pointers; redundant "Reference index" table dropped (Decision tree already covers all 14 refs). - troubleshooting.md: every entry rewritten as symptom → cause → pointer; duplicate code blocks (POST handlers, nginx snippets, etc.) removed. - styling-guide.md split: default Keboola palette stays in styling-guide.md; bundled React+Vite+shadcn+ECharts stack moved to styling-react-bundled.md so RO dashboard tasks don't pull in 220 lines of CSS-variable tokens. - streamlit-apps.md Theming → points to styling-guide.md for palette/snippets. - deployment-paths.md Path A auth note → points to authentication.md. - python-js-apps.md Bootstrap hook + kai-integration.md Pre-built skills shortened to one-line forward references. - ToC blocks added to the seven references >200 lines per Anthropic's guidance for partial-read fidelity. - READMEs updated: "14 topical references" (was 12), styling line clarified.

Captures what the skill is currently working around: blocked Linear issues (AI-3219 / AI-3218 / PROF-114), Keboola MCP gaps that force fallbacks to kbagent or filesystem paths, kbagent's missing log command, deferred placeholder sections, the two Max suggestions not yet picked up, and live-test coverage gaps. Lives at the skill root so each item has an obvious owner.

linear · 2026-05-18T12:42:54Z

AI-3147

ottomansky · 2026-05-21T10:06:16Z

Hey @davidesner, summarizing my findings:
Overall this is really solid — I got a working data app running locally in ~15–20 minutes end-to-end, which is genuinely impressive for a first pass. I didn't push it into Keboola, just ran it locally, so this is purely feedback from the build path. A handful of rough edges worth flagging:

Stack selection (App type → Project step)

Would be nice if the list sorted by most-used stacks first. I started out unsure which one I needed.
"AWS US (default)" says connection.keboola.com, but the bundled MCP URL is mcp.us-east4.gcp.keboola.com — a GCP hostname under the AWS US option, which threw me off (is this AWS or GCP?). Worth either tweaking the labeling or adding a note that the MCP runtime lives on GCP regardless of the storage stack.

Bundled MCP URL is US-only

plugin.json hardcodes mcp.us-east4.gcp.keboola.com. I'm on GCP EU (europe-west3), so it returned nothing and I fell back to kbagent. Either make it stack-aware (derive from the user's project URL) or call it out in the README so non-US users know to expect the kbagent path.

storage-access.md assumes newer MCP fields

It says to call get_project_info and read workspace_id / branch_id, but on keboola-mcp-server v1.32.0 those fields aren't returned — only project_id, project_name, sql_dialect, etc. I had to detour through kbagent branch list + kbagent workspace list to find them. Either bump the minimum MCP version in the skill or document the fallback.

kbagent --json is fragile

Every kbagent call prints Updating keboola-mcp-server v1.32.0 -> v1.61.3 (via uv_tool)... to stdout, which breaks json.loads. I had to regex-strip the leading lines before every parse. The upgrade nag should go to stderr or sit behind a --quiet flag; otherwise scripting against --json is painful. -> I will create an issue in https://github.com/padak/keboola_agent_cli and try to open a PR to fix this, or discuss it with Padak.
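Until the notice moves off stdout, a tolerant parser is one possible workaround. The sketch below is not part of kbagent; the function name is hypothetical, and it assumes the JSON document starts at the beginning of a line, after any notice lines.

```python
import json


def parse_json_output(stdout):
    """Return the first JSON document that starts at the beginning of a line."""
    decoder = json.JSONDecoder()
    offset = 0
    for line in stdout.splitlines(keepends=True):
        stripped = line.lstrip()
        if stripped.startswith(("{", "[")):
            start = offset + (len(line) - len(stripped))
            try:
                # raw_decode parses one document starting at `start` and
                # ignores whatever follows it.
                value, _end = decoder.raw_decode(stdout, start)
            except json.JSONDecodeError:
                pass  # not JSON after all; keep scanning later lines
            else:
                return value
        offset += len(line)
    raise ValueError("no JSON document found in output")
```

Compared with a regex strip, this does not depend on the wording of the upgrade notice, only on where the JSON begins.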

kbagent project setup isn't mentioned

When I picked the kbagent path, kbagent project list showed 3 unrelated projects, not the one I was working on. The skill never mentioned I might need kbagent project add first. The path-selection step could say: "if you chose kbagent, confirm kbagent project list includes your project; if not, run kbagent project add first."

FQN fallback in api/queries.js

The template comment tells you to copy the FQN from mcp__keboola__get_table.fully_qualified_name, but on the older MCP that field doesn't exist (table-detail returned fully_qualified_name: None). I ended up constructing it manually as "<DATABASE>"."<BUCKET>"."<TABLE>" using the database name from kbagent workspace list. The comment should at least mention the manual construction rule as a fallback.
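The manual construction rule is small enough to sketch as a helper. The function names and the sample database, bucket, and table values are hypothetical; the quoting follows the "<DATABASE>"."<BUCKET>"."<TABLE>" shape given above, with embedded double quotes doubled as is standard for quoted SQL identifiers.

```python
def quote_ident(name):
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    if not name or "\x00" in name:
        raise ValueError(f"invalid identifier: {name!r}")
    return '"' + name.replace('"', '""') + '"'


def build_fqn(database, bucket, table):
    """Build "<DATABASE>"."<BUCKET>"."<TABLE>" from its three parts."""
    return ".".join(quote_ident(part) for part in (database, bucket, table))


# Hypothetical values; the database name would come from `kbagent workspace list`.
print(build_fqn("KEBOOLA_1234", "in.c-sales", "orders"))
```

Quoting each part separately matters because bucket IDs contain dots, which would otherwise be read as name separators.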

Playwright version pin

npx playwright install defaults to latest, but the MCP server pins playwright@1.57.0 (Chromium build v1200), so the first two screenshot attempts failed. Took an npm view @executeautomation/playwright-mcp-server dependencies to figure out the right version. Could either pin it in the setup instructions or auto-detect.

Token-in-chat ergonomics

One of the AskUserQuestion options for the token was effectively "paste it here" — and I did, which means it lives in the transcript even after I rotate. Would be cleaner if the agent always pointed at .env first and treated "paste in chat" as the discouraged escape hatch, with a stronger warning. 😬

kbagent project add --url default

Defaults to https://connection.keboola.com (legacy AWS US stack), which is wrong for most projects today. Should default to empty/required, or auto-detect from a provided project URL.

None of these blocked me from getting to a working app, but smoothing them out would make the first-run experience a lot tighter — especially for non-US-AWS users. Happy to help test a follow-up pass.

ottomansky · 2026-05-21T11:46:40Z

Also deployed into Keboola; overall it took 3-5 minutes from the time I asked. On the locally running app, CC pointed this out:
What this run added to the PR feedback bucket

Three meaningful gaps caught:

templates/nodejs-app/ is not deployable as shipped: validate-repo fails with 3 BLOCKING issues. Missing keboola-config/nginx/sites/default.conf, keboola-config/supervisord/services/app.conf, and pyproject.toml. The python-app and python-node-app templates have these; the Node template should too. I copied + adapted from python-app.
pyproject.toml is required even for pure-Node apps: validate-repo --type python-js enforces it. Either the validator should skip this check when no Python is detected, or the docs should call out the stub-pyproject pattern explicitly.
data-app password ergonomics with manage-token blocking: the security model (default-deny KBC_MANAGE_API_TOKEN) is good, but the failure message "No manage token available. Run interactively, or pass --allow-env-manage-token" surfaces only after the call, not as a hint when kbagent data-app create --auth password was chosen. Could nudge earlier: "you picked password auth; to fetch the password later, you'll need --allow-env-manage-token or the UI."

ottomansky · 2026-05-21T14:13:31Z

One more finding: kbagent currently doesn't have a way to look at terminal logs, so it can't debug apps that fail to start and ends up stuck in a loop. I'm already working on a PR for that. cc: @jordanrburger

davidesner added 30 commits May 13, 2026 14:30

docs(dataapp): add local-development sections per app type to spec

347208e

docs(dataapp): implementation plan for dataapp-development skill

3fd05af

feat(dataapp-development): add skill router (SKILL.md)

80263e5

fix(dataapp-development): use references/ prefix consistently in SKILL.md

4e5490d

feat(dataapp-development): add choosing-app-type reference

55db244

feat(dataapp-development): add streamlit-apps reference

33b2612

feat(dataapp-development): add python-js-apps reference

6107d32

feat(dataapp-development): add deployment-paths reference

a02c967

feat(dataapp-development): add storage-access reference

2c57aef

feat(dataapp-development): add authentication reference

27f005b

feat(dataapp-development): add duckdb-caching reference

ce577e6

feat(dataapp-development): add styling-guide reference

6efc634

feat(dataapp-development): add dashboard-patterns reference

ee888fe

feat(dataapp-development): add kai-integration reference

ee43088

feat(dataapp-development): add dev-workflow reference

d77b454

feat(dataapp-development): add troubleshooting reference

2a16ba4

feat(dataapp-development): add streamlit starter template

4213bde

feat(dataapp-development): add python-app (Flask) starter template

af09d01

feat(dataapp-development): add nodejs-app dashboarding starter template

e5c8475

feat(dataapp-development): add python-node-app dual-server template

2b60f0c

feat(dataapp-development): add duckdb-cache template

6b05fdb

chore(dataapp-developer): remove legacy dataapp-dev and dataapp-deployment skills

c603bfa

Their content has been consolidated into the new dataapp-development skill (SKILL.md + 12 references + 5 templates).

davidesner added 26 commits May 15, 2026 11:38

docs(dataapp-development): simplify sources file to plain list, attribute keboola-js-data-app to Fisa, drop private FS paths

6891cf3

docs(dataapp-development): credit Max Ottomansky + Miro Linear AI-3147 comments

e613283

docs(dataapp-development): drop the WORKSPACE_ID naming-note paragraph

db8f41c

The note only existed to justify the previous KBC_WORKSPACE_ID mistake. A fresh reader doesn't need that context — the section heading and the rest of the prose already name the env var.

davidesner wants to merge 71 commits into main from feat/dataapp-development-skill
