feat: add kai verify diagnostic and clean error rendering in JS data app #43

Open
ottomansky wants to merge 3 commits into keboola:main from ottomansky:fix/kai-verify-and-js-data-app-errors

@ottomansky
Contributor

Summary

  • Adds kai verify — a read-only diagnostic that confirms token identity, kai-assistant service discovery, reachability (ping + info), and current monthly message usage via the existing GET /api/usage endpoint. Renders upstream errors as code: message instead of raw JSON. Supports --json-output; exits non-zero on any failed check.
  • Fixes the JS data app example (examples/js-dataapp/) so it parses upstream KAI error bodies and returns a structured { error: { status, code, message } }; the frontend renders the message instead of a raw JSON blob.
  • Documents the env var name (STORAGE_API_TOKEN, not KAI_TOKEN) with .env.example comments and a troubleshooting README.md next to the JS example.
  • Bumps the package to 0.13.0.

Why

Reported in Slack: a user's JS data app failed with

Kai API error: 429 — {"code":"rate_limit:chat","message":"You have exceeded your maximum number of messages for this month. Please contact support to raise your limit or try again next month."}

while the same master token chatted fine in the Keboola platform UI (counter showed 19 / 150). Two problems:

  1. There was no way for the user to confirm which project/token/quota they were actually hitting: kai info is unauthenticated and kai history doesn't surface usage. Martin Vaško asked "did you run token verify?", but that command didn't exist. It does now.
  2. The JS example dumped the upstream JSON as a string into { error: text }, so the frontend rendered the literal JSON blob you see above.
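
To illustrate the second problem and its fix, here is a minimal Python sketch of the parsing idea. It is not the code from `examples/js-dataapp/server.js`; the function names `parse_upstream_error` and `render_error`, the raw-text fallback, and the exact rendered format are assumptions made for illustration. Only the envelope shape `{ error: { status, code, message } }` and the "Request failed" fallback string come from this PR.

```python
import json


def parse_upstream_error(status: int, body_text: str) -> dict:
    """Turn an upstream error body into {"error": {status, code, message}}.

    Hypothetical helper: mirrors the envelope shape described in this PR.
    """
    code = None
    message = None
    try:
        parsed = json.loads(body_text)
    except json.JSONDecodeError:
        parsed = None  # non-JSON body (e.g. an HTML error page)
    if isinstance(parsed, dict):
        code = parsed.get("code")
        message = parsed.get("message")
    # Fallback order (assumed): message, then code, then raw text, then a fixed string.
    error = {
        "status": status,
        "message": message or code or body_text.strip() or "Request failed",
    }
    if code:
        error["code"] = code
    return {"error": error}


def render_error(envelope: dict) -> str:
    """Render the envelope as `code: message` rather than a JSON blob."""
    err = envelope["error"]
    code, message = err.get("code"), err["message"]
    if code and message != code:
        return f"{code}: {message}"
    return f"HTTP {err['status']}: {message}"
```

With the 429 body quoted above, `render_error` would yield a line starting with `rate_limit:chat: ` followed by the human-readable message, which is what the frontend should show.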

Test plan

  • Unit tests: 5 new tests in tests/test_cli_verify.py covering the full-success path (matches the platform UI's 19/150 shape), the exact 429 rate_limit:chat body from the Slack screenshot, invalid-token 401, missing env vars, and --json-output shape. Full suite stays at 253 passing.
  • Live kai verify against https://connection.europe-west3.gcp.keboola.com:
    ✓ token: project 1143 (99_Playground_Max), token 'max.ottomansky@keboola.com' [master]
    ✓ service-discovery: kai-assistant at https://kai-assistant.europe-west3.gcp.keboola.com
    ✓ ping: server alive at 2026-05-26T11:38:04+00:00
    ✓ info: kai-assistant vproduction-… (server v3.2.0)
    ✓ usage: 2/150 messages used (resets 2026-06-01, 148 left)
    
  • Live kai verify with an invalid token renders HTTP 401 storage.tokenInvalid: Invalid access token and exits 1.
  • kai verify --json-output returns a clean machine-readable report (token / service-discovery / ping / info / usage blocks each with ok, summary, and structured fields).
  • examples/js-dataapp/server.js started with a bad token returns { "error": { "status": 401, "message": "Unauthorized" } } from POST /api/chat (instead of the previous raw text).
  • ruff format + ruff check clean.
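
For reviewers who want a mental model of the `--json-output` report and the exit code, the sketch below shows one way such a report could be assembled. It is an assumption-laden illustration, not the `_VerifyRecorder` from this PR: the class name `CheckRecorder`, its methods, and the top-level `ok` / `checks` keys are hypothetical. Only the per-check `ok` and `summary` fields and the non-zero exit on any failed check are taken from the description above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CheckRecorder:
    """Collects named check results (hypothetical stand-in for the real recorder)."""

    checks: dict = field(default_factory=dict)

    def record(self, name: str, ok: bool, summary: str, **fields) -> None:
        # Each check carries ok, summary, and any structured fields.
        self.checks[name] = {"ok": ok, "summary": summary, **fields}

    @property
    def ok(self) -> bool:
        # The run succeeds only if every recorded check succeeded.
        return all(check["ok"] for check in self.checks.values())

    def to_json(self) -> str:
        return json.dumps({"ok": self.ok, "checks": self.checks}, indent=2)

    def exit_code(self) -> int:
        return 0 if self.ok else 1


recorder = CheckRecorder()
recorder.record("token", True, "project 1143 [master]", project_id=1143)
recorder.record("usage", True, "2/150 messages used", used=2, limit=150, remaining=150 - 2)
print(recorder.to_json())
```

A single failed check flips `ok` to false and makes `exit_code()` return 1, which is the behaviour scripts calling the command would rely on.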

Out of scope

  • The platform UI's "X / 150" counter disagreeing with the API's 429 (Max's original symptom) is a server-side accounting question for the KAI backend team, not a client fix. kai verify now makes that mismatch visible to the user — that's the contribution here.
  • The branchId payload differences between the Python client and the JS example aren't the cause of the reported bug; left alone.

🤖 Generated with Claude Code

ottomansky and others added 2 commits May 26, 2026 13:35
…a app

When the Keboola platform UI works but a Kai-using app errors out, there was
no way to tell why without reading source. `kai verify` now reports token
identity (which project/owner the token resolves to), the auto-discovered
kai-assistant URL, ping/info reachability, and current monthly message usage
via the existing `GET /api/usage` endpoint. On a `429 rate_limit:chat` it
prints `code: message` cleanly instead of dumping raw JSON.

The JS data app example previously surfaced upstream errors as a raw JSON
blob (`Kai API error: 429 — {"code":"rate_limit:chat","message":"..."}`).
The proxy now parses the upstream body and returns a structured
`{error: {status, code, message}}`, and the frontend renders the message
text rather than the JSON envelope. Adds `.env.example` and a
troubleshooting README so the `STORAGE_API_TOKEN` (not `KAI_TOKEN`) env var
name is unambiguous.

Bumps the package to 0.13.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Extract `_verify()` into focused module-level helpers (`_check_token`,
  `_check_service_discovery`, `_check_reachability`, `_check_usage`,
  `_emit_verify_footer`) plus a `_VerifyRecorder` class. The orchestrator
  is now ~20 lines instead of 125.
- Replace `assert isinstance(parsed, dict)` with a proper runtime check
  that records the failure and exits non-zero (asserts are stripped
  under python -O).
- Rephrase the misleading comment on the usage error handler — the
  handler catches any `KaiError`, not specifically 429.
- JS app: fall back to `e.code` before "Request failed" when an upstream
  body has a code but no message.
- Revert two unrelated `ruff format` hunks in `send_and_display` that
  slipped into the previous commit.

Test coverage extended:
- scoped (non-master) token rendering
- Storage API connection error (`httpx.RequestError` branch)
- `--base-url` skips discovery (talks directly to the local URL)
- discovery failure when `kai-assistant` is absent from the services list

Full suite: 257 passing (253 + 4 new). Live `kai verify` against
europe-west3 still reports `2/150 messages used (148 left)` identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ottomansky
Contributor Author

Addressed self-review feedback in 86ca38e:

  • Extracted _verify() into module-level helpers (_VerifyRecorder, _check_token, _check_service_discovery, _check_reachability, _check_usage, _emit_verify_footer, _safe_json_body). Orchestrator is now ~20 lines.
  • Replaced assert isinstance(parsed, dict) with a proper runtime check that records the failure (assert is stripped under python -O).
  • Fixed misleading comment on the usage error handler — it catches any KaiError, not just 429.
  • JS frontend falls back to e.code before "Request failed" when an upstream body has {code} but no message.
  • Reverted two unrelated ruff format hunks in send_and_display that had slipped in.
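
For context on the `assert` change: `python -O` strips `assert` statements, so a type check written as an assertion silently disappears in optimized runs. A minimal sketch of the replacement pattern follows; the helper name `require_dict` and the `failures` list are hypothetical and stand in for however the real code records a failed check.

```python
def require_dict(parsed, failures: list) -> bool:
    """Runtime type check that still runs under `python -O`, unlike `assert`."""
    if not isinstance(parsed, dict):
        # Record the failure instead of raising, so the caller can exit non-zero.
        failures.append(f"expected a JSON object, got {type(parsed).__name__}")
        return False
    return True


failures = []
if not require_dict(["not", "a", "dict"], failures):
    print(failures[0])
```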

Test coverage extended from 5 → 9:

  • scoped (non-master) token rendering
  • Storage API connection error (httpx.RequestError branch)
  • --base-url bypassing /v2/storage discovery
  • discovery failure when kai-assistant is absent from the services list

Full suite: 257 passing (was 253). Live kai verify against europe-west3 still reports 2/150 messages used (148 left) — refactor is behaviour-preserving.

Note on uv.lock: the diff shows 0.11.0 → 0.13.0 rather than 0.12.0 → 0.13.0 because the prior 0.12.0 bump (#41) didn't regenerate the lock file. This PR happens to fix it as a side effect.

…e tests

- `_check_token`: guard `resp.json()` on the 2xx path against
  `JSONDecodeError` so a malformed Storage API response surfaces as a
  recorded check failure instead of crashing.
- `_check_reachability`: expanded docstring justifying why ping/info are
  recorded independently (partial-outage diagnostic visibility) rather
  than short-circuiting on ping failure.
- Two new tests:
  - `test_json_output_on_failure_path` — the JSON report is well-formed
    on a token-verify failure and includes only the checks that actually
    ran (no spurious entries for later phases).
  - `test_reachability_records_ping_and_info_independently` — when
    `/ping` returns 503 but `/api` and `/api/usage` succeed, all three
    checks are still recorded; verify exits non-zero overall because
    `ok=false` propagates.

Suite is now 259 passing (was 257).

Correction to the previous commit (86ca38e): its message claimed to
revert two `ruff format` hunks in `send_and_display`. That revert did
not stick — the subsequent `ruff format` pass re-applied them, so the
hunks remain in this PR's diff. Accepting that outcome: `ruff format`
is the project's declared formatter (`pyproject.toml:56`), so those
two lines are now the project style. The previous longer form was
stale because nobody had run the formatter recently. Apologies for
the inaccurate claim in the prior commit message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ottomansky
Contributor Author

Second pass of review feedback in 4451e86:

Code hardening

  • _check_token: guarded resp.json() on the 2xx path against JSONDecodeError. A malformed-but-200 Storage API body now surfaces as a recorded check failure instead of crashing — matches the symmetry the 4xx path already had via _safe_json_body.
  • _check_reachability: expanded docstring with the rationale for recording ping/info independently (partial-outage diagnostic visibility, at the cost of one extra HTTP request on a totally-down service).
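
The decode guard can be pictured roughly as below. This is a sketch using only the standard library, not the actual `_check_token` or `_safe_json_body` code; the name `safe_json` and the tuple return shape are assumptions.

```python
import json


def safe_json(text: str):
    """Return (data, None) on success, or (None, reason) for a malformed body."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # A 200 response with a broken body becomes a reportable failure, not a crash.
        return None, f"response is not valid JSON: {exc.msg}"


data, reason = safe_json("<html>gateway error</html>")
if data is None:
    print(reason)
```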

Coverage 9 → 11 tests

  • test_json_output_on_failure_path — JSON report stays well-formed on a token-verify failure; later phases must not appear in checks.
  • test_reachability_records_ping_and_info_independently — when /ping 503s but /api and /api/usage succeed, all three are recorded; overall ok=false.
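
The independent-recording behaviour that the second test pins down can be sketched as follows. The function `check_reachability` and the probe callables here are hypothetical and do not reproduce `_check_reachability`; the point is only that a failing probe is recorded and the loop continues rather than short-circuiting.

```python
def check_reachability(probes: dict) -> dict:
    """Run every probe, recording each result even if an earlier one failed."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = {"ok": True, "summary": probe()}
        except Exception as exc:  # a real client would catch its own error type
            results[name] = {"ok": False, "summary": str(exc)}
    return results


def ping():
    raise RuntimeError("HTTP 503")  # simulated partial outage


def info():
    return "server v3.2.0"


results = check_reachability({"ping": ping, "info": info})
overall_ok = all(r["ok"] for r in results.values())
```

Here `ping` fails, `info` is still recorded as passing, and `overall_ok` is false, which is the partial-outage visibility the docstring argues for.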

Suite: 259 passing (was 257).

Correction to commit 86ca38e: that commit's message claimed to "Revert two unrelated ruff format hunks in send_and_display". The revert didn't stick — the subsequent ruff format pass re-applied them, so the two single-line collapses (and the __init__.py trailing-blank-lines cleanup) remain in the PR diff. Accepting that outcome: ruff format is the project's declared formatter (pyproject.toml:56), so those lines are the project style now — the previous longer form was stale because the formatter hadn't been run in a while. The 4451e86 commit message documents this explicitly. Apologies for the inaccurate prior claim.

No behaviour change from this second pass — live kai verify against europe-west3 still reports 2/150 messages used.

@jordanrburger
Collaborator

@ottomansky - Fix tests pls, then I'll merge.
