From a6b78e833ee459585f14baf3c82b3d2592d82104 Mon Sep 17 00:00:00 2001
From: Josh Lambert
Date: Thu, 30 Apr 2026 00:39:02 -0400
Subject: [PATCH 1/3] docs: add per-message feedback proposal
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Initial high-level design proposal for thumbs up/down feedback on assistant messages. Covers the problem motivation, scope, payload differences between Kilo Gateway and direct providers, and open questions. Implementation details intentionally kept out — those belong in the PR diff.
---
 packages/kilo-docs/lib/nav/contributing.ts    |  4 +
 .../architecture/per-message-feedback.md      | 76 +++++++++++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md

diff --git a/packages/kilo-docs/lib/nav/contributing.ts b/packages/kilo-docs/lib/nav/contributing.ts
index 74e75c5ede6..14f4dae5e63 100644
--- a/packages/kilo-docs/lib/nav/contributing.ts
+++ b/packages/kilo-docs/lib/nav/contributing.ts
@@ -70,6 +70,10 @@ export const ContributingNav: NavSection[] = [
           href: "/contributing/architecture/voice-transcription",
           children: "Voice Transcription",
         },
+        {
+          href: "/contributing/architecture/per-message-feedback",
+          children: "Per-Message Feedback",
+        },
       ],
     },
   ],
diff --git a/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
new file mode 100644
index 00000000000..fadfc7d3efc
--- /dev/null
+++ b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
@@ -0,0 +1,76 @@
+---
+title: "Per-Message Feedback"
+description: "Thumbs up/down feedback on assistant messages sent to Kilo via telemetry"
+---
+
+# Per-Message Feedback (Thumbs Up / Down)
+
+## Problem
+
+We have no signal on which assistant responses are helpful and which aren't.
Without per-response feedback, we can't:
+
+- Correlate model or prompt changes to user-perceived quality
+- Identify specific bad responses in the Kilo Gateway logs for investigation
+- Detect patterns where certain providers, models, or prompt paths consistently underperform
+
+Aggregate metrics like session completion rate or token cost are too coarse to understand individual response quality. A lightweight thumbs-up/down on each message gives us the missing feedback loop.
+
+## Proposal
+
+Add a thumbs-up / thumbs-down widget next to the existing copy button on every assistant message. Ratings are sent to Kilo via the existing PostHog telemetry pipeline. The UI is hidden entirely when telemetry is disabled.
+
+### Scope
+
+| Surface | Approach |
+|---|---|
+| VS Code extension | Thumbs buttons inline next to the copy button |
+| TUI | Keybinds (`+` / `-`) on the last assistant message |
+| Desktop / `kilo web` | Out of scope for the first pass |
+
+### Telemetry Payload
+
+We deliberately collect fewer identifiers for non-Kilo providers, since those IDs can't be correlated to upstream data and add tracking surface without product benefit.
+
+**Direct providers (Anthropic, OpenAI, etc.):**
+`providerID`, `modelID`, `variant?`, `rating`, `previousRating?`
+
+**Kilo Gateway turns (`providerID` starts with `"kilo"`):**
+Same fields plus `sessionID`, `messageID`, and `parentMessageID` (= the `x-kilo-request` header the gateway already saw). This lets backend analysts join feedback against gateway logs to diagnose specific bad responses.
+
+Event name: `"Feedback Submitted"` — a single event string in both telemetry enum registries so PostHog sees one event regardless of source.
+
+### UX
+
+- **Toggleable**: click the same button again to clear, or click the opposite to switch. Each change fires a new event with `rating` and `previousRating`.
+- **In-memory state**: ratings are keyed by message ID and held in the webview/TUI session.
Persisting across reloads is deferred to a follow-up.
+- **Gated on telemetry**: if the user has VS Code telemetry disabled, the buttons don't render at all. The TUI keybinds are no-ops with a brief toast.
+
+### Architecture (high level)
+
+```
+[webview button / TUI keybind]
+  → existing telemetry proxy or Telemetry.track()
+  → POST /telemetry/capture (webview path)
+  → Telemetry.track("Feedback Submitted", {…})
+  → PostHog
+```
+
+No new server endpoints, no SDK regeneration, no PostHog-side changes. The `/telemetry/capture` route and both telemetry proxy paths already exist and accept arbitrary event names.
+
+### Kilo Gateway Detection
+
+The webview uses `providerID.startsWith("kilo")` to decide whether to include correlation IDs — this matches the outbound header gating in `packages/opencode/src/session/llm.ts`. The TUI can use the more precise `model.api.npm === "@kilocode/kilo-gateway"` check since it has access to the full provider resolution in-process.
+
+## What's Out of Scope
+
+- Free-text comments on thumbs-down
+- 1–5 scale or star rating
+- Persisting ratings across page reloads / session switches
+- Changing prior-message actions (copy + thumbs) to hover-only
+- Shared web UI / desktop surface
+
+## Open Questions
+
+- Should ratings persist on the `MessageV2.Assistant` schema so they survive reloads?
+- Confirm with the PostHog dashboard owner that the proposed event + property names fit existing conventions.
+- Whether to add free-text comments for thumbs-down in a follow-up.
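The toggle behaviour described under UX in the proposal could be sketched roughly as follows. This is only an illustration under stated assumptions: `toggleRating`, `RatingChange`, the `"none"` encoding for a cleared rating, and the `telemetryEnabled` flag are hypothetical names, not taken from the Kilo codebase or from this patch.

```typescript
// Hypothetical sketch; none of these names come from the Kilo codebase.
type Thumb = "up" | "down"
type Rating = Thumb | "none" // "none" is an assumed encoding for a cleared rating

interface RatingChange {
  rating: Rating
  previousRating?: Thumb
}

// In-memory only: keyed by message ID, lost on reload (as the proposal states).
const ratings = new Map<string, Thumb>()

function toggleRating(
  messageID: string,
  clicked: Thumb,
  telemetryEnabled: boolean,
): RatingChange | undefined {
  // Gated on telemetry: with telemetry off, the action is a no-op.
  if (!telemetryEnabled) return undefined

  const previous = ratings.get(messageID)
  if (previous === clicked) {
    // Clicking the same thumb again clears the rating.
    ratings.delete(messageID)
    return { rating: "none", previousRating: previous }
  }

  // First rating, or a switch to the opposite thumb.
  ratings.set(messageID, clicked)
  return previous === undefined
    ? { rating: clicked }
    : { rating: clicked, previousRating: previous }
}
```

Each returned `RatingChange` would correspond to one `"Feedback Submitted"` event, and an `undefined` result means nothing is sent.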
From bd34334dd8e4cc61daa112062409ba904d36aed0 Mon Sep 17 00:00:00 2001
From: Joshua Lambert <25085430+lambertjosh@users.noreply.github.com>
Date: Thu, 30 Apr 2026 16:24:50 -0400
Subject: [PATCH 2/3] Apply suggestions from code review

Co-authored-by: Joshua Lambert <25085430+lambertjosh@users.noreply.github.com>
---
 .../contributing/architecture/per-message-feedback.md      | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
index fadfc7d3efc..900a68363d7 100644
--- a/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
+++ b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
@@ -13,7 +13,7 @@ We have no signal on which assistant responses are helpful and which aren't. Wit
 - Identify specific bad responses in the Kilo Gateway logs for investigation
 - Detect patterns where certain providers, models, or prompt paths consistently underperform
 
-Aggregate metrics like session completion rate or token cost are too coarse to understand individual response quality. A lightweight thumbs-up/down on each message gives us the missing feedback loop.
+Aggregate metrics like session completion rate or token cost are too coarse to understand individual response quality. A lightweight thumbs-up/down on each message can help close the feedback loop.
 
 ## Proposal
 
@@ -25,13 +25,12 @@ Add a thumbs-up / thumbs-down widget next to the existing copy button on every a
 |---|---|
 | VS Code extension | Thumbs buttons inline next to the copy button |
 | TUI | Keybinds (`+` / `-`) on the last assistant message |
-| Desktop / `kilo web` | Out of scope for the first pass |
 
 ### Telemetry Payload
 
 We deliberately collect fewer identifiers for non-Kilo providers, since those IDs can't be correlated to upstream data and add tracking surface without product benefit.

-**Direct providers (Anthropic, OpenAI, etc.):**
+**Third party providers (Anthropic, OpenAI, local, etc.):**
 `providerID`, `modelID`, `variant?`, `rating`, `previousRating?`
 
 **Kilo Gateway turns (`providerID` starts with `"kilo"`):**
@@ -43,7 +42,7 @@ Event name: `"Feedback Submitted"` — a single event string in both telemetry e
 
 - **Toggleable**: click the same button again to clear, or click the opposite to switch. Each change fires a new event with `rating` and `previousRating`.
 - **In-memory state**: ratings are keyed by message ID and held in the webview/TUI session. Persisting across reloads is deferred to a follow-up.
-- **Gated on telemetry**: if the user has VS Code telemetry disabled, the buttons don't render at all. The TUI keybinds are no-ops with a brief toast.
+- **Gated on telemetry**: if the user has VS Code telemetry disabled, the buttons don't render at all. In the CLI, the keybinds are no-ops when telemetry is off.
 
 ### Architecture (high level)
 
From 702496318c8bcd04d220c2d0208b28679f5eaa9c Mon Sep 17 00:00:00 2001
From: Joshua Lambert <25085430+lambertjosh@users.noreply.github.com>
Date: Thu, 30 Apr 2026 16:25:42 -0400
Subject: [PATCH 3/3] Apply suggestion from @lambertjosh

---
 .../pages/contributing/architecture/per-message-feedback.md     | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
index 900a68363d7..b638dba13d2 100644
--- a/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
+++ b/packages/kilo-docs/pages/contributing/architecture/per-message-feedback.md
@@ -28,7 +28,7 @@ Add a thumbs-up / thumbs-down widget next to the existing copy button on every a
 
 ### Telemetry Payload
 
-We deliberately collect fewer identifiers for non-Kilo providers, since those IDs can't be correlated to upstream data and add tracking surface without product benefit.
+We deliberately collect fewer identifiers for non-Kilo providers, since those IDs can't be correlated to upstream data and add tracking surface without product benefit. Users who send requests to other providers rather than through the Kilo Gateway would also not expect or want Kilo to collect those identifiers.
 
 **Third party providers (Anthropic, OpenAI, local, etc.):**
 `providerID`, `modelID`, `variant?`, `rating`, `previousRating?`
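As a rough illustration of the payload split described in the proposal, the sketch below builds the event properties and only attaches correlation IDs for Kilo Gateway turns. The field names and the `providerID.startsWith("kilo")` check come from the proposal; the types, the `MessageContext` shape, the `"none"` value for a cleared rating, and the `buildFeedbackPayload` helper are assumptions made for this example, not the actual implementation.

```typescript
// Hypothetical types and helper; only the field names follow the proposal.
type Rating = "up" | "down" | "none" // "none" is an assumed encoding for a cleared rating

interface BaseFeedback {
  providerID: string
  modelID: string
  variant?: string
  rating: Rating
  previousRating?: Rating
}

interface GatewayFeedback extends BaseFeedback {
  sessionID: string
  messageID: string
  parentMessageID: string // equals the x-kilo-request header per the proposal
}

// Assumed shape of what the UI knows about the rated message.
interface MessageContext {
  providerID: string
  modelID: string
  variant?: string
  sessionID: string
  messageID: string
  parentMessageID: string
}

function buildFeedbackPayload(
  ctx: MessageContext,
  rating: Rating,
  previousRating?: Rating,
): BaseFeedback | GatewayFeedback {
  const base: BaseFeedback = { providerID: ctx.providerID, modelID: ctx.modelID, rating }
  if (ctx.variant !== undefined) base.variant = ctx.variant
  if (previousRating !== undefined) base.previousRating = previousRating

  // Third-party providers: stop here so no correlation IDs leave the client.
  if (!ctx.providerID.startsWith("kilo")) return base

  // Kilo Gateway turns: add the IDs analysts need to join against gateway logs.
  return {
    ...base,
    sessionID: ctx.sessionID,
    messageID: ctx.messageID,
    parentMessageID: ctx.parentMessageID,
  }
}
```

Building the minimal payload first and extending it only on the gateway branch means the third-party path cannot include session or message IDs by accident.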