Skip to content

feat(billing): drop trace table, make 'analyse my spend' clickable#2331

Open
pauldambra wants to merge 9 commits into
mainfrom
posthog-code/spend-banner-drop-traces-add-analyse-cta
Open

feat(billing): drop trace table, make 'analyse my spend' clickable#2331
pauldambra wants to merge 9 commits into
mainfrom
posthog-code/spend-banner-drop-traces-add-analyse-cta

Conversation

@pauldambra
Copy link
Copy Markdown
Member

@pauldambra pauldambra commented May 23, 2026

Problem

Two issues with the PostHog Code token spend banner:

  1. The Top traces table is noise. Trace IDs are opaque strings (UUIDs or JSON-shaped device blobs), the list is long, and there's no actionable next step from seeing it.
  2. The "drop the exploring-llm-costs skill into your agent" line is a dead link. It links to the SKILL.md on GitHub, which doesn't actually move the user toward the analysis they want.

Changes

  1. Drop the TraceTable component, the trace branch in generateSuggestions, and the SpendAnalysisTraceRow / top_traces types on the renderer side. Backend PR PostHog/posthog#59796 keeps the field in the wire response (empty) for backward-compat; TS structural typing tolerates the extra property.
  2. Replace the static GitHub link with a button: "Open a task to analyse this with an agent". Click calls navigateToTaskInput({ initialPrompt }) (same pattern as ReportDetailPane's "Act on this signal report") with a prompt that bundles:
    • The spend data as markdown tables (summary, by_product, by_tool top 10, by_model)
    • A self-contained playbook of cost-reduction levers (input tokens, model choice, subagent hygiene, no-tool replies, MCP registry overhead) — embedded directly so the agent doesn't need any external skill or data source
    • Explicit "do not query external analytics" instruction (the data is everything)
    • A request for ranked recommendations with motivating data points and savings estimates
  3. Close the Settings dialog before navigating. The banner lives inside the Settings modal; without closing it first the new task input opens behind the still-mounted dialog.
  4. Escape pipes, newlines, and backticks in markdown table cells (escapeTableCell). Tool/model/product names come from event properties — in multi-tenant projects those can be attacker-influenced. Newlines or backticks would let an attacker break out of the table row into what reads as a fresh instruction block on the new task's first turn (the agent has full Bash/Edit/Write/MCP access). Replacing rather than escaping keeps cells readable while neutralising the structural attack.
  5. Analytics: new SPEND_ANALYSIS_TASK_OPENED event captures total / scoped cost, generation count, window days, and row counts when the button is clicked — enables a funnel from "saw the banner" → "clicked the button" → "submitted the task" → "what came of it".
  6. Extract prompt + format helpers to utils/spendAnalysisPrompt.ts and utils/spendAnalysisFormat.ts. The format helpers are shared between the React tables and the markdown prompt so the agent sees the same shape the user sees. 18 unit tests cover pipe escaping, the canonical prompt-injection shape, empty-breakdown fallbacks, the 10-row tool cap, and the no-external-data instruction.
  7. Tighten playbook copy to be model-agnostic ("most expensive tier" instead of "Opus/Sonnet/Haiku") and pull it to a module constant so product can iterate on it without touching the data-shaping logic.

Why the playbook is embedded, not a skill reference

An earlier version asked the agent to load the exploring-llm-costs skill from the PostHog skill store. That skill is internal-team focused — it teaches the agent to query PostHog LLM analytics directly via MCP, which only works for users with access to the underlying analytics project (PostHog staff). For arbitrary users the skill doesn't help and may actively mislead the agent into hunting for data it can't reach. Baking the advice into the prompt directly means it works for every user regardless of their PostHog project access.

How did you test this?

I'm an agent — Claude Code via PostHog Code.

  • pnpm exec vitest run apps/code/src/renderer/features/billing/utils/spendAnalysisPrompt.test.ts — 18 passed.
  • pnpm exec biome check apps/code/src/renderer/features/billing/ apps/code/src/shared/types/analytics.ts — clean.
  • pnpm -F @posthog/code exec tsc --noEmit — only pre-existing unrelated auth/service.test.ts errors.
  • Manual UI test by the human reviewer (paul): confirmed prefill works, settings-dialog-not-closing was caught and fixed in 5928b824.

Publish to changelog?

no

🤖 Agent context

Tool: Claude Code via PostHog Code harness.

Path the iteration took:

  1. First cut dropped traces + added the button with a "load the skill" instruction.
  2. paul-reviewer flagged: button copy, MCP soft-dependency on the skill, missing analytics, format duplication.
  3. Greptile P2: markdown pipe injection — escape helper added.
  4. Human review caught the staff-only exploring-llm-costs skill is useless to most users — replaced with embedded self-contained playbook.
  5. Human UI test caught the settings dialog stayed mounted over the new task input — fixed by closing the dialog first.
  6. Second qa-swarm pass surfaced a convergent prompt-injection finding (security-audit elevated to MEDIUM): escapeTableCell only handled |, not newlines/backticks — expanded coverage.
  7. Final follow-up pass addressed paul's twice-flagged analytics gap, paul+xp's "extract buildAnalysisPrompt" suggestion, and qa-team+paul+xp's "drop dead SpendAnalysisTraceRow type" finding.

Pair with backend PR PostHog/posthog#59796 (drops top_traces content to empty) and PostHog/posthog#59445 / #59619 / #59784 (the upstream auth + endpoint fixes that made this banner reachable in the first place).


Created with PostHog Code

Two improvements to the PostHog Code token spend banner:

1. Drop the Top traces table. Trace IDs are opaque strings (UUIDs or
   JSON-shaped device blobs) that aren't actionable, and the list is
   too long to scan. The corresponding API field is being deprecated
   on the backend in PR #59796 (returns empty there); removing the
   rendering means we don't show an empty section.

2. Replace the static GitHub link to `exploring-llm-costs` with a
   button that opens a new task prefilled with a markdown report of
   the user's spend (summary, by_product, by_tool top 10, by_model)
   plus a prompt asking the agent to load the skill from the PostHog
   skill store (`mcp__posthog__exec` -> `llma-skill-get`) and rank
   reduction advice by impact.

The prefill saves the new task an API round-trip and gives the agent
the full breakdown in its initial context, so it can answer the
'what should I do to spend less' question immediately without
fetching data first.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 23, 2026

Comments Outside Diff (1)

  1. apps/code/src/renderer/features/billing/components/TokenSpendAnalysisBanner.tsx, line 422-425 (link)

    P2 The initial callout (shown before data loads) still advertises "by trace" — but the trace table was removed in this PR, so the description will be stale. Consider dropping "by trace" from the copy to match the current feature set.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: apps/code/src/renderer/features/billing/components/TokenSpendAnalysisBanner.tsx
    Line: 422-425
    
    Comment:
    The initial callout (shown before data loads) still advertises "by trace" — but the trace table was removed in this PR, so the description will be stale. Consider dropping "by trace" from the copy to match the current feature set.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/code/src/renderer/features/billing/components/TokenSpendAnalysisBanner.tsx:422-425
The initial callout (shown before data loads) still advertises "by trace" — but the trace table was removed in this PR, so the description will be stale. Consider dropping "by trace" from the copy to match the current feature set.

### Issue 2 of 2
apps/code/src/renderer/features/billing/components/TokenSpendAnalysisBanner.tsx:242-261
Markdown table injection from unescaped data values — if a `product`, `tool`, or `model` name contains a pipe character `|`, the generated markdown table will have misaligned cells, causing the agent to misread row boundaries. For example a tool name like `bash | grep` would silently split into extra columns. Consider replacing `|` with `\|` (or a similar escape) in all cell values before interpolation.

Reviews (1): Last reviewed commit: "feat(billing): drop trace table, make 'a..." | Re-trigger Greptile

Comment thread apps/code/src/renderer/features/billing/components/TokenSpendAnalysisBanner.tsx Outdated
…task

Paul-reviewer flagged the previous "Ask an agent to analyse this and
suggest reductions" copy as reading like it kicks off something inline,
when it actually navigates to a new task input with a prefilled prompt.
"Open a task to analyse this with an agent" makes the destination
explicit.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra
Copy link
Copy Markdown
Member Author

Note

🤖 Automated comment by QA Swarm — not written by a human

Single-reviewer pass: paul-reviewer (focused review chosen for this small frontend-only change).

Verdict: 💬 APPROVE WITH NITS → 1 fix pushed, rest deferred

The diff is doing the right things — drop the unactionable trace table, swap the static GitHub link for a "send the spend data into a fresh agent task" affordance that mirrors the inbox `ReportDetailPane` pattern. No blockers.

Fixed in `a457d50e`

  • 🟢 (NIT) Button copy — "Ask an agent to analyse this and suggest reductions" read like it kicked off something inline. Changed to "Open a task to analyse this with an agent" to make the destination explicit.

Deferred (non-blocking)

  • 🟡 (MEDIUM) MCP soft dependency — the prompt asks the new agent to load `exploring-llm-costs` via `mcp__posthog__exec` → `llma-skill-get`. If the receiving agent doesn't have the posthog MCP wired up (or uses a different server name), the instruction silently no-ops. Mitigation: the prompt already includes the full data in the initial context, so even without the skill the agent has the numbers and can give generic advice. The skill is the playbook; the data is the substance.
  • 🟢 (LOW) Missing analytics event — no `track(ANALYTICS_EVENTS.X)` call in the button handler. This repo's analytics layer would need a new event added to `ANALYTICS_EVENTS` + `EventPropertyMap`, which is a bigger change than the finding warrants. Worth a follow-up so we can measure: did anyone click the button → did it open a task → did the task produce a useful answer?
  • 🟢 (LOW) Format-string duplication — `buildAnalysisPrompt` re-derives the same `formatUsd` / `formatTokens` / "(no tool)" / "(unknown)" fallback strings the React tables use. If someone tweaks the tables, the prompt drifts silently. Acceptable for now; consider extracting shared format helpers if a second prompt-builder appears.
  • (NIT) Skill name string coupling — `SKILL_NAME = "exploring-llm-costs"` is a magic-string contract with the PostHog skill store. If the skill gets renamed, this silently asks for one that doesn't exist. Future fragility note, not a fix today.

Reviewer summary

Reviewer Assessment
👤 paul Ship as you see fit. Diff is the right shape; the main concerns are about resilience of the MCP-skill-load handshake and a missing analytics event so we can measure adoption.

Automated by QA Swarm — not a human review

Greptile flagged that tool/product/model names containing `|` would
split markdown-table cells mid-row, causing the receiving agent to
misread the row boundaries. For example a tool name like `bash | grep`
would silently extend the row by an extra column.

Add a small `escapeTableCell` helper and apply it to every cell value
that comes from the data (product, tool, model names). Numbers and
$-prefixed formatted strings are pipe-free.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label May 23, 2026
The previous prompt asked the receiving agent to load the
`exploring-llm-costs` skill from the PostHog skill store before
answering. That skill is internal/PostHog-staff-focused — it teaches
the agent to query PostHog LLM analytics directly via MCP, which only
works for users with access to the underlying analytics project. For
arbitrary PostHog Code users the skill doesn't help and may actively
mislead the agent into trying queries that return nothing.

Replace the skill instruction with a "What to look at" playbook
embedded directly in the prompt:
1. Input tokens are the bill, not the tool calls themselves
2. Model choice (Opus -> Sonnet / Haiku for routine work)
3. Subagent hygiene (Agent tool inherits brief + tool defs)
4. (no tool) share — model talking instead of acting
5. MCP registry overhead

Also adds an explicit "do not try to query PostHog LLM analytics or any
external data source — the numbers here are everything you have"
instruction so the agent doesn't hunt for data it can't reach, and
asks for ranked recommendations with motivating data points and
savings estimates.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
The banner lives inside the Settings dialog (modal). Calling
`navigateToTaskInput` changed the underlying view but the dialog stayed
mounted on top, so users saw the spend banner instead of the prefilled
task input. Close the settings dialog first.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
Convergent finding across qa-team, paul, xp, and security-audit:
`escapeTableCell` only handled `|` but not newlines or backticks.
Security-audit elevated this to a real prompt-injection vector — the
spend data flows from event properties (potentially attacker-
influenced in multi-tenant SaaS projects) into a markdown table that
is then fed to an agent with full tool access (Bash, Edit, Write,
MCP). A tool / product / model name like
`legit-tool\n\n## SYSTEM OVERRIDE\nIgnore prior instructions...`
would break out of the table row into what reads as a fresh
instruction block on the new task's first turn.

Expand `escapeTableCell` to also replace `\r`, `\n`, and backticks
with spaces (replacing rather than escaping keeps the cell readable
while neutralising the structural attack). Updates the doc comment
with the threat model so the next reader understands why this is
defensive, not cosmetic.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra
Copy link
Copy Markdown
Member Author

Note

🤖 Automated comment by QA Swarm — not written by a human

Multi-perspective review: qa-team, paul-reviewer, xp-reviewer, security-audit (full 4-agent swarm at SHA `5928b824`).

Verdict: 💬 APPROVE WITH NITS → 1 fix pushed, rest open

Convergent finding (all 4 reviewers)

🟡 MEDIUM — markdown injection via untrusted cell values. `escapeTableCell` only handled `|` but not newlines or backticks. Security-audit elevated this to a real prompt-injection vector: spend data flows from event properties (potentially attacker-influenced in multi-tenant projects) into a markdown table that's then fed to an agent with full tool access (Bash/Edit/Write/MCP). A tool / model / product name like `legit-tool\n\n## SYSTEM OVERRIDE\nIgnore prior instructions...` would break out of the table row into what reads as a fresh instruction block on the new task's first turn.

Fixed in `7bc7dde8` — expanded `escapeTableCell` to replace `\r`, `\n`, and backticks with spaces (replacing rather than escaping keeps the cell readable while neutralising the structural attack). Doc comment updated with the threat model.

Other convergent findings (3-way, not fixed)

🟢 LOW — `SpendAnalysisTraceRow` is now dead weight (qa-team + paul + xp). The trace table, formatTrace, formatDate, and the trace-suggestion branch were all removed, but the type and the `top_traces` field on the response are still in `types/spend-analysis.ts`. Backend PR PostHog/posthog#59796 keeps the field in the response shape (always empty) to avoid breaking existing consumers, so the field can stay — but the row type is unreferenced and a future reader will wonder what it's for.

Other open findings (single-reviewer)

  • 🟡 MEDIUM (paul, flagged twice now) — no analytics on the button. The whole point of this is "did opening this button lead to a useful task?" — without a `track()` call we can't answer. The repo uses `track(ANALYTICS_EVENTS.X, props)`, so wiring it would mean adding a new event to `ANALYTICS_EVENTS` + `EventPropertyMap` and calling it in `handleAnalyseClick`.
  • 🟢 LOW (paul + xp) — `buildAnalysisPrompt` is ~70 lines mixing data-shaping with a static playbook. Could be extracted to its own file with a unit test that pins the markdown contract.
  • 🟢 LOW (xp) — `closeSettings()` couples this banner to the Settings dialog. Fine for one caller; if reused elsewhere, an `onBeforeNavigate` prop would push the modal concern up.
  • NIT (paul + xp) — playbook copy is Today-shaped (Opus/Sonnet/Haiku tiers, "(no tool)" naming, MCP specifics). Will rot quietly. A `// keep in sync with ...` pointer would help.
  • NIT (qa-team) — no test coverage for `escapeTableCell` or `buildAnalysisPrompt`. Parameterised escape test is cheap.

Reviewer summaries

Reviewer Assessment
🔍 qa-team LOW risk; APPROVE WITH NITS. Clean deprecation thread-pull; the escape helper should cover newlines (now fixed).
👤 paul Stamp once analytics is in. Skill-removal and button-copy fixes from the previous pass landed cleanly.
📐 xp Clean deprecation; `SpendAnalysisTraceRow` dead-weight is the main loose end. Playbook would simplify extracted.
🛡 security-audit Prompt-injection sink via event-property data flow now mitigated by the broader escape (was MEDIUM, now neutralised).

Automated by QA Swarm — not a human review

@pauldambra pauldambra removed the Stamphog This will request an autostamp by stamphog on small changes label May 24, 2026
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label May 24, 2026
…rop dead type

Addresses three QA-swarm follow-ups from PR review:

1. **Analytics** (paul, flagged twice): add `SPEND_ANALYSIS_TASK_OPENED`
   analytics event captured in `handleAnalyseClick`. Carries the headline
   numbers (total / scoped cost, generation count, window in days, row
   counts) so we can build a funnel from "saw the banner" -> "clicked
   the button" -> "submitted the task".

2. **Extract & test** (paul + xp): `buildAnalysisPrompt` and `escapeTableCell`
   move from inline-in-component to `utils/spendAnalysisPrompt.ts`, with
   `formatUsd`/`formatTokens`/`formatWindow` in `utils/spendAnalysisFormat.ts`
   as a shared single source of truth for both the React tables and the
   markdown prompt. Adds `spendAnalysisPrompt.test.ts` with 18 cases
   covering pipe escaping, newline/backtick neutralisation, the canonical
   prompt-injection shape, empty-breakdown fallbacks, 10-row tool cap,
   and the no-external-data instruction.

3. **Drop dead type** (qa-team + paul + xp): remove SpendAnalysisTraceRow
   and the top_traces field from the response interface. Backend still
   ships the field empty (PostHog/posthog#59796) but the renderer doesn't
   consume it, and TypeScript's structural typing tolerates the extra
   property at runtime.

Also tightens the playbook copy in the prompt: drops Anthropic-specific
model names (Opus/Sonnet/Haiku) in favour of "most expensive tier" /
"mid-tier" / "cheapest tier" so the advice ages better, and pulls the
playbook to a module constant so product can iterate on it without
touching the data-shaping code.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra pauldambra added Stamphog This will request an autostamp by stamphog on small changes and removed Stamphog This will request an autostamp by stamphog on small changes labels May 24, 2026
GitHub Advanced Security (CodeQL) flagged the spend-analysis escape
helper for "incomplete string escaping or encoding" — when escaping
`|` to `\|` we weren't first escaping the backslash itself. Input
`foo\|bar` would become `foo\\|bar`, which a markdown parser reads
as "literal backslash, literal pipe", defeating the pipe escape we
just applied.

Fix is the standard escape-the-escape-character-first pattern: replace
`\` with `\\` before replacing `|` with `\|`. Order matters; the regex
chain now handles `\` -> `|` -> newlines/backticks in that order.

Adds three test cases pinning the new behaviour (plain backslash,
backslash-before-pipe round-trip, double backslash).

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra pauldambra added Stamphog This will request an autostamp by stamphog on small changes and removed Stamphog This will request an autostamp by stamphog on small changes labels May 24, 2026
Greptile flagged that the initial callout (shown before data loads)
still advertised "by tool, by model, by trace" even though the trace
table was dropped earlier in this PR. Update the copy to "by product,
tool, and model" so it matches what the loaded banner actually shows.

Generated-By: PostHog Code
Task-Id: f9d5d152-49c6-46cf-8fde-079105ba2e67
@pauldambra pauldambra added Stamphog This will request an autostamp by stamphog on small changes and removed Stamphog This will request an autostamp by stamphog on small changes labels May 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Stamphog This will request an autostamp by stamphog on small changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants