feat(emails): add purchase transactional emails #3044

Open
evanjacobson wants to merge 23 commits into main from feat/transactional-emails_kiloclaw_topup

@evanjacobson evanjacobson commented May 5, 2026

Summary

KiloClaw and credit purchases now send the transactional confirmations users expect after money or credits move, closing a gap where successful top-ups and subscription activations were only reflected in-app.

This adds Mailgun-backed transactional emails for credit top-ups and Stripe-based KiloClaw subscription starts, with durable send markers so webhook retries can recover missed sends without double-emailing users. The implementation follows the existing server-rendered email template conventions and extends the current KiloClaw email-log pattern for activation-period dedupe.

Credit-based KiloClaw subscription starts will follow in a separate PR.

Implementation notes
  • Adds reusable transactional email rendering/sending helpers for the new purchase emails.
  • Adds a generic transactional_email_log table for idempotent non-KiloClaw purchase email markers.
  • Extends kiloclaw_email_log with period_start so subscription-started emails dedupe per activation period (in practice, either trial → paid or canceled → paid) rather than forever per instance.
  • Handles duplicate webhook/retry paths by using persisted purchase/change-log state to send once when the first attempt completed the business transaction but missed the email side effect.
  • Leaves terminal address failures deduped, while clearing provider-not-configured markers so retries can attempt delivery after configuration is fixed.

Verification

  • Not manually verified; this change is primarily webhook/email side-effect behavior and was covered with automated tests rather than a browser flow.

Visual Changes

N/A - no in-app UI changes. This adds new transactional email templates.

Reviewer Notes

Suggested review focus
  • Idempotency behavior around webhook retries and duplicate settlements.
  • The transactional_email_log and kiloclaw_email_log.period_start schema changes.
  • Provider failure handling: retryable provider_not_configured versus terminal invalid-recipient outcomes.
  • Copy and links in the new transactional email templates.

kilo-code-bot Bot and others added 23 commits May 5, 2026 08:41
Send exactly-once transactional emails for two purchase events:
- Credit top-up (manual Checkout or auto-top-up), hooked into
  processTopUp's idempotent post-processing block and gated by the
  existing unique constraint on credit_transactions.stripe_payment_id.
- KiloClaw subscription started (first paid billing period only),
  hooked into applyStripeFundedKiloClawPeriod and gated by an
  insert-before-send into kiloclaw_email_log so renewals are skipped.
…eriod

Existing paid subscribers have no kiloclaw_email_log row for the new
email_type, so their first renewal after deploy would insert a marker
and send a misleading 'subscription is active' email.

Gate the send on isFirstPaidPeriod — true only when the subscription
was previously in 'trialing' status (trial → paid transition). Renewals
and reactivations have before.status === 'active' and are skipped.
… settlement

Adds best-effort duplicate-settlement email recovery for the KiloClaw
subscription-started email when a webhook replay hits the duplicate-credit
path. Eligibility is derived from a durable subscription change-log entry
proving a prior paid activation for the same subscription, plan, and
period, with a 31-day window guard.

Also broadens activation eligibility to cover canceled rows (including
canceled paid rows that resubscribe), and strengthens tests to exercise
the production settlement path end-to-end with realistic kiloclaw_email_log
idempotency shapes.
… identity

Replaces the created_at ordering used in didStripeSubscriptionCreatedRecordEligibleActivation with a direct plan + period_start + period_end identity match against the stripe_subscription_created row's after_state. created_at defaults to now() (transaction-start), which under concurrent webhook transactions can reorder relative to commit chronology and let a stale activation log re-fire the email on a later renewal when the email-log marker is absent (e.g. after the intentional rollback in maybeSendKiloClawSubscriptionStartedEmail error path).

The handler in stripe-handlers.ts already stamps the Stripe-derived plan and period boundaries onto both the subscription row and the change-log after_state, so an activation log can only match the specific period it activated. A later renewal covers a different period and cannot match. Mirrors the existing identity-match approach in didPriorSettlementRecordPaidActivation.

Tests updated to stamp period boundaries when simulating handleKiloClawSubscriptionCreated and to seed the prior-activation renewal case with a different period than the current settlement.
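
The identity match described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `ActivationState` shape, the `matchesActivationIdentity` name, and the `"standard"` plan value are assumptions made for the example. It only shows why a renewal, which covers a different period, cannot match an earlier activation log.

```typescript
// Hypothetical shape of the plan and period fields stamped onto a
// change-log after_state (field names are assumptions for this sketch).
interface ActivationState {
  plan: string;
  periodStart: string; // ISO-8601 timestamp
  periodEnd: string; // ISO-8601 timestamp
}

// True only when the logged activation describes exactly the period being
// settled. Timestamps are compared as instants, not as raw strings.
function matchesActivationIdentity(
  logged: ActivationState,
  current: ActivationState,
): boolean {
  return (
    logged.plan === current.plan &&
    Date.parse(logged.periodStart) === Date.parse(current.periodStart) &&
    Date.parse(logged.periodEnd) === Date.parse(current.periodEnd)
  );
}

const activation: ActivationState = {
  plan: "standard", // hypothetical plan name
  periodStart: "2026-05-01T00:00:00Z",
  periodEnd: "2026-06-01T00:00:00Z",
};

// A renewal keeps the plan but moves both period boundaries forward.
const renewal: ActivationState = {
  plan: "standard",
  periodStart: "2026-06-01T00:00:00Z",
  periodEnd: "2026-07-01T00:00:00Z",
};

console.log(matchesActivationIdentity(activation, activation)); // true
console.log(matchesActivationIdentity(activation, renewal)); // false
```

Unlike a `created_at` ordering, this comparison does not depend on when either transaction started or committed.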
…subscription_id

The 'stale duplicate recovery guard' test backdated every matching period_advanced / stripe_invoice_settlement row without scoping to the test's subscription, mutating shared DB state from earlier tests in the file and potentially corrupting later assertions.

seedSubscription already returns the subscription row; destructure it and add eq(kiloclaw_subscription_change_log.subscription_id, subscription.id) to the WHERE clause alongside the existing action / reason predicates.
… period

Replaces the per-instance-lifetime idempotency key on `kiloclaw_email_log`
with a per-activation key so that users who cancel and resubscribe on the
same KiloClaw instance actually receive a second subscription-started email.

- `packages/db/src/schema.ts` + migration `0106_noisy_pete_wisdom.sql`:
  add `period_start timestamptz NOT NULL DEFAULT 'epoch'` to
  `kiloclaw_email_log`; drop `UQ_kiloclaw_email_log_user_instance_type`;
  add `UQ_kiloclaw_email_log_user_instance_type_period` on
  `(user_id, instance_id, email_type, period_start)` WHERE
  `instance_id IS NOT NULL`.
- `apps/web/src/lib/kiloclaw/credit-billing.ts`: the insert-before-send
  in `maybeSendKiloClawSubscriptionStartedEmail` now writes `period_start`,
  and the delete-on-error branch now scopes by `period_start` so a failed
  send only clears its own marker.
- `apps/web/src/lib/purchase-emails.test.ts`: rewrote the unique-index
  unit test for the new shape, added a sibling test proving the index
  admits a second row for a new period, and added an end-to-end
  regression test that activates, cancels, and resubscribes on the same
  instance and asserts two sends and two log rows.

The KiloClaw transactional email work introduced a subscription-started
email gated on `kiloclaw_email_log` as the durable idempotency surface.
The existing unique index on `(user_id, instance_id, email_type)` was
designed for one-per-instance-lifetime emails (e.g. `claw_instance_ready`)
and is wrong for activation-event emails: after the first activation
wrote a row, every future resubscribe on the same instance would conflict
on the index, `onConflictDoNothing` would return `rowCount=0`, and the
function would exit early without sending. Users who cancel and rejoin
— a normal lifecycle — silently lost their activation email.

The adversarial review captured this as cloud-ib7 with the explicit
instruction to pick one of two product semantics and make the code
consistent with the test expectation. We picked "one email per activation
event" because (a) the existing test at purchase-emails.test.ts:440
already asserts a canceled-paid resubscribe should send, and (b) removing
canceled-paid rows from eligibility would deprive returning customers of
a confirmation they reasonably expect.

**Why a `period_start` column and not a `subscription_id` column.**
Resubscribe (both Stripe checkout and credit enrollment) UPDATEs the
existing `kiloclaw_subscriptions` row in place rather than inserting a
new one — Stripe path at `stripe-handlers.ts:870+` (allowed when
`existingRow.status === 'canceled'`), credit path at
`credit-billing.ts:1147` via `onConflictDoUpdate` on `instance_id`. Our
internal `kiloclaw_subscriptions.id` is therefore stable across every
activation on the instance and adds no discriminative power as a dedupe
key. `stripe_subscription_id` is NULL for pure-credit subscriptions
(`enrollWithCredits` explicitly writes `stripe_subscription_id: null`),
so it cannot serve as the key either without special-casing. What
actually differs across activations on both paths is the period
boundary: Stripe stamps fresh `current_period_start` from the invoice
line item; credits stamp `nowIso` on every enrollment. One column
handles both paths.

**Why `NOT NULL DEFAULT 'epoch'` instead of nullable.** Postgres treats
`NULL` as distinct in unique indexes by default, which would let any
other email type that omits `period_start` insert multiple rows and
break the existing one-per-instance-lifetime contract for
`claw_instance_ready`, `claw_suspended_*`, and friends. Drizzle's
`nullsNotDistinct()` is only available on `unique()` constraints, which
do not support partial `WHERE`. Defaulting to `'epoch'` lets every
existing writer keep working unchanged — they all collapse onto the
same `(user, instance, type, 'epoch')` index row — while only the
subscription-started email path opts in to per-activation keying by
explicitly writing `periodStart`.
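
To make the effect of the `'epoch'` default concrete, here is a small in-memory model of the new unique key. It is a sketch under stated assumptions: `EmailLogIndex` and its `insert` method are hypothetical stand-ins for the partial unique index plus `onConflictDoNothing`, not the Drizzle schema from this PR.

```typescript
const EPOCH = "1970-01-01T00:00:00.000Z";

interface EmailLogRow {
  userId: string;
  instanceId: string;
  emailType: string;
  periodStart?: string; // omitted by legacy writers, so the default applies
}

// Hypothetical in-memory model of a unique index on
// (user_id, instance_id, email_type, period_start).
class EmailLogIndex {
  private readonly keys = new Set<string>();

  // Mimics INSERT ... ON CONFLICT DO NOTHING: true means a row was inserted.
  insert(row: EmailLogRow): boolean {
    // Normalize to an instant so equal timestamps compare equal.
    const periodStart = new Date(row.periodStart ?? EPOCH).toISOString();
    const key = JSON.stringify([
      row.userId,
      row.instanceId,
      row.emailType,
      periodStart,
    ]);
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

const emailLog = new EmailLogIndex();
const owner = { userId: "u1", instanceId: "i1" };

// Legacy writers never set periodStart, so they stay one-per-instance-lifetime.
console.log(emailLog.insert({ ...owner, emailType: "claw_instance_ready" })); // true
console.log(emailLog.insert({ ...owner, emailType: "claw_instance_ready" })); // false

// The subscription-started path opts in to per-activation keying.
const started = { ...owner, emailType: "kiloclaw_subscription_started" };
console.log(emailLog.insert({ ...started, periodStart: "2026-05-01T00:00:00Z" })); // true
console.log(emailLog.insert({ ...started, periodStart: "2026-05-01T00:00:00Z" })); // false
console.log(emailLog.insert({ ...started, periodStart: "2026-07-01T00:00:00Z" })); // true
```

With a nullable column instead, the two legacy inserts would both succeed, because a `NULL` period would never conflict with another `NULL`.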

**Why not a synthetic `dedupe_key text`.** A natural timestamp column
is queryable, self-documenting, and makes admin tooling easier
("show me all activation emails for period X"). A synthetic string key
forces every reader to parse it.

**Why the delete-on-error also got tightened.** The previous delete
cleared every row for `(user, instance, type)`, which was fine when
only one row could exist. With per-activation keying it would be a
foot-gun: a failed send on activation N could erase activation N-1's
durable marker. The new scope is `(user, instance, type, periodStart)`
so a failure only touches its own insert.

**Other writers of `kiloclaw_email_log` are unaffected.** The kiloclaw
billing worker (`services/kiloclaw-billing/src/lifecycle.ts`), the
KiloClaw router instance-destroy cleanup, the admin trial-reset flow,
and the admin instance-reset flow all write and delete rows without
referencing `period_start`. The `DEFAULT 'epoch'` fills in a stable
value so their inserts still collapse one-per-(user, instance, type)
and their deletes (filtered by `email_type IN (...)`) still match every
relevant row regardless of `period_start`.

**Existing production rows get `period_start='epoch'` on backfill.**
For `kiloclaw_subscription_started` rows written before the migration,
this means the first post-deploy activation on the same instance writes
a row with a real `periodStart`, so the insert succeeds. Renewals are
suppressed upstream by
`shouldSendSubscriptionStartedEmailForActivation` (before we ever reach
the email helper), so existing active subscribers do not get duplicate
emails. For resubscribes, the exact cohort this fix exists for, a
second email correctly fires.

**Coordinates with adjacent beads.**
- cloud-4lb (enrollWithCredits does not send subscription-started on
  credit activation) becomes trivial to land: pass the new period start
  to `maybeSendKiloClawSubscriptionStartedEmail` and per-activation
  dedupe already works for the credit path.
- cloud-j1o (template copy hard-codes "first billing period") was
  previously moot because resubscribes never received the email. It is
  now a real product bug and should be addressed.

**Does not touch.** Email rendering or templates, credit accounting,
Stripe webhook parsing, subscription lifecycle state machine, top-up
email flow (cloud-0zq), `softDeleteUser`/GDPR retention (the new
column is a billing-period boundary, not PII; the retention test at
`user.test.ts:1536` still passes).

Manually verified:
- `pnpm --filter web typecheck` passes
- `pnpm --filter kiloclaw-billing typecheck` passes
- `pnpm --filter web test -- purchase-emails` — 20/20 pass, including
  the new per-period admit-second-row test and the end-to-end
  activate→cancel→resubscribe test
- `pnpm --filter web test -- user.test` — 59/59 pass, including the
  GDPR retention test
- `pnpm --filter kiloclaw-billing test` — 60/60 pass
- `pnpm format` clean

Closes cloud-ib7.
Per KiloClaw billing spec (Stripe-Funded Credit Settlement rule 10),
$0 invoices must still run settlement and transition the row into the
activated hybrid state. The subscription-started email is an activation
notification, not a revenue side effect, so it must fire regardless of
invoice amount. Revenue side effects (analytics, affiliate sale events)
apply their own amount_paid > 0 guard in stripe-handlers.ts.

Drops the amountMicrodollars > 0 gate on the email so users activated
by a full coupon or promo still receive the activation notification.
The existing '$0 Stripe settlement' test in purchase-emails.test.ts
locks in this behavior.
…der not configured

maybeSendKiloClawSubscriptionStartedEmail inserted the kiloclaw_email_log
marker before calling sendKiloClawSubscriptionStartedEmail and only
deleted it if the send threw. When the provider returned
{sent: false, reason: 'provider_not_configured'} without throwing (e.g.
Mailgun env missing in a preview environment), the marker persisted and
permanently suppressed the email on future webhook retries via the
unique index guard.

Inspect the SendResult and clear the marker on provider_not_configured
so a retry can re-attempt. Mirrors the proven pattern in
services/kiloclaw-billing/src/lifecycle.ts:879-884.

neverbounce_rejected is deliberately left in place: the verdict is
terminal for that address (invalid / disposable), so retrying would
loop forever. Leaving the row keeps the outcome idempotent — we tried
once, the address was rejected, we do not try again.

Refactored the delete branch into deleteSubscriptionStartedEmailLog,
reused by both the non-throwing failure path and the existing catch.

Tests: one asserting the log row is cleared on provider_not_configured
so a retry can re-send, one asserting the row persists on
neverbounce_rejected so we do not retry a terminally invalid address.
Widened sendMock's return type to SendResult so mockImplementationOnce
can return {sent: false, reason: ...}.
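
The marker handling described in this commit can be sketched roughly as below. The `MarkerStore` interface, `sendOncePerMarker`, and `createMemoryStore` are hypothetical names introduced for illustration; only the `SendResult` reasons come from the commit message. The real helper works against `kiloclaw_email_log` rather than an in-memory set.

```typescript
type SendResult =
  | { sent: true }
  | { sent: false; reason: "provider_not_configured" | "neverbounce_rejected" };

// Hypothetical stand-in for the email-log table guarded by a unique index.
interface MarkerStore {
  insertIfAbsent(key: string): boolean; // false when the marker already exists
  remove(key: string): void;
  has(key: string): boolean;
}

function createMemoryStore(): MarkerStore {
  const keys = new Set<string>();
  return {
    insertIfAbsent(key: string): boolean {
      if (keys.has(key)) return false;
      keys.add(key);
      return true;
    },
    remove(key: string): void {
      keys.delete(key);
    },
    has(key: string): boolean {
      return keys.has(key);
    },
  };
}

// Insert-before-send: the marker is written first, then cleared only for
// outcomes that a later retry could actually fix.
async function sendOncePerMarker(
  store: MarkerStore,
  key: string,
  send: () => Promise<SendResult>,
): Promise<SendResult | "already_handled"> {
  if (!store.insertIfAbsent(key)) return "already_handled";
  try {
    const result = await send();
    if (!result.sent && result.reason === "provider_not_configured") {
      // Retryable: configuration may be fixed before the next webhook retry.
      store.remove(key);
    }
    // neverbounce_rejected keeps its marker: the verdict is terminal.
    return result;
  } catch (error) {
    // A thrown send also clears the marker so a retry can re-attempt.
    store.remove(key);
    throw error;
  }
}
```

A caller would pass a key that identifies the activation (for example user, instance, email type, and period start) and a closure that performs the actual send.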
…mport in purchase-emails test

oxlint's consistent-type-imports rule forbids inline import() type
annotations. Convert to a top-level 'import type * as
creditBillingModule' at the file header, matching the existing pattern
used for emailModule.
Refactors purchase-emails.test.ts to mock sendViaMailgun and verifyEmail
so every test exercises the real sendCreditsTopUpEmail and
sendKiloClawSubscriptionStartedEmail code paths — including formatUsd
rounding, formatDate formatting, subjectOverride selection, and
credits_url / manage_url / receipt_section construction. Previously
the helpers themselves were mocked with synthetic implementations, so
a rename like receipt_url → receipt_section would ship green.

Adds direct payload tests for both helpers covering the happy path,
neverbounce rejection, provider_not_configured, null receipt URL, and
the zero-cent price case.
…tarted email

The template hard-coded 'Your first billing period for KiloClaw hosting has
started', but the subscription-started email is intentionally sent on every
activation — including resubscribes after cancellation (per the per-activation
period_start dedupe landed in cloud-ib7). For those resubscribers the 'first'
language is factually wrong.

Replace with neutral wording ('A KiloClaw hosting billing period has started')
that is correct for both trial→paid and canceled→resubscribe activations and
aligns with the transactional content guideline in apps/web/src/emails/AGENTS.md.

Closes cloud-j1o.
…etry

processTopUp commits the credit_transactions row and then fires the top-up
confirmation email via after(). If the process exited between those two
steps, the credit-transactions unique index deduped the credit on webhook
retry but the email was lost — the retry bailed early on the duplicate
insert before reaching the email step.

Add a top_up_email_log outbox marker keyed by stripe_payment_id and run the
same marker-gated send on both first attempts and duplicate-webhook retries.
Mirrors the existing maybeSendKiloClawSubscriptionStartedEmail pattern in
credit-billing.ts:

- First-attempt send inserts the marker before sending.
- Duplicate-webhook path observes the committed credit, attempts the same
  marker-gated send, and only fires if no prior send has been recorded.
- skipPostTopUpFreeStuff is respected on the retry path so Kilo-Pass-style
  internal reuses of processTopUp cannot send user-facing top-up emails.
- provider_not_configured clears the marker so future retries can re-attempt;
  neverbounce_rejected is intentionally kept as a terminal state.
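
As a rough model of the flow in the list above, the sketch below runs the same marker-gated send on both the first attempt and the duplicate-webhook path. `TopUpLedger`, its fields, and the `simulateCrashBeforeEmail` flag are hypothetical and exist only for this example; the real code is asynchronous, schedules the send via `after()`, and persists both the credit and the marker in the database.

```typescript
interface TopUpInput {
  stripePaymentId: string;
  skipPostTopUpFreeStuff?: boolean;
}

// Hypothetical in-memory model; the sets stand in for unique indexes.
class TopUpLedger {
  private readonly credited = new Set<string>(); // credit rows by payment id
  private readonly emailMarkers = new Set<string>(); // outbox markers
  readonly sentEmails: string[] = [];

  // simulateCrashBeforeEmail models the process exiting after the credit
  // committed but before the confirmation email was attempted.
  processTopUp(input: TopUpInput, simulateCrashBeforeEmail = false): void {
    const isDuplicate = this.credited.has(input.stripePaymentId);
    if (!isDuplicate) {
      this.credited.add(input.stripePaymentId);
    }
    if (simulateCrashBeforeEmail) return;
    // Internal reuses of processTopUp must not send user-facing emails.
    if (input.skipPostTopUpFreeStuff) return;
    // Same gated send for first attempts and duplicate-webhook retries.
    this.sendConfirmationOnce(input.stripePaymentId);
  }

  private sendConfirmationOnce(stripePaymentId: string): void {
    if (this.emailMarkers.has(stripePaymentId)) return; // already handled
    this.emailMarkers.add(stripePaymentId); // insert marker before sending
    this.sentEmails.push(stripePaymentId);
  }
}

const ledger = new TopUpLedger();
ledger.processTopUp({ stripePaymentId: "pi_123" }, true); // crash: no email yet
ledger.processTopUp({ stripePaymentId: "pi_123" }); // webhook retry recovers it
ledger.processTopUp({ stripePaymentId: "pi_123" }); // later replay is a no-op
console.log(ledger.sentEmails); // [ 'pi_123' ]
```

The key point is that the duplicate path no longer returns before the email step; the marker, not the credit row, decides whether a send happens.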

Retain top_up_email_log rows on softDeleteUser (financial outbox record, no
PII beyond user_id which references the anonymized user row). Added GDPR
retention test.

Generated migration 0107_magical_rattler.
Assert applyStripeFundedKiloClawPeriod does not send a subscription-started
email (or write a kiloclaw_email_log row) when settling a successful renewal
retry on a past_due or unpaid subscription. Pins the
shouldSendSubscriptionStartedEmailForActivation contract: dunning recoveries
are not new activations.

Closes cloud-7gh.
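
The contract this test pins down can be summarized in a few lines. This is a sketch only: `isNewActivation` and the `SubscriptionStatus` union are assumptions for illustration, and the real `shouldSendSubscriptionStartedEmailForActivation` may take different inputs.

```typescript
// Statuses mentioned in this PR; the exact union in the codebase is assumed.
type SubscriptionStatus =
  | "trialing"
  | "active"
  | "past_due"
  | "unpaid"
  | "canceled";

// A settlement counts as a new activation only when the subscription was
// trialing or canceled beforehand. Renewals (active) and dunning recoveries
// (past_due, unpaid) are not activations.
function isNewActivation(statusBeforeSettlement: SubscriptionStatus): boolean {
  return (
    statusBeforeSettlement === "trialing" ||
    statusBeforeSettlement === "canceled"
  );
}

console.log(isNewActivation("trialing")); // true
console.log(isNewActivation("canceled")); // true
console.log(isNewActivation("active")); // false
console.log(isNewActivation("past_due")); // false
console.log(isNewActivation("unpaid")); // false
```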
…n check

Remove SUBSCRIPTION_STARTED_RECOVERY_WINDOW_MS and the created_at window
guard in didPriorSettlementRecordPaidActivation. The identity match
(subscription_id + action/reason scope + exact plan + period boundaries
on after_state) is already unique per activation: stripe_invoice_settlement
rows are written only by applyStripeFundedKiloClawPeriod once per
successful settlement, and KiloClaw never uses Stripe proration, so
renewals move period boundaries forward and two settlements on the same
subscription cannot share plan+period.

Removing the window lets legitimately delayed webhook replays (long
outage, manual Stripe-dashboard resend) still recover the
subscription-started email. The kiloclaw_email_log unique index remains
the final idempotency guard.

Also drops the now-obsolete 'stale duplicate recovery guard' test.

Refs: cloud-ymg
/claw redirects active users to /claw/chat and inactive users to
/claw/new, so the 'Manage subscription' CTA landed on the wrong page.
Point it at /claw/subscription, the personal subscription management
route.
… log

If kilocode_users returned empty after the marker was inserted, the
marker persisted and permanently suppressed the subscription-started
email on retry via the unique index. Move the user lookup before the
insert so a missing user returns without writing a marker. Mirrors the
ordering in apps/web/src/app/api/internal/kiloclaw/instance-ready/route.ts,
which does the same user-lookup-before-marker check on a sibling
kiloclaw_email_log path.
Soften the comments on the two new email dedupe paths to call out the
shared at-most-once-marker gaps (crash between insert and send; catch-block
rollback after ambiguous provider errors). References the sibling sites
that share the same pattern so the fix is scoped to a shared outbox across
all of them rather than a one-off here.
Narrow the catch in resolveStripeReceiptUrl to only silence
StripeInvalidRequestError (the expected outcome when the payment was
refunded/voided before the webhook arrived). Route every other error —
rate-limit, API 5xx, authentication, non-Stripe programmer faults —
through captureException so systemic failures become visible instead of
being silently swallowed. Matches the autoTopUp.ts / admin-router.ts
convention of swallowing a specific known-benign Stripe subclass and
reporting the rest. The email flow still never fails on receipt-lookup
errors.
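
A minimal sketch of the narrowed catch, written without the Stripe SDK so it stays self-contained. The `resolveReceiptUrl` signature, the injected `fetchReceiptUrl` and `reportError` callbacks, and the check on an error `type` field are assumptions for this example; the real code catches Stripe's `StripeInvalidRequestError` and reports through `captureException`.

```typescript
// Assumption for this sketch: the benign error is recognized by a `type`
// field, standing in for an instanceof check against the Stripe SDK class.
function isInvalidRequestError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { type?: unknown }).type === "StripeInvalidRequestError"
  );
}

async function resolveReceiptUrl(
  fetchReceiptUrl: () => Promise<string | null>,
  reportError: (error: unknown) => void,
): Promise<string | null> {
  try {
    return await fetchReceiptUrl();
  } catch (error) {
    // Expected when the payment was refunded or voided before the webhook
    // arrived: stay silent. Everything else is reported so systemic
    // failures (rate limits, 5xx, auth, programmer faults) become visible.
    if (!isInvalidRequestError(error)) {
      reportError(error);
    }
    // Receipt lookup never fails the email flow; the template simply
    // omits the receipt link.
    return null;
  }
}
```

Either way the caller receives `null`, so the only behavioral difference between the two error classes is whether the failure is reported.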
On webhook replays against an already-emailed, already-settled period,
applyStripeFundedKiloClawPeriod was running a kiloclaw_subscription_change_log
scan plus application-side JSONB filtering on every retry, only for the
subsequent marker-insert to no-op against the kiloclaw_email_log unique
index. Gate the expensive scan on a fast existence check covered by the
UQ_kiloclaw_email_log_user_instance_type_period index: if the activation
already has an email-log row, we know the send is handled and can skip
both the change-log recovery logic and the downstream send call.
Correctness is unchanged — the unique index remains the authoritative
idempotency guard.
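
The ordering of the cheap check and the expensive scan can be sketched like this. `RecoveryDeps` and `recoverSubscriptionStartedEmail` are hypothetical names for illustration, and the callbacks are synchronous only to keep the example short.

```typescript
interface RecoveryDeps {
  emailLogRowExists(): boolean; // cheap lookup covered by the unique index
  changeLogProvesActivation(): boolean; // change-log scan plus JSONB filtering
  sendWithMarker(): void; // marker-gated send; the unique index still guards it
}

type RecoveryOutcome = "skipped" | "not_eligible" | "attempted";

function recoverSubscriptionStartedEmail(deps: RecoveryDeps): RecoveryOutcome {
  // Fast path: an existing email-log row means the send is already handled,
  // so neither the scan nor the send needs to run.
  if (deps.emailLogRowExists()) return "skipped";
  if (!deps.changeLogProvesActivation()) return "not_eligible";
  deps.sendWithMarker();
  return "attempted";
}

let scans = 0;
const outcome = recoverSubscriptionStartedEmail({
  emailLogRowExists: () => true,
  changeLogProvesActivation: () => {
    scans += 1;
    return true;
  },
  sendWithMarker: () => {},
});
console.log(outcome, scans); // skipped 0
```

The existence check is an optimization only: if it raced with a concurrent insert, the marker insert inside the send would still no-op against the unique index.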
@evanjacobson evanjacobson marked this pull request as ready for review May 5, 2026 16:42
Comment thread apps/web/src/lib/user.ts
* - referral_code_usages (financial, references anonymized user)
* - kiloclaw_subscriptions, kiloclaw_earlybird_purchases, kiloclaw_email_log (retained records)
* - kiloclaw_scheduled_action_targets (retained operational records;
* - transactional_email_log (retained outbox marker, financial record)
Contributor Author

I landed on this decision considering that kiloclaw_email_log above is retained. Let me know if I should do the GDPR soft delete flow for this table instead.

Contributor Author

My intention behind creating this table was to do the following:

  1. Create a table that can be used for idempotency checks for non-KiloClaw emails
  2. Allow the table to be used by future unknown, idempotent emails (or wherever we want to track emails sent)

Comment thread packages/db/src/schema.ts
.where(isNull(table.instance_id)),
uniqueIndex('UQ_kiloclaw_email_log_user_instance_type')
.on(table.user_id, table.instance_id, table.email_type)
uniqueIndex('UQ_kiloclaw_email_log_user_instance_type_period')
Contributor Author

I was on the fence regarding keeping this in the business logic only. I'd be happy to remove this and do that instead.


if (!didInsertCreditTransaction) {
if (!skipPostTopUpFreeStuff) {
await recoverTopUpConfirmationEmailIfMissing({
Contributor

WARNING: Duplicate recovery can send a confirmation before credits are applied

processTopUp inserts the credit_transactions row before updating kilocode_users.total_microdollars_acquired. If the process crashes in that window, the retry lands here, treats the duplicate transaction as a completed top-up, schedules the confirmation email, and returns without repairing the balance update. That can send a successful top-up email for credits that were never reflected on the user's account; make the transaction insert and balance update atomic, or verify/repair the credited balance before recovering the email.

ALTER TABLE "transactional_email_log" ADD CONSTRAINT "transactional_email_log_user_id_kilocode_users_id_fk" FOREIGN KEY ("user_id") REFERENCES "public"."kilocode_users"("id") ON DELETE no action ON UPDATE no action;--> statement-breakpoint
CREATE UNIQUE INDEX "UQ_transactional_email_log_type_idempotency_key" ON "transactional_email_log" USING btree ("email_type","idempotency_key");--> statement-breakpoint
CREATE INDEX "IDX_transactional_email_log_user_id" ON "transactional_email_log" USING btree ("user_id");--> statement-breakpoint
CREATE UNIQUE INDEX "UQ_kiloclaw_email_log_user_instance_type_period" ON "kiloclaw_email_log" USING btree ("user_id","instance_id","email_type","period_start") WHERE "kiloclaw_email_log"."instance_id" is not null;
Contributor

WARNING: The replacement index build can block production writes

kiloclaw_email_log is an existing billing email-log table, and CREATE UNIQUE INDEX without CONCURRENTLY holds a lock that blocks inserts, updates, and deletes on the table for the duration of the index build. On a populated table this can block webhook/background-job writes during deployment; use a staged/concurrent index migration (and drop the old index concurrently after the replacement is ready) to avoid write downtime.

kilo-code-bot Bot commented May 5, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 0 |

Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| apps/web/src/lib/credits.ts | 97 | Duplicate top-up recovery can send a confirmation email after only the transaction row committed, before the balance update is repaired. |
| packages/db/src/migrations/0110_giant_gabe_jones.sql | 14 | Replacement unique index on existing kiloclaw_email_log is created without CONCURRENTLY, which can block production writes during deployment. |

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

| File | Line | Issue |
| --- | --- | --- |
| N/A | N/A | None |

Files Reviewed (14 files)
  • apps/web/src/emails/AGENTS.md - 0 issues
  • apps/web/src/emails/creditsTopUp.html - 0 issues
  • apps/web/src/emails/kiloClawSubscriptionStarted.html - 0 issues
  • apps/web/src/lib/credits.ts - 1 issue
  • apps/web/src/lib/email.ts - 0 issues
  • apps/web/src/lib/kiloclaw/credit-billing.ts - 0 issues
  • apps/web/src/lib/purchase-emails.test.ts - 0 issues
  • apps/web/src/lib/user.test.ts - 0 issues
  • apps/web/src/lib/user.ts - 0 issues
  • apps/web/src/routers/admin/email-testing-router.ts - 0 issues
  • packages/db/src/migrations/0110_giant_gabe_jones.sql - 1 issue
  • packages/db/src/migrations/meta/0110_snapshot.json - 0 issues
  • packages/db/src/migrations/meta/_journal.json - 0 issues
  • packages/db/src/schema.ts - 0 issues

Reviewed by gpt-5.5-20260423 · 1,247,813 tokens
