Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions .claude/memory/MEMORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,8 @@ Keep this file under 200 lines — anything longer is content bloat, not memory.
- [learnings/sentry-mcp-no-comment-tool](learnings/2026-05-26-sentry-mcp-no-comment-tool.md) — Sentry MCP can't comment on issues; back-link from Linear only
- [learnings/vercel-blocks-unknown-author-email](learnings/2026-05-26-vercel-blocks-unknown-author-email.md) — Vercel preview deploys block when commit author email has no GitHub account; use the noreply alias
- [learnings/sandbox-cant-clone-private-repo](learnings/2026-05-26-sandbox-cant-clone-private-repo.md) — Don't `git clone` from sandbox bash; the `github_repository` resource is auth'd, raw `git clone` is not
- [learnings/github-mcp-strips-html-comments](learnings/2026-05-26-github-mcp-strips-html-comments.md) — `update_pull_request` silently strips `<!-- ... -->` from PR bodies; session-id marker can't be set by agent (ENG-25)
- [learnings/github-mcp-strips-html-comments](learnings/2026-05-26-github-mcp-strips-html-comments.md) — `create_pull_request` AND `update_pull_request` strip `<!-- ... -->` from PR bodies; ENG-25 resolved by moving the session-id marker to a plain-text line
- [learnings/regex-last-match-semantics](learnings/2026-05-26-regex-last-match-semantics.md) — `String.match(/.../m)` returns FIRST match; for "last line of body" use `matchAll(...).at(-1)` (PR #20 bug)
- [learnings/recheck-open-prs-at-pr-open](learnings/2026-05-26-recheck-open-prs-at-pr-open.md) — session-start grep is insufficient; re-grep `list_pull_requests` right before `create_pull_request` to catch parallel-session races (PR #18 vs #20 ENG-25)
- [learnings/closed-pr-receives-review-after-close](learnings/2026-05-26-closed-pr-receives-review-after-close.md) — Closed PR can still get a REQUEST_CHANGES review seconds after dup-close; don't reopen, surface to sibling PR + Linear (PR #18 vs #20 ENG-25 race)

Expand All @@ -16,6 +17,6 @@ Keep this file under 200 lines — anything longer is content bloat, not memory.
- [decisions/two-agent-builder-reviewer](decisions/2026-05-25-two-agent-builder-reviewer.md) — Separate agents for build vs. review so the reviewer reads diffs cold

## Conventions
- [conventions/pr-session-id-marker](conventions/pr-session-id-marker.md) — PR body MUST end with `<!-- session-id: sesn_... -->` (or legacy `sthr_...`) so webhooks can resume
- [conventions/pr-session-id-marker](conventions/pr-session-id-marker.md) — PR body MUST end with plain-text `session-id: sesn_...` on its own line; legacy `<!-- session-id: ... -->` shape also matched but stripped by MCP (ENG-25)
- [conventions/agent-review-marker](conventions/agent-review-marker.md) — Reviewer's verdict goes on the first line as `AGENT_REVIEW: APPROVED|REQUEST_CHANGES|ESCALATE — <rationale>`
- [conventions/check-open-pr-before-ticket-pickup](conventions/check-open-pr-before-ticket-pickup.md) — Before branching for a Linear ticket, grep open PRs for the ticket ID; abort if one already exists (PR #15 vs #16 ENG-26 race)
32 changes: 18 additions & 14 deletions .claude/memory/conventions/pr-session-id-marker.md
Original file line number Diff line number Diff line change
@@ -1,28 +1,32 @@
# PR body session-id marker

Every PR opened by the manager MUST end with this HTML comment as the
**last line** of the PR body:
Every PR opened by the manager MUST end with a `session-id:` marker on
its own line as the last non-empty line of the PR body:

```
<!-- session-id: sesn_xxxxxxxxxxxxxxxx -->
session-id: sesn_xxxxxxxxxxxxxxxx
```

The `/api/github-webhook` route extracts this marker when a webhook
fires for the PR (e.g. `issue_comment.created` with
`AGENT_REVIEW: APPROVED`). With it, the webhook resumes the original
manager session — full implementation context, no re-explaining.

The webhook regex accepts both the legacy `sthr_` prefix and the
current `sesn_` prefix returned by `client.beta.sessions.create()`.
Use whichever prefix the kickoff `user.message` carries.
## Accepted shapes

Without it, the webhook falls back to creating a fresh session. The
fresh session loses all design rationale and re-derives everything.
Functional but wasteful.
The webhook regex accepts:

The kickoff `user.message` includes the actual session id. Substitute
it verbatim; don't paraphrase or omit.
- Plain-text line: `session-id: sesn_...` — **preferred**. Survives
the GitHub MCP `update_pull_request` and `create_pull_request` body
filters (ENG-25). Always use this on new PRs.
- Legacy HTML comment: `<!-- session-id: sesn_... -->` — still
matched for back-compat with PRs opened before ENG-25 landed, but
the agent cannot write this shape on a new PR because MCP strips it
on body create AND update. Read-only acceptance.

The regex in `app/api/github-webhook/route.ts` accepts either
`sesn_` (current SDK prefix) or `sthr_` (legacy) — both are valid.
Use whatever the kickoff hands you.
Both `sesn_` (current SDK prefix) and `sthr_` (legacy) are accepted.
Use whatever the kickoff `user.message` carries, verbatim.

Without the marker, the webhook falls back to creating a fresh
session. The fresh session loses all design rationale and re-derives
everything. Functional but wasteful.
Original file line number Diff line number Diff line change
@@ -1 +1 @@
The GitHub MCP `update_pull_request` tool unconditionally strips `<!-- ... -->` HTML comments from the `body` field before persisting — the PATCH fires (other prose changes land, `updated_at` advances), but any HTML comment line silently disappears. Confirmed twice on PR #12 review-feedback round 2 with both `sesn_...` and `sthr_...` prefixes; the filter isn't gated on content. This makes the `pr-session-id-marker` convention unenforceable by the agent via the standard transport — there's no `GITHUB_TOKEN` in the sandbox env and `api.github.com` is outside the egress allowlist, so you can't route around MCP with a direct PATCH. Don't waste review rounds on it — see ENG-25 for the fix proposal and the open question of whether `create_pull_request` shares the same filter (worth opportunistic testing whenever you open a new PR).
The GitHub MCP `update_pull_request` AND `create_pull_request` tools both unconditionally strip `<!-- ... -->` HTML comments from the `body` field before persisting — the PATCH/POST fires (other prose changes land, `updated_at` advances), but any HTML comment line silently disappears. Confirmed across PR #12 review-feedback round 2 (`update_pull_request`, both `sesn_...` and `sthr_...` prefixes) and PR #17 initial open (`create_pull_request`, `sesn_...`). The filter is unconditional, not gated on content. Worse: the strip is **greedy** — an unclosed `<!--` opener inside backticks in prose (e.g. discussing the literal token in a body, no matching `-->` in the same paragraph) causes the MCP to drop everything from that point to the end of the body. PR #20's first body version hit this and ate its own trailing session-id marker mid-PR-creation. Workaround in prose: refer to the token by phrase (\"HTML-comment opener\", \"the HTML-comment shape\") rather than as a literal substring. ENG-25 resolved the convention problem by moving the `pr-session-id-marker` to a plain-text line shape that MCP doesn't filter; the webhook regex still accepts the legacy HTML form for back-compat with already-merged PRs. Don't write HTML-comment markers on new PRs — they'll vanish and waste review rounds.
18 changes: 18 additions & 0 deletions .claude/memory/learnings/2026-05-26-regex-last-match-semantics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# Regex "last match" needs `/g` + matchAll, not just `/m`

`String.prototype.match(/.../m)` returns the **first** occurrence; the `m`
flag only line-anchors `^` / `$`. If your convention says "the marker is
the last line of the body" (e.g. `pr-session-id-marker`), you must take
the last match explicitly:

```ts
const all = [...text.matchAll(/^\s*marker:\s*(\w+)\s*$/gm)];
const value = all.length ? all[all.length - 1][1] : null;
```

PR #20 shipped the buggy form first and a docs-only code-block placeholder
(`marker: xxxxxxxxxxxxxxxx`) won against the real trailer, silently. The
unit test at `tests/unit/extract-session-id.spec.ts` now guards this case.

General rule: any time the contract is "the LAST occurrence of X in a
multi-line body," reach for `matchAll(...).at(-1)`, not `match(... /m)`.
7 changes: 1 addition & 6 deletions app/api/github-webhook/route.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import Anthropic from '@anthropic-ai/sdk';
import { createHmac, timingSafeEqual } from 'node:crypto';
import { extractSessionId } from '@/lib/extract-session-id';

export const runtime = 'nodejs';
export const maxDuration = 60;
Expand All @@ -15,12 +16,6 @@ function verifySignature(rawBody: string, sigHeader: string | null, secret: stri
return timingSafeEqual(a, b);
}

function extractSessionId(text: string | undefined | null): string | null {
if (!text) return null;
const m = text.match(/<!--\s*session-id:\s*((?:sthr_|sesn_)[A-Za-z0-9]+)\s*-->/);
return m?.[1] ?? null;
}

async function fetchPrBody(prNumber: number, token: string): Promise<string | null> {
const res = await fetch(
`https://api.github.com/repos/WillTaylor22/self-managing-codebase/pulls/${prNumber}`,
Expand Down
21 changes: 21 additions & 0 deletions lib/extract-session-id.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
// Pulled out of `app/api/github-webhook/route.ts` so it can be unit-tested
// without instantiating Next's route module. Pure, no I/O.
//
// Accepts two shapes (see .claude/memory/conventions/pr-session-id-marker.md):
// 1. Plain-text line: session-id: sesn_xxxx (preferred — MCP-survivable, ENG-25)
// 2. HTML comment: <!-- session-id: sesn_xxxx --> (legacy, read-only-accepted)
//
// Convention says the marker is the LAST non-empty line of the PR body. We
// honor that for the plain shape so a placeholder inside a fenced code block
// earlier in the body doesn't beat the real trailer (PR #20 review).
//
// The HTML shape is left first-wins: distinctive token, and we don't author
// it anymore so duplicate-in-prose risk is low.
export function extractSessionId(text: string | undefined | null): string | null {
if (!text) return null;
const html = text.match(/<!--\s*session-id:\s*((?:sthr_|sesn_)[A-Za-z0-9]+)\s*-->/);
if (html) return html[1];
const plain = [...text.matchAll(/^\s*session-id:\s*((?:sthr_|sesn_)[A-Za-z0-9]+)\s*$/gm)];
if (plain.length) return plain[plain.length - 1][1];
return null;
}
7 changes: 6 additions & 1 deletion playwright.config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,12 @@ import { defineConfig, devices } from '@playwright/test';
const baseURL = process.env.BASE_URL ?? 'http://localhost:3000';

export default defineConfig({
testDir: './tests/e2e',
// Picks up both `tests/e2e/*.spec.ts` and `tests/unit/*.spec.ts`.
// Unit specs (e.g. extract-session-id) are pure-function tests that
// simply don't touch the `page` fixture; they still run under the
// chromium project but never spawn a browser context.
testDir: './tests',
testMatch: ['e2e/**/*.spec.ts', 'unit/**/*.spec.ts'],
globalSetup: './tests/global-setup.ts',
fullyParallel: true,
forbidOnly: !!process.env.CI,
Expand Down
92 changes: 92 additions & 0 deletions tests/unit/extract-session-id.spec.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
import { test, expect } from '@playwright/test';
import { extractSessionId } from '../../lib/extract-session-id';

// Pure-function unit tests for the webhook's session-id marker extractor.
// No browser, no network — Playwright is just our test runner here.
//
// The shape contract is documented in
// .claude/memory/conventions/pr-session-id-marker.md.

test.describe('extractSessionId', () => {
test('null body → null', () => {
expect(extractSessionId(null)).toBeNull();
});

test('undefined body → null', () => {
expect(extractSessionId(undefined)).toBeNull();
});

test('empty body → null', () => {
expect(extractSessionId('')).toBeNull();
});

test('plain-text marker as the last line', () => {
const body = ['# Some PR', '', 'Body text.', '', 'session-id: sesn_abc123'].join('\n');
expect(extractSessionId(body)).toBe('sesn_abc123');
});

test('plain-text marker tolerates trailing whitespace / blank lines', () => {
const body = '# Some PR\n\nsession-id: sesn_abc123 \n\n\n';
expect(extractSessionId(body)).toBe('sesn_abc123');
});

test('sthr_ legacy prefix accepted on the plain shape', () => {
const body = 'Body.\n\nsession-id: sthr_legacy999';
expect(extractSessionId(body)).toBe('sthr_legacy999');
});

test('HTML-comment marker (legacy shape) still matches', () => {
const body = 'Body.\n\n<!-- session-id: sesn_html42 -->';
expect(extractSessionId(body)).toBe('sesn_html42');
});

test('HTML shape wins over plain when both are present (legacy precedence)', () => {
// Documented behavior — legacy PRs may have both during transition.
const body = 'session-id: sesn_plain1\n\n<!-- session-id: sesn_html2 -->';
expect(extractSessionId(body)).toBe('sesn_html2');
});

test('placeholder in a fenced code block does NOT beat the real trailer (PR #20)', () => {
// This is the exact regression: an earlier `session-id:` line that's
// documentation (e.g. inside a ```code``` block) used to win because
// .match without /g returns the first occurrence. The real marker is
// the last non-empty line; that must be what we return.
const body = [
'## Fix',
'',
'The canonical shape is now a plain-text line on its own:',
'',
'```',
'session-id: sesn_xxxxxxxxxxxxxxxx',
'```',
'',
'More prose here.',
'',
'session-id: sesn_012j21sUvdmnhx3baX6ivYLW',
].join('\n');
expect(extractSessionId(body)).toBe('sesn_012j21sUvdmnhx3baX6ivYLW');
});

test('multiple plain markers → last one wins', () => {
const body = 'session-id: sesn_first\nstuff\nsession-id: sesn_second\nmore\nsession-id: sesn_third';
expect(extractSessionId(body)).toBe('sesn_third');
});

test('inline mention of "session-id:" in prose does NOT match', () => {
// The marker must be on its own line. A sentence like "the session-id: foo line"
// should not produce a false match — note the `^\s* ... \s*$` anchors.
const body = 'In the body we mention the session-id: sesn_inline in prose. End.';
expect(extractSessionId(body)).toBeNull();
});

test('unknown prefix (not sthr_/sesn_) does not match', () => {
const body = 'session-id: zzzz_notvalid';
expect(extractSessionId(body)).toBeNull();
});

test('plain marker indented by leading whitespace is still matched', () => {
// Tolerated by `^\s*` — quoted-block / list-indented bodies still work.
const body = 'Body.\n\n session-id: sesn_indented';
expect(extractSessionId(body)).toBe('sesn_indented');
});
});