Merged
4 changes: 3 additions & 1 deletion docs/05-IPC-MAP.md
@@ -122,7 +122,7 @@ résumé, persisted, and indexed as `story` chunks so they ground live answers.
| `session:set-interview-type` | `{ sessionId, interviewType }` | `{ ok }` (set the session-level type — chosen by the user in the save prompt at stop) |
| `session:set-answer-prefs` | `{ interviewType?, format?, pronunciation? }` | `{ interviewType, format, pronunciation }` (live Cue Card controls; acts on the active session. Switching `interviewType` is dynamic — it persists onto the session row + reframes later answers) |
| `session:set-answering` | `{ enabled }` | `{ enabled, answered }` (coding "listen-only" toggle: when disabled, the interviewer is still transcribed but not auto-answered; enabling it also answers the question they just asked) |
| `session:regenerate` | | `{ regenerated }` (re-answer the last question for the active session) |
| `session:regenerate` | `{ questionId? }` | `{ regenerated }` (re-answer a SPECIFIC question by id — the Cue Card's per-card ↻ — or, with no id, the last question after a format/pronunciation toggle) |
| `session:clear-answer` | — | `{ cleared }` (abort the in-flight answer for the active session) |

### mock (AI-driven mock interviewer)
@@ -161,6 +161,7 @@ persisted; no DB session, no Cue Card).
| `capture:add-region` | `{ image }` | `{ added: true }` (add a captured region to the multi-image buffer; broadcasts `capture:buffer`) |
| `capture:solve-buffer` | — | `{ started: true }` (solve ALL buffered screenshots in one vision call, then clear) |
| `capture:clear-buffer` | — | `{ cleared: true }` |
| `capture:resolve-last` | — | `{ started: true }` (re-solve the most recent coding problem — the per-card ↻ on a coding-solve card; picks up the current language) |

### overlay / privacy
| Channel | Request | Response |
@@ -169,6 +170,7 @@ persisted; no DB session, no Cue Card).
| `overlay:set-mode` | `{ mode:'compact'\|'expanded' }` | `{ mode }` |
| `overlay:set-opacity` | `{ opacity }` | `{ opacity }` |
| `overlay:set-clickthrough` | `{ enabled }` | `{ enabled }` |
| `overlay:copy-text` | `{ text }` | `{ copied: true }` (write text to the OS clipboard for the per-card "Copy" — routed through main because the renderer's clipboard-write permission is denied) |
| `privacy:get` | — | `{ enabled }` |
| `privacy:toggle` | — | `{ enabled }` |
| `privacy:set` | `{ enabled }` | `{ enabled }` |
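
The request and response shapes of the three channels this change adds or alters can be written down as types. The sketch below is only an illustration: `ChannelMap`, `Req`, `Res` and `isRegenerateRequest` are hypothetical names and are not taken from the app's `@shared/ipc` module; the shapes themselves are copied from the tables above.

```typescript
// Hypothetical typing of the channels touched in this change. The payload shapes come
// from the IPC map tables; the type and function names are illustrative only.
type ChannelMap = {
  'session:regenerate': { request: { questionId?: string }; response: { regenerated: boolean } };
  'capture:resolve-last': { request: undefined; response: { started: true } };
  'overlay:copy-text': { request: { text: string }; response: { copied: true } };
};

type Req<C extends keyof ChannelMap> = ChannelMap[C]['request'];
type Res<C extends keyof ChannelMap> = ChannelMap[C]['response'];

// Runtime guard mirroring the documented `{ questionId? }` request shape:
// the id may be absent, but if present it must be a string.
function isRegenerateRequest(value: unknown): value is Req<'session:regenerate'> {
  if (typeof value !== 'object' || value === null) return false;
  const id = (value as { questionId?: unknown }).questionId;
  return id === undefined || typeof id === 'string';
}

// Example payloads: a specific card by id, and "the last question" (no id).
const specificCardRequest: Req<'session:regenerate'> = { questionId: 'q-123' };
const lastQuestionRequest: Req<'session:regenerate'> = {};
const copyResponse: Res<'overlay:copy-text'> = { copied: true };
```

Keeping the optional `questionId` inside an object (rather than passing a bare string) matches the table's request column and leaves room for further optional fields.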
7 changes: 5 additions & 2 deletions docs/06-OPENAI-SERVICE.md
@@ -116,9 +116,12 @@ Builds a **grounding** prompt:
(`overlay/pronunciation.ts` `splitPronunciation`, tolerant of model-output variance) and
renders a structured "🗣 How to say it" panel below the answer. Adds +160 `max_output_tokens`
headroom so the guide never eats the answer.
- **Persona:** the system prompt frames the model as the candidate themselves — "You ARE the
candidate … answering ON THEIR BEHALF, in first person" — never third-person.
- `format` — the single answer control (v1.2): `key_points` (terse bullets) | `explanation`
(a natural, flowing first-person explanation) | `detailed` (thorough, with one example).
It also sets a hard `max_output_tokens` ceiling (220 / 340 / 800) so "key points" can never
(a natural, flowing first-person explanation) | `detailed` (thorough, with one example) |
`story_teller` (a short, vivid first-person story — "you are ME telling MY OWN story").
It also sets a hard `max_output_tokens` ceiling (220 / 340 / 800 / 420) so "key points" can never
drift long regardless of the prompt. (The old format/tone × length split — `star`/`technical`/
`conversational` — was removed.)
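
As a rough illustration of how the per-format ceiling and the pronunciation headroom might combine, here is a minimal sketch. The ceilings (220 / 340 / 800 / 420) and the +160 headroom are the values stated above; the helper `maxOutputTokens` and the constant names are hypothetical, and adding the headroom directly on top of the ceiling is an assumption about how the two interact.

```typescript
type AnswerFormat = 'key_points' | 'explanation' | 'detailed' | 'story_teller';

// Ceilings as listed in this document.
const FORMAT_CEILING: Record<AnswerFormat, number> = {
  key_points: 220,
  explanation: 340,
  detailed: 800,
  story_teller: 420,
};

// Extra room for the "How to say it" guide, as described for the pronunciation option.
const PRONUNCIATION_HEADROOM = 160;

// Assumption: the headroom is added on top of the format ceiling, so the guide
// never competes with the answer for the same token budget.
function maxOutputTokens(format: AnswerFormat, pronunciation: boolean): number {
  return FORMAT_CEILING[format] + (pronunciation ? PRONUNCIATION_HEADROOM : 0);
}
```

Under that assumption, `key_points` with the guide enabled would be capped at 380 tokens, while `story_teller` without it stays at 420.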
Streams tokens (`{type:'delta', token}`), then a `usage` event, then a structured
39 changes: 37 additions & 2 deletions docs/sessions/2026-07-01.md
@@ -87,6 +87,41 @@ each finding.

Verified: `typecheck` · 113 unit (+9 across v1.2 #3) · `build` green.

## Cue Card answer UX (follow-ups, same branch)

Driven by live testing feedback:

- **First-person persona.** The system prompt now leads with identity: *"You ARE the candidate — a
second version of them — answering ON THEIR BEHALF, in first person."* Applies to every format.
- **Story-teller format.** A 4th `AnswerFormat` (`story_teller`, cap 420 tok): a short, vivid
first-person narrative ("you are ME telling MY OWN story"). Type + prompt + `session.ipc` zod +
a 4th Cue Card toggle.
- **Per-question regenerate.** The single toolbar ↻ was removed; **each answer card has its own ↻**.
To let *any* card (not just the last) be regenerated, the Cue Card now routes all answer events
by `questionId`: `AnswerCard` gained `questionId` (from `questionDetected.id`) + `isCoding`; new
`patchById`/`appendById` reducers replace `patchLast` in the delta/meta/done/reset/context handlers;
a `streamingId` ref flushes buffered tokens to the right card. Backend `regenerateActive()` →
`regenerate(questionId?)` (specific question by id, or last); `session:regenerate` + preload take
`questionId?`. Regenerating a collapsed history card auto-expands it.
- **Coding re-solve.** The ↻ shows on **every** card. `regenerateCard` tries `session.regenerate(qid)`;
a coding-solve card (not a persisted question) returns `{regenerated:false}` → falls back to
`capture:resolve-last`, which re-runs the last solve (`codingMode.resolveLast`, picking up the
current language). Also fixed a bug where a *live* question the classifier labeled "coding" had its
↻ wrongly hidden.
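
The id-based routing described above can be sketched as two pure reducers. This is not the Cue Card's actual code: `AnswerCard` here carries only the two fields named in these notes (`questionId`, `isCoding`) plus assumed `text` and `streaming` fields, and the reducer bodies are a guess at the simplest shape that satisfies the description.

```typescript
// Minimal sketch of id-routed card updates (field names partly assumed, see above).
interface AnswerCard {
  questionId: string;
  isCoding: boolean;
  text: string; // assumed: the answer body rendered on the card
  streaming: boolean; // assumed: whether the streaming cursor is still shown
}

// Apply a partial update to the card with the given id; other cards are untouched.
function patchById(
  cards: AnswerCard[],
  questionId: string,
  patch: Partial<AnswerCard>,
): AnswerCard[] {
  return cards.map((card) => (card.questionId === questionId ? { ...card, ...patch } : card));
}

// Append a streamed token to the card with the given id.
function appendById(cards: AnswerCard[], questionId: string, token: string): AnswerCard[] {
  return cards.map((card) =>
    card.questionId === questionId ? { ...card, text: card.text + token, streaming: true } : card,
  );
}

// Regenerating the FIRST card must not disturb the second (most recent) one.
const cardsBefore: AnswerCard[] = [
  { questionId: 'q1', isCoding: false, text: 'old answer', streaming: false },
  { questionId: 'q2', isCoding: false, text: 'latest answer', streaming: false },
];
const afterReset = patchById(cardsBefore, 'q1', { text: '', streaming: true }); // answer reset
const afterDeltas = appendById(appendById(afterReset, 'q1', 'I led '), 'q1', 'the migration.');
const afterDone = patchById(afterDeltas, 'q1', { streaming: false }); // answer done
```

With a `patchLast`-style reducer the same event sequence would have overwritten `q2`, which is why routing by `questionId` is needed once any card can be regenerated.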

**Adversarial review** of the per-question-regenerate refactor found + fixed **2 real bugs**: an
aborted-but-still-visible card kept a blinking streaming cursor forever (the abort branch now
broadcasts `answerDone`), and the header/"Data sent" panels only reflected the last card (now derive
from the streaming card). The coding-re-solve follow-up was reviewed **inline** (workflow agents were
rate-limited). Verified: typecheck · 118 unit · build green.

- **Per-card Copy.** Each answer card now has a **Copy** (⧉ → ✓) button that copies the clean answer
body (pronunciation guide stripped) — handy for pasting a coding solution into the editor. Routed
through the main-process Electron clipboard via a new `overlay:copy-text` IPC, because the renderer's
`clipboard-write` permission is denied by the app's permission allowlist (only media/display-capture).

## v1.2 status
All three increments done on `feat/prompt-overhaul` (Answer Format + naturalness · Coding solver ·
Pronunciation). Ready for one PR. Version/changelog bump deferred until the user asks (would be v1.2.0).
The 3 base increments (Answer Format + naturalness · Coding solver · Pronunciation) merged to `master`
via **PR #20**. Follow-ups on `feat/answer-ux`: story-teller format · first-person persona ·
per-question regenerate · coding re-solve · per-card Copy. Ready for one PR → master.
Version/changelog bump deferred until asked (would be v1.2.0).
7 changes: 7 additions & 0 deletions src/main/ipc/capture.ipc.ts
@@ -6,6 +6,7 @@ import {
addCapture,
clearCaptures,
quickSolveFromClipboard,
resolveLast,
runCodingSolve,
runCodingSolveFromImage,
solveCaptures,
@@ -67,4 +68,10 @@ export function registerCaptureIpc(): void {
clearCaptures();
return { cleared: true as const };
});

// Re-solve the most recent coding problem (the Cue Card's per-card ↻ on a coding card).
handle(IPC.capture.resolveLast, NoInput, () => {
void resolveLast();
return { started: true as const };
});
}
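
On the renderer side, the per-card ↻ is described in the session notes as trying `session.regenerate(qid)` first and falling back to `capture:resolve-last` when the card is not a persisted question. A minimal sketch of that flow follows; `RegenerateApi`, `RegenerateOutcome` and `makeFakeApi` are hypothetical names introduced for illustration, and the real `regenerateCard` may differ.

```typescript
// Hypothetical renderer-side API surface. Only the two calls used here are modelled,
// with the response shapes documented for `session:regenerate` and `capture:resolve-last`.
interface RegenerateApi {
  session: { regenerate(questionId?: string): Promise<{ regenerated: boolean }> };
  capture: { resolveLast(): Promise<{ started: true }> };
}

type RegenerateOutcome = 'regenerated' | 're-solved';

// Try the persisted-question path first; fall back to re-running the last coding solve
// when the card has no question row (the handler answers `{ regenerated: false }`).
async function regenerateCard(api: RegenerateApi, questionId: string): Promise<RegenerateOutcome> {
  const { regenerated } = await api.session.regenerate(questionId);
  if (regenerated) return 'regenerated';
  await api.capture.resolveLast();
  return 're-solved';
}

// A fake API for illustration: only ids listed in `persisted` count as real question rows.
function makeFakeApi(persisted: string[], calls: string[]): RegenerateApi {
  return {
    session: {
      regenerate: async (questionId?: string) => {
        calls.push(`session:regenerate(${questionId ?? 'last'})`);
        return { regenerated: questionId !== undefined && persisted.indexOf(questionId) !== -1 };
      },
    },
    capture: {
      resolveLast: async () => {
        calls.push('capture:resolve-last');
        return { started: true as const };
      },
    },
  };
}
```

Because the main-process handler returns `{ started: true }` immediately and streams the solution through broadcast events, the fallback only needs to await the acknowledgement, not the answer itself.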
8 changes: 8 additions & 0 deletions src/main/ipc/overlay.ipc.ts
@@ -1,4 +1,5 @@
import { z } from 'zod';
import { clipboard } from 'electron';
import { IPC } from '@shared/ipc';
import { handle, NoInput } from './helpers';
import {
@@ -52,6 +53,13 @@ export function registerOverlayIpc(): void {
},
);

// Write text to the OS clipboard from the renderer (the overlay's per-card "Copy").
// Routed through main because the renderer's clipboard-write permission is denied.
handle(IPC.overlay.copyText, z.object({ text: z.string() }), ({ text }) => {
clipboard.writeText(text);
return { copied: true as const };
});

handle(IPC.privacy.get, NoInput, () => ({ enabled: getPrivacy() }));
// Disabling privacy is gated by a confirmation dialog (see requestPrivacy).
handle(IPC.privacy.toggle, NoInput, async () => ({ enabled: await togglePrivacyGuarded() }));
6 changes: 4 additions & 2 deletions src/main/ipc/session.ipc.ts
@@ -15,7 +15,7 @@ const interviewType = z.enum([
'sales',
'general',
]);
const answerFormat = z.enum(['key_points', 'explanation', 'detailed']);
const answerFormat = z.enum(['key_points', 'explanation', 'detailed', 'story_teller']);

export function registerSessionIpc(): void {
handle(
@@ -113,7 +113,9 @@ export function registerSessionIpc(): void {
({ enabled }) => sessionManager.setAnsweringActive(enabled),
);

handle(IPC.session.regenerate, z.void(), () => sessionManager.regenerateActive());
handle(IPC.session.regenerate, z.object({ questionId: z.string().optional() }), ({ questionId }) =>
sessionManager.regenerate(questionId),
);

handle(IPC.session.clearAnswer, z.void(), () => sessionManager.clearAnswerActive());

28 changes: 28 additions & 0 deletions src/main/services/capture/codingMode.ts
@@ -18,6 +18,9 @@ const codingLanguage = (): string =>
// request.
const MAX_CAPTURES = 8;
let captureBuffer: string[] = [];
// The most recent solve's input, so the Cue Card's per-card ↻ can re-solve the SAME
// problem (e.g. after switching language) without re-copying or re-capturing it.
let lastSolve: { text: string } | { images: string[] } | null = null;

function broadcastBuffer(): void {
broadcast(EVENTS.captureBuffer, { images: captureBuffer }, ['overlay']);
@@ -40,6 +43,7 @@ export function clearCaptures(): void {
export function solveCaptures(): Promise<void> {
if (captureBuffer.length === 0) return Promise.resolve();
const images = captureBuffer;
lastSolve = { images };
captureBuffer = [];
broadcastBuffer();
const label =
@@ -70,17 +74,41 @@ async function streamToOverlay(gen: AsyncGenerator<AnswerEvent>, label: string):

/** Stream a coding solution from plain text (clipboard). */
export function runCodingSolve(text: string): Promise<void> {
lastSolve = { text };
return streamToOverlay(solveFromOcr(text, codingLanguage()), 'Coding problem (from clipboard)');
}

/** Stream a coding solution from a single screenshot/region image (OpenAI vision). */
export function runCodingSolveFromImage(dataUrl: string): Promise<void> {
lastSolve = { images: [dataUrl] };
return streamToOverlay(
solveFromImages([dataUrl], codingLanguage()),
'Coding problem (from screenshot)',
);
}

/** Re-run the most recent coding solve (same problem) — picks up the current
* language/model/effort, so the user can iterate via the Cue Card's per-card ↻. */
export function resolveLast(): Promise<void> {
if (!lastSolve) {
const questionId = crypto.randomUUID();
showOverlay();
broadcast(EVENTS.questionDetected, { id: questionId, text: 'Re-solve', type: 'coding' }, [
'overlay',
]);
broadcast(EVENTS.sessionError, { message: 'Nothing to re-solve yet — solve a problem first.' }, [
'overlay',
]);
broadcast(EVENTS.answerDone, { questionId }, ['overlay']);
return Promise.resolve();
}
const gen =
'text' in lastSolve
? solveFromOcr(lastSolve.text, codingLanguage())
: solveFromImages(lastSolve.images, codingLanguage());
return streamToOverlay(gen, 'Coding problem (re-solve)');
}

/**
* Quick coding help from the clipboard: the user copies the problem text and
* presses the hotkey; we answer from that text. Reliable, no OCR.
6 changes: 6 additions & 0 deletions src/main/services/openai/answer.test.ts
@@ -72,6 +72,12 @@ describe('streamAnswer — request body', () => {
expect(userPrompt()).toContain('DETAILED');
});

it('caps story_teller at 420 output tokens', async () => {
await collect(streamAnswer(baseInput({ format: 'story_teller' })));
expect(h.lastBody!.max_output_tokens).toBe(420);
expect(userPrompt()).toContain('STORY TELLER');
});

it('includes the structured pronunciation-guide instruction only when enabled', async () => {
await collect(streamAnswer(baseInput({ pronunciation: true })));
expect(userPrompt()).toMatch(/phonetic respelling/i);
13 changes: 10 additions & 3 deletions src/main/services/openai/answer.ts
@@ -27,6 +27,11 @@ const FORMAT_INSTRUCTION: Record<AnswerFormat, string> = {
detailed:
'FORMAT = DETAILED. A thorough, well-structured spoken answer (~150–220 words) with specifics ' +
'and one concrete example drawn from the context. Natural spoken language, not an essay.',
story_teller:
'FORMAT = STORY TELLER. You are ME telling MY OWN story on my behalf. Tell it as a short, vivid ' +
'first-person STORY (~110–150 words): a quick hook, the challenge/stakes, what I actually did, and ' +
'how it turned out (with a real result from the context). Flowing narrative, not bullets — ' +
'memorable and natural, the way I would tell it in the room. One story, tightly told.',
};

/** Hard output ceiling per format — the model literally cannot exceed this, so
@@ -35,6 +40,7 @@ const FORMAT_MAX_TOKENS: Record<AnswerFormat, number> = {
key_points: 220,
explanation: 340,
detailed: 800,
story_teller: 420,
};

export type AnswerEvent =
@@ -50,9 +56,10 @@ export type AnswerEvent =
}
| { type: 'usage'; prompt: number; completion: number };

const SYSTEM = `You are a live interview copilot. The candidate reads your output WHILE
speaking in a real interview, so it must be instantly skimmable and spoken in their
first-person voice ("I led…", not "The candidate led…").
const SYSTEM = `You ARE the candidate — a second version of them — answering the interview ON
THEIR BEHALF, in first person, as if they are speaking. Never say "the candidate" or "they";
you are them ("I led…", not "The candidate led…"). They read your output WHILE speaking in a
real interview, so it must be instantly skimmable.
Rules:
- FORMAT is a HARD constraint. Obey the requested format EXACTLY — even if you have more
to say. When unsure, be shorter. Never pad. (KEY POINTS especially must stay tiny.)
46 changes: 35 additions & 11 deletions src/main/services/session/sessionManager.ts
@@ -469,8 +469,14 @@ export const sessionManager = {
}
}
} catch (e) {
// Aborted by clear/regenerate — drop this partial answer silently.
if (abort.signal.aborted) return { questionId };
// Aborted by clear/regenerate — drop this partial answer, but still tell the Cue
// Card this question is done so its card stops showing the streaming cursor. (With
// per-card regenerate + history, the aborted card may be a DIFFERENT, still-visible
// one than the card being regenerated.)
if (abort.signal.aborted) {
broadcast(EVENTS.answerDone, { questionId });
return { questionId };
}
// A real failure (auth, quota, network drop, model-not-found): surface it and
// clear the Cue Card's streaming state, instead of leaving the card spinning
// forever with no error (the most common live failure — e.g. an expired key).
@@ -546,18 +552,36 @@ export const sessionManager = {
};
},

/** Re-answer the last question for the active session (e.g. after toggling
* length/format/pronunciation, or via the Cue Card "Regenerate" button).
* Reuses the SAME question row — no new transcript line or DB question. */
async regenerateActive(): Promise<{ regenerated: boolean }> {
if (!live?.lastQuestion) return { regenerated: false };
const q = live.lastQuestion;
/** Re-answer a question for the active session — a SPECIFIC one by id (the Cue
* Card's per-card "Regenerate" button) or, with no id, the last question (after
* toggling format/pronunciation). Reuses the SAME question row — no new transcript
* line or DB question. */
async regenerate(questionId?: string): Promise<{ regenerated: boolean }> {
if (!live) return { regenerated: false };
let qid: string;
let text: string;
if (questionId) {
// A specific card: pull its text from its question row (any question in this session).
const row = db()
.select()
.from(schema.detectedQuestions)
.where(eq(schema.detectedQuestions.id, questionId))
.get();
if (!row) return { regenerated: false }; // e.g. an ad-hoc coding-solve card (not persisted)
qid = questionId;
text = row.text;
} else if (live.lastQuestion) {
qid = live.lastQuestion.questionId;
text = live.lastQuestion.text;
} else {
return { regenerated: false };
}
// Abort the current answer BEFORE clearing the Cue Card, so a late token from
// the aborted stream can't land in the cleared answer.
if (live.answerAbort) live.answerAbort.abort();
// Clear the current answer in the Cue Card (without touching the transcript).
broadcast(EVENTS.answerReset, { questionId: q.questionId });
await this.generateAnswer(live.sessionId, q.questionId, q.text);
// Clear that question's answer in the Cue Card (without touching the transcript).
broadcast(EVENTS.answerReset, { questionId: qid });
await this.generateAnswer(live.sessionId, qid, text);
return { regenerated: true };
},

5 changes: 4 additions & 1 deletion src/preload/index.ts
@@ -150,7 +150,8 @@ const api = {
invoke<{ ok: true }>(IPC.session.setInterviewType, { sessionId, interviewType }),
setAnswering: (enabled: boolean) =>
invoke<{ enabled: boolean; answered: boolean }>(IPC.session.setAnswering, { enabled }),
regenerate: () => invoke<{ regenerated: boolean }>(IPC.session.regenerate),
regenerate: (questionId?: string) =>
invoke<{ regenerated: boolean }>(IPC.session.regenerate, { questionId }),
clearAnswer: () => invoke<{ cleared: boolean }>(IPC.session.clearAnswer),
stop: (sessionId: string) => invoke(IPC.session.stop, { sessionId }),
togglePause: (sessionId: string) => invoke(IPC.session.togglePause, { sessionId }),
@@ -224,6 +225,7 @@ const api = {
addRegion: (image: string) => invoke<{ added: true }>(IPC.capture.addRegion, { image }),
solveBuffer: () => invoke<{ started: true }>(IPC.capture.solveBuffer),
clearBuffer: () => invoke<{ cleared: true }>(IPC.capture.clearBuffer),
resolveLast: () => invoke<{ started: true }>(IPC.capture.resolveLast),
},
overlay: {
show: () => invoke(IPC.overlay.show),
@@ -233,6 +235,7 @@ const api = {
setMode: (mode: 'compact' | 'expanded') => invoke(IPC.overlay.setMode, { mode }),
setOpacity: (opacity: number) => invoke(IPC.overlay.setOpacity, { opacity }),
setClickthrough: (enabled: boolean) => invoke(IPC.overlay.setClickthrough, { enabled }),
copyText: (text: string) => invoke<{ copied: true }>(IPC.overlay.copyText, { text }),
},
privacy: {
get: () => invoke<{ enabled: boolean }>(IPC.privacy.get),