From 986ef16b490bd030c6e7cbb98c5ba8073fd22111 Mon Sep 17 00:00:00 2001
From: daveh-beep
Date: Wed, 22 Apr 2026 14:47:29 -0400
Subject: [PATCH] feat: add communication voice as a new fingerprint domain
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Proposes extending the fingerprint format to support communication voice alongside visual. Four new dimension blocks (tone, structure, register, boundaries) parallel palette/spacing/typography/surfaces.

Adds:
- docs/communication-voice.md — domain spec with dimension definitions, embedding layout, and domain detection rules
- comms-review recipe — scan generated copy for voice drift
- comms-verify recipe — generate → review → iterate loop for copy
- comms-fingerprint.template.md — starter template

No code changes yet — this is the format proposal and skill recipes. Schema changes (making visual dimensions optional so comms-only fingerprints validate) would follow in a separate PR.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/communication-voice.md | 167 ++++++++++++++++++
 .../assets/comms-fingerprint.template.md | 61 +++++++
 .../skill-bundle/references/comms-review.md | 99 +++++++++++
 .../skill-bundle/references/comms-verify.md | 60 +++++++
 4 files changed, 387 insertions(+)
 create mode 100644 docs/communication-voice.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/references/comms-review.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/references/comms-verify.md

diff --git a/docs/communication-voice.md b/docs/communication-voice.md
new file mode 100644
index 0000000..a20f813
--- /dev/null
+++ b/docs/communication-voice.md
+# Communication Voice: A New Fingerprint Domain
+
+Ghost captures visual brand identity — palette, spacing, typography, surfaces. But brand fidelity extends beyond pixels. A brand's **voice** — how it writes, what it refuses to say, the posture it takes in high-stakes moments — drifts the same way colors and spacing do. Faster, actually, because copy has fewer guardrails than design tokens.
+
+This document proposes extending the fingerprint format to support **communication voice** as a first-class domain, parallel to visual.
+
+---
+
+## Why communication voice drifts
+
+Visual drift happens when a developer uses `#3b82f6` instead of `var(--brand-500)`. Communication drift happens when:
+
+- Generated copy sounds too formal for a casual brand (or too casual for a clinical one)
+- Action-first brands bury the CTA in paragraph three
+- Empathy-forward brands skip straight to instructions
+- Copy includes language the brand explicitly avoids
+- Tone shifts abruptly between contexts that should feel coherent
+
+In an agent-authored world, communication drift is the bigger risk. An agent generating a notification, error message, or enforcement action email has no instinct for brand — only what the prompt tells it. If the prompt doesn't encode voice, the output regresses to generic.
+
+Ghost's architecture already handles this. The fingerprint is a contract. The review recipe detects drift. The verify recipe iterates. The remediation verbs record intent. The only thing missing is the dimension set.
+
+---
+
+## Communication dimensions
+
+Visual has four dimension blocks: `palette`, `spacing`, `typography`, `surfaces`. Communication has four parallel blocks:
+
+### `tone`
+
+Captures the brand's voice attributes as a structured scale.
+
+```yaml
+tone:
+  formality: 0.3      # 0 = casual, 1 = formal
+  directness: 0.8     # 0 = hedging, 1 = blunt
+  warmth: 0.6         # 0 = clinical, 1 = personal
+  agency: 0.7         # 0 = prescriptive, 1 = empowering
+  urgency: 0.4        # 0 = relaxed, 1 = pressing
+```
+
+Each value is a 0–1 float. The agent uses these as grounding — "this brand lives at 0.8 directness, so lead with the action, don't build up to it."
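
One way to make the tone scale checkable is to compare a reviewer's estimated reading of a piece of copy against the fingerprint, axis by axis. The sketch below is illustrative only: the `tone_drift` helper, the `0.2` tolerance, and the idea that the reviewing agent supplies per-axis estimates for the copy are all assumptions, not part of this proposal.

```python
from typing import Dict, List, Tuple

# The five tone axes named in the `tone` block.
TONE_AXES = ("formality", "directness", "warmth", "agency", "urgency")


def tone_drift(
    fingerprint: Dict[str, float],
    observed: Dict[str, float],
    tolerance: float = 0.2,  # hypothetical threshold, not defined by the spec
) -> List[Tuple[str, float]]:
    """Return (axis, signed delta) for each axis whose gap exceeds tolerance."""
    findings = []
    for axis in TONE_AXES:
        delta = observed[axis] - fingerprint[axis]
        if abs(delta) > tolerance:
            findings.append((axis, round(delta, 2)))
    return findings


# Fingerprint values copied from the YAML example above.
fingerprint_tone = {
    "formality": 0.3, "directness": 0.8, "warmth": 0.6,
    "agency": 0.7, "urgency": 0.4,
}
# Hypothetical estimates a reviewer might assign to a hedging draft.
observed_tone = {
    "formality": 0.4, "directness": 0.2, "warmth": 0.6,
    "agency": 0.7, "urgency": 0.4,
}

findings = tone_drift(fingerprint_tone, observed_tone)
print(findings)
```

A negative delta means the copy sits below the fingerprint on that axis (here, far less direct than the brand), which tells the generator which way to correct.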
+
+**Drift signal:** generated copy that reads at 0.2 directness when the fingerprint says 0.8.
+
+### `structure`
+
+Captures composition norms — how the brand arranges information.
+
+```yaml
+structure:
+  sentenceLengthNorm: short        # short | medium | long
+  paragraphDensity: tight          # tight | normal | spacious
+  ctaPlacement: top                # top | bottom | inline
+  informationOrder: action-first   # action-first | context-first | empathy-first
+  listStyle: bullets               # bullets | numbered | prose | none
+```
+
+**Drift signal:** copy that buries the CTA when the fingerprint says `ctaPlacement: top`.
+
+### `register`
+
+Captures language level and vocabulary norms.
+
+```yaml
+register:
+  readingLevel: 8             # Flesch-Kincaid grade level target
+  jargonTolerance: low        # none | low | moderate | domain-native
+  contractions: yes           # yes | no | contextual
+  pronouns: second-person     # first-person | second-person | third-person | brand-name
+  sentenceVoice: active       # active | passive | mixed
+```
+
+**Drift signal:** copy at grade 14 when the fingerprint says grade 8. Jargon where the fingerprint says `none`.
+
+### `boundaries`
+
+Captures what the brand refuses to say — the exclusion set.
+
+```yaml
+boundaries:
+  excluded:
+    - pattern: "we're sorry to inform you"
+      reason: "passive, distances the brand from the action"
+    - pattern: "unfortunately"
+      reason: "hedging word, undermines directness"
+    - pattern: "per our policy"
+      reason: "bureaucratic, not human"
+  required:
+    - pattern: "here's what to do next"
+      context: "any action-required communication"
+    - pattern: "contact us"
+      context: "any adverse action"
+  constraints:
+    - id: "no-shame"
+      rule: "never imply the recipient caused the problem"
+    - id: "action-first"
+      rule: "lead with what the recipient can do, not what happened"
+```
+
+**Drift signal:** generated copy that includes an excluded phrase or omits a required element.
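
The `excluded` and `required` lists are the most mechanical part of the boundary check, so they lend themselves to a small sketch. This one assumes case-insensitive substring matching and that the caller knows which `context` labels apply to the copy; both are assumptions, and `boundary_findings` is a hypothetical helper rather than anything in `ghost-drift`. The `constraints` entries (such as `no-shame`) need judgment, so they are left to the reviewing agent.

```python
from typing import Dict, List, Set

# Patterns copied from the `boundaries` YAML example above.
boundaries: Dict[str, List[Dict[str, str]]] = {
    "excluded": [
        {"pattern": "we're sorry to inform you"},
        {"pattern": "unfortunately"},
        {"pattern": "per our policy"},
    ],
    "required": [
        {"pattern": "here's what to do next",
         "context": "any action-required communication"},
        {"pattern": "contact us", "context": "any adverse action"},
    ],
}


def boundary_findings(
    draft: str,
    rules: Dict[str, List[Dict[str, str]]],
    contexts: Set[str],
) -> List[str]:
    """Flag excluded phrases that appear and required phrases that are missing."""
    text = draft.lower()
    findings = []
    for rule in rules["excluded"]:
        if rule["pattern"].lower() in text:
            findings.append(f"excluded: {rule['pattern']}")
    for rule in rules["required"]:
        # A required phrase only applies when its context matches the copy.
        if rule["context"] in contexts and rule["pattern"].lower() not in text:
            findings.append(f"missing: {rule['pattern']}")
    return findings


draft = (
    "Unfortunately, your payout is on hold. "
    "Here's what to do next: verify your bank account."
)
findings = boundary_findings(
    draft,
    boundaries,
    {"any action-required communication", "any adverse action"},
)
print(findings)
```

Substring matching is deliberately naive; a real check would likely need to handle punctuation variants and paraphrases of the excluded phrases.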
+
+---
+
+## The partition (same rule as visual)
+
+Communication dimensions follow the same partition as visual:
+
+| Fingerprint field | Lives in | Section / key |
+|---|---|---|
+| `tone`, `structure`, `register`, `boundaries` | Frontmatter | top-level |
+| `observation.personality`, `observation.voiceArchetypes` | Frontmatter | `observation:` |
+| `observation.summary` | Body | `# Character` |
+| `observation.distinctiveTraits` | Body | `# Signature` bullets |
+| `decisions[].dimension`, `decisions[].embedding` | Frontmatter | `decisions:` entry |
+| `decisions[].decision` (prose rationale) | Body | `### dimension` block |
+
+No duplication. Prose in frontmatter is a lint error. Structured data in the body is a lint error.
+
+---
+
+## Domain detection
+
+A fingerprint can be:
+- **Visual-only:** has `palette`, `spacing`, `typography`, `surfaces`. No `tone`.
+- **Communication-only:** has `tone`, `structure`, `register`, `boundaries`. No `palette`.
+- **Both:** has all eight blocks. A full brand fingerprint.
+
+`ghost-drift lint` validates whichever dimensions are present. `ghost-drift compare` computes distance over the dimensions both fingerprints share.
+
+---
+
+## Embedding
+
+Visual fingerprints use a 49-dimensional vector (palette 0–20, spacing 21–30, typography 31–40, surfaces 41–48). Communication fingerprints use a parallel vector:
+
+| Dimensions | Category | What it captures |
+|---|---|---|
+| 0–4 | Tone | formality, directness, warmth, agency, urgency |
+| 5–9 | Structure | sentence length, paragraph density, CTA placement, information order, list style |
+| 10–14 | Register | reading level, jargon tolerance, contractions, pronouns, sentence voice |
+| 15–19 | Boundaries | exclusion count, required count, constraint count, specificity, coverage |
+
+20 dimensions. Combined visual+comms fingerprints have 69 dimensions.
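
The index ranges above can be sanity-checked with a few lines of arithmetic. The sketch below encodes both layouts as inclusive ranges and derives a combined layout. Note the assumption: the proposal gives the combined width (69) but not the ordering, so placing the visual block first and offsetting the communication block by 49 is a guess made for illustration, and `combined_layout` is a hypothetical helper.

```python
from typing import Dict, Tuple

Layout = Dict[str, Tuple[int, int]]  # name -> (first index, last index), inclusive

VISUAL_LAYOUT: Layout = {
    "palette": (0, 20),
    "spacing": (21, 30),
    "typography": (31, 40),
    "surfaces": (41, 48),
}
COMMS_LAYOUT: Layout = {
    "tone": (0, 4),
    "structure": (5, 9),
    "register": (10, 14),
    "boundaries": (15, 19),
}


def width(layout: Layout) -> int:
    """Total number of vector slots covered by a layout."""
    return sum(end - start + 1 for start, end in layout.values())


def combined_layout(visual: Layout, comms: Layout) -> Layout:
    """Concatenate layouts, visual first (assumed ordering, not from the spec)."""
    offset = width(visual)
    merged = dict(visual)
    for name, (start, end) in comms.items():
        merged[name] = (start + offset, end + offset)
    return merged


full = combined_layout(VISUAL_LAYOUT, COMMS_LAYOUT)
print(width(VISUAL_LAYOUT), width(COMMS_LAYOUT), width(full))
print(full["tone"], full["boundaries"])
```

Under that assumed ordering, `tone` would occupy slots 49 to 53 and `boundaries` slots 64 to 68 of the combined vector.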
+
+---
+
+## Recipes
+
+Two new skill recipes extend the existing set:
+
+- **`comms-review`** — scan generated copy for voice drift against the fingerprint. Same flow as `review.md` but checks tone/structure/register/boundaries instead of palette/spacing/typography/surfaces.
+- **`comms-verify`** — generate → comms-review → iterate loop for copy generation. Same flow as `verify.md`.
+
+The existing `profile`, `compare`, `discover` recipes work unchanged — they read whichever dimensions are present.
+
+---
+
+## What this enables
+
+- Any team can write a `fingerprint.md` for their brand's voice
+- Any agent can author copy against it
+- Drift gets caught at generation time, not after it ships
+- Cross-brand comparison shows where voices diverge (intentionally or not)
+- The same `ack/adopt/diverge` remediation works for voice decisions
+
+Visual is the first domain. Communication is the second. The format and the detection architecture are domain-agnostic — this is the proof.
diff --git a/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md b/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
new file mode 100644
index 0000000..473fc96
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
+---
+name: ""
+slug: ""
+schema: 4
+generator: ghost@0.9.0
+generated: ""
+confidence: 0.0
+source: llm
+id: ""
+timestamp: ""
+
+observation:
+  personality: []
+  voiceArchetypes: []
+
+decisions: []
+
+tone:
+  formality: 0.5
+  directness: 0.5
+  warmth: 0.5
+  agency: 0.5
+  urgency: 0.5
+
+structure:
+  sentenceLengthNorm: medium
+  paragraphDensity: normal
+  ctaPlacement: top
+  informationOrder: action-first
+  listStyle: bullets
+
+register:
+  readingLevel: 8
+  jargonTolerance: low
+  contractions: yes
+  pronouns: second-person
+  sentenceVoice: active
+
+boundaries:
+  excluded: []
+  required: []
+  constraints: []
+---
+
+# Character
+
+
+
+# Signature
+
+
+
+-
+
+# Decisions
+
+
+
+### example-decision
+
+
diff --git a/packages/ghost-drift/src/skill-bundle/references/comms-review.md b/packages/ghost-drift/src/skill-bundle/references/comms-review.md
new file mode 100644
index 0000000..64f0d1a
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/references/comms-review.md
+---
+name: comms-review
+description: Flag generated or drafted copy that drifts from the communication voice in fingerprint.md.
+handoffs:
+  - label: Regenerate drifting copy to match the fingerprint
+    skill: comms-verify
+    prompt: Regenerate the drifting copy against fingerprint.md and re-review
+  - label: Accept the drift as aligned reality
+    command: ghost-drift ack
+    prompt: Acknowledge that the current fingerprint.md no longer matches and record the drift
+  - label: Declare a dimension intentionally divergent
+    command: ghost-drift diverge
+    prompt: Record an intentional divergence on a specific dimension so it stops flagging
+---
+
+# Recipe: Review copy for communication voice drift
+
+**Goal:** flag generated or drafted copy that drifts from the communication voice defined in the local `fingerprint.md`.
+
+Ghost has no `ghost comms-review` CLI command. You — the host agent — are the reviewer. The `fingerprint.md` is your rubric.
+
+## Steps
+
+### 1. Read the fingerprint
+
+    cat fingerprint.md
+
+Absorb the communication dimensions: `tone` (formality, directness, warmth, agency, urgency), `structure` (sentence length, CTA placement, information order), `register` (reading level, jargon tolerance, pronouns), `boundaries` (excluded phrases, required elements, constraints).
+
+If no communication dimensions are present in the fingerprint, tell the user. The fingerprint may be visual-only. Offer to extend it with communication dimensions via the profile recipe.
+
+### 2. Collect the copy
+
+Read the generated or drafted copy. This may be:
+- A generated notification, email, or in-app message
+- A PR diff that modifies user-facing strings
+- A template file (`.md`, `.txt`, `.html`, `.json` with text fields)
+- Output from an AI generation endpoint
+
+### 3. Scan for drift
+
+For each piece of copy, check against the fingerprint dimensions:
+
+- **Tone drift:**
+  - `formality`: does the copy read more formal or casual than the fingerprint's scale? A brand at 0.3 formality shouldn't say "we regret to inform you."
+  - `directness`: is the CTA buried? Does the copy hedge when the fingerprint says blunt?
+  - `warmth`: is it clinical when it should be personal, or sentimental when it should be matter-of-fact?
+  - `agency`: does the copy tell the reader what to do, or empower them with options? Match the fingerprint.
+  - `urgency`: does the copy create more or less pressure than the fingerprint intends?
+
+- **Structure drift:**
+  - `sentenceLengthNorm`: are sentences significantly longer or shorter than the norm?
+  - `ctaPlacement`: is the call-to-action where the fingerprint says it should be?
+  - `informationOrder`: action-first brands should lead with the action. Context-first brands should set the scene. Empathy-first brands should acknowledge first.
+  - `paragraphDensity`: wall of text when the fingerprint says tight? One-liners when it says spacious?
+
+- **Register drift:**
+  - `readingLevel`: is the copy significantly above or below the target grade level?
+  - `jargonTolerance`: does the copy use technical terms when the fingerprint says `none`?
+  - `contractions`: "we will" when the fingerprint says use contractions? "we'll" when it says don't?
+  - `pronouns`: "the customer" when the fingerprint says second-person ("you")?
+  - `sentenceVoice`: passive constructions when the fingerprint says active?
+
+- **Boundary drift:**
+  - `excluded`: does the copy contain any phrase from the exclusion list?
+  - `required`: is the copy missing any required element for its context?
+  - `constraints`: does the copy violate any constraint (e.g., implying blame when the fingerprint says never)?
+
+### 4. Filter noise
+
+Drop matches that aren't real drift:
+
+- Quoted text from external sources the brand is referencing
+- Legal boilerplate that can't be rewritten (regulatory language)
+- Placeholder copy clearly marked as draft
+- Copy in a context the fingerprint doesn't cover (if the fingerprint is scoped to "enforcement comms" and the copy is marketing, note the gap but don't flag)
+- Intentional divergence: if `.ghost-sync.json` records a dimension as `diverging`, note it but don't flag
+
+### 5. Produce the review
+
+Group findings by dimension. Lead with the most impactful drift. For each finding:
+
+- **Where:** the specific sentence or phrase
+- **What was found:** the drifting language
+- **What the fingerprint says:** the expected voice attribute
+- **Why it matters:** one sentence connecting to brand impact
+- **Suggested fix:** rewrite that matches the fingerprint
+
+Format:
+- **Ad-hoc chat:** markdown with the drifting text quoted
+- **PR review:** inline comments on string changes + summary comment
+- **Generation pipeline:** structured JSON with drift annotations per output
+
+### 6. Record stance if the user accepts the drift
+
+Same remediation as visual drift:
+- `ghost-drift ack` — accept drift across the board
+- `ghost-drift diverge --reason "..."` — intentional divergence on one dimension
+- `ghost-drift adopt ` — adopt a new voice baseline
diff --git a/packages/ghost-drift/src/skill-bundle/references/comms-verify.md b/packages/ghost-drift/src/skill-bundle/references/comms-verify.md
new file mode 100644
index 0000000..e10aa99
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/references/comms-verify.md
+---
+name: comms-verify
+description: Confirm generated copy stays within fingerprint.md voice bounds; iterate if not.
+handoffs:
+  - label: Regenerate with feedback from the review
+    skill: comms-verify
+    prompt: Regenerate the copy using the review findings as constraints
+  - label: Update the fingerprint to capture an uncaptured voice decision
+    skill: profile
+    prompt: Add the missing voice decision to fingerprint.md and re-lint
+---
+
+# Recipe: Verify generated copy against the fingerprint
+
+**Goal:** confirm that generated copy (a notification, email, error message, or any user-facing text) stays within the communication voice bounds of the local `fingerprint.md`. This is the "generate → review → iterate" loop for copy.
+
+Ghost has no `ghost comms-verify` CLI command. You drive the loop; the fingerprint is the contract.
+
+## Steps
+
+### 1. Generate
+
+Produce the copy. Work from whatever the user asked for. Respect `tone` (formality, directness, warmth, agency, urgency), `structure` (CTA placement, information order), `register` (reading level, jargon tolerance), `boundaries` (excluded phrases, required elements, constraints).
+
+### 2. Self-review
+
+Apply the [comms-review recipe](comms-review.md) to the generated copy. Scan for drift across all four communication dimensions. Group findings by dimension.
+
+### 3. Decide
+
+- **No findings** → pass. The copy is aligned. Report back to the user.
+- **Findings exist** → iterate:
+  - For each finding, identify the fingerprint value the generator should have followed.
+  - Regenerate with explicit guidance: "Use second-person pronouns ('you') instead of third-person ('the customer'). Lead with the action per `structure.informationOrder: action-first`. Drop 'unfortunately' per `boundaries.excluded`."
+  - Re-run the comms-review. Up to 3 iterations.
+  - If still drifting after 3 tries: report to the user. The fingerprint may be missing a voice decision the generator needs, or the generation prompt may be too loose.
+
+### 4. (Optional) Suite verification
+
+If the user is iterating on the fingerprint's communication dimensions:
+
+- Generate against a suite of diverse contexts (welcome email, error notification, enforcement action, support response, transactional receipt, marketing CTA).
+- Run comms-review against each.
+- Classify each dimension as **tight** (no drift), **leaky** (occasional drift), or **uncaptured** (frequent drift).
+- "Uncaptured" dimensions are the signal the fingerprint is missing a voice decision. Tell the user which one to add.
+
+### 5. Return with annotations
+
+When the loop completes, return the final copy with:
+- Which dimensions were checked
+- Any remaining drift that couldn't be resolved (with the fingerprint decision that was violated)
+- The iteration count (0 = first-pass clean, 3 = max iterations hit)
+
+This gives the human reviewer a head start — they know exactly where to look.
+
+## Why the loop matters
+
+The fingerprint is a contract. Generation tests the contract. Drift shows where the contract is ambiguous or silent. Use comms-verify results to refine both the generator's prompt and the fingerprint's voice decisions.
+
+A visual fingerprint missing a border-radius decision produces a leaky component. A communication fingerprint missing a tone decision produces a message that doesn't sound like the brand. Both are the same problem. Both are fixed the same way: add the decision, re-verify.
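
The generate → review → iterate loop in the comms-verify recipe (step 3, with the iteration count reported in step 5) can be sketched as a small driver. This is a minimal illustration under stated assumptions: `comms_verify`, `stub_generate`, and `stub_review` are hypothetical names, and in the recipe itself the host agent plays both the generator and the reviewer rather than calling functions.

```python
from typing import Callable, List, Tuple

MAX_ITERATIONS = 3  # the recipe allows up to 3 regeneration attempts


def comms_verify(
    generate: Callable[[List[str]], str],  # takes review feedback, returns copy
    review: Callable[[str], List[str]],    # returns drift findings (empty = clean)
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[str, List[str], int]:
    """Return (final copy, unresolved findings, iteration count)."""
    text = generate([])       # first pass: no feedback yet
    findings = review(text)
    iterations = 0            # 0 means the first pass was clean
    while findings and iterations < max_iterations:
        iterations += 1
        text = generate(findings)  # regenerate with findings as guidance
        findings = review(text)
    return text, findings, iterations


# Stand-ins for the agent's behaviour, purely for demonstration.
def stub_generate(feedback: List[str]) -> str:
    if any("unfortunately" in note for note in feedback):
        return "Your payout is on hold. Here's what to do next."
    return "Unfortunately, your payout is on hold."


def stub_review(text: str) -> List[str]:
    if "unfortunately" in text.lower():
        return ["drop 'unfortunately' per boundaries.excluded"]
    return []


final_copy, remaining, iterations = comms_verify(stub_generate, stub_review)
print(final_copy, remaining, iterations)
```

Returning the unresolved findings alongside the copy keeps the "still drifting after 3 tries" case visible to the caller instead of hiding it behind a pass/fail flag.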