From 986ef16b490bd030c6e7cbb98c5ba8073fd22111 Mon Sep 17 00:00:00 2001
From: daveh-beep <daveh@squareup.com>
Date: Wed, 22 Apr 2026 14:47:29 -0400
Subject: [PATCH] feat: add communication voice as a new fingerprint domain
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Proposes extending the fingerprint format to support communication
voice alongside visual. Four new dimension blocks (tone, structure,
register, boundaries) parallel palette/spacing/typography/surfaces.

Adds:
- docs/communication-voice.md — domain spec with dimension definitions,
  embedding layout, and domain detection rules
- comms-review recipe — scan generated copy for voice drift
- comms-verify recipe — generate → review → iterate loop for copy
- comms-fingerprint.template.md — starter template

No code changes yet — this is the format proposal and skill recipes.
Schema changes (making visual dimensions optional so comms-only
fingerprints validate) would follow in a separate PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 docs/communication-voice.md                   | 167 ++++++++++++++++++
 .../assets/comms-fingerprint.template.md      |  61 +++++++
 .../skill-bundle/references/comms-review.md   |  99 +++++++++++
 .../skill-bundle/references/comms-verify.md   |  60 +++++++
 4 files changed, 387 insertions(+)
 create mode 100644 docs/communication-voice.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/references/comms-review.md
 create mode 100644 packages/ghost-drift/src/skill-bundle/references/comms-verify.md

diff --git a/docs/communication-voice.md b/docs/communication-voice.md
new file mode 100644
index 0000000..a20f813
--- /dev/null
+++ b/docs/communication-voice.md
@@ -0,0 +1,167 @@
+# Communication Voice: A New Fingerprint Domain
+
+Ghost captures visual brand identity — palette, spacing, typography, surfaces. But brand fidelity extends beyond pixels. A brand's **voice** — how it writes, what it refuses to say, the posture it takes in high-stakes moments — drifts the same way colors and spacing do. Faster, actually, because copy has fewer guardrails than design tokens.
+
+This document proposes extending the fingerprint format to support **communication voice** as a first-class domain, parallel to visual.
+
+---
+
+## Why communication voice drifts
+
+Visual drift happens when a developer uses `#3b82f6` instead of `var(--brand-500)`. Communication drift happens when:
+
+- Generated copy sounds too formal for a casual brand (or too casual for a clinical one)
+- Action-first brands bury the CTA in paragraph three
+- Empathy-forward brands skip straight to instructions
+- Copy includes language the brand explicitly avoids
+- Tone shifts abruptly between contexts that should feel coherent
+
+In an agent-authored world, communication drift is the bigger risk. An agent generating a notification, error message, or enforcement action email has no instinct for brand — only what the prompt tells it. If the prompt doesn't encode voice, the output regresses to generic.
+
+Ghost's architecture already handles this. The fingerprint is a contract. The review recipe detects drift. The verify recipe iterates. The remediation verbs record intent. The only thing missing is the dimension set.
+
+---
+
+## Communication dimensions
+
+Visual has four dimension blocks: `palette`, `spacing`, `typography`, `surfaces`. Communication has four parallel blocks:
+
+### `tone`
+
+Captures the brand's voice attributes as a structured scale.
+
+```yaml
+tone:
+  formality: 0.3       # 0 = casual, 1 = formal
+  directness: 0.8      # 0 = hedging, 1 = blunt
+  warmth: 0.6          # 0 = clinical, 1 = personal
+  agency: 0.7          # 0 = prescriptive, 1 = empowering
+  urgency: 0.4         # 0 = relaxed, 1 = pressing
+```
+
+Each value is a 0–1 float. The agent uses these as grounding — "this brand lives at 0.8 directness, so lead with the action, don't build up to it."
+
+**Drift signal:** generated copy that reads at 0.2 directness when the fingerprint says 0.8.
+
+### `structure`
+
+Captures composition norms — how the brand arranges information.
+
+```yaml
+structure:
+  sentenceLengthNorm: short    # short | medium | long
+  paragraphDensity: tight      # tight | normal | spacious
+  ctaPlacement: top            # top | bottom | inline
+  informationOrder: action-first  # action-first | context-first | empathy-first
+  listStyle: bullets           # bullets | numbered | prose | none
+```
+
+**Drift signal:** copy that buries the CTA when the fingerprint says `ctaPlacement: top`.
+
+### `register`
+
+Captures language level and vocabulary norms.
+
+```yaml
+register:
+  readingLevel: 8              # Flesch-Kincaid grade level target
+  jargonTolerance: low         # none | low | moderate | domain-native
+  contractions: "yes"          # "yes" | "no" | contextual; quoted so YAML 1.1 parsers don't coerce to a boolean
+  pronouns: second-person      # first-person | second-person | third-person | brand-name
+  sentenceVoice: active        # active | passive | mixed
+```
+
+**Drift signal:** copy at grade 14 when the fingerprint says grade 8. Jargon where the fingerprint says `none`.
+
+### `boundaries`
+
+Captures what the brand refuses to say — the exclusion set.
+
+```yaml
+boundaries:
+  excluded:
+    - pattern: "we're sorry to inform you"
+      reason: "passive, distances the brand from the action"
+    - pattern: "unfortunately"
+      reason: "hedging word, undermines directness"
+    - pattern: "per our policy"
+      reason: "bureaucratic, not human"
+  required:
+    - pattern: "here's what to do next"
+      context: "any action-required communication"
+    - pattern: "contact us"
+      context: "any adverse action"
+  constraints:
+    - id: "no-shame"
+      rule: "never imply the recipient caused the problem"
+    - id: "action-first"
+      rule: "lead with what the recipient can do, not what happened"
+```
+
+**Drift signal:** generated copy that includes an excluded phrase or omits a required element.
+
+---
+
+## The partition (same rule as visual)
+
+Communication dimensions follow the same partition as visual:
+
+| Fingerprint field | Lives in | Section / key |
+|---|---|---|
+| `tone`, `structure`, `register`, `boundaries` | Frontmatter | top-level |
+| `observation.personality`, `observation.voiceArchetypes` | Frontmatter | `observation:` |
+| `observation.summary` | Body | `# Character` |
+| `observation.distinctiveTraits` | Body | `# Signature` bullets |
+| `decisions[].dimension`, `decisions[].embedding` | Frontmatter | `decisions:` entry |
+| `decisions[].decision` (prose rationale) | Body | `### dimension` block |
+
+No duplication. Prose in frontmatter is a lint error. Structured data in the body is a lint error.
+
+---
+
+## Domain detection
+
+A fingerprint can be:
+- **Visual-only:** has `palette`, `spacing`, `typography`, `surfaces`. No `tone`.
+- **Communication-only:** has `tone`, `structure`, `register`, `boundaries`. No `palette`.
+- **Both:** has all eight blocks. A full brand fingerprint.
+
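+`ghost-drift lint` validates whichever dimension blocks are present (for communication-only fingerprints, this depends on the schema treating the visual blocks as optional). `ghost-drift compare` computes distance over the dimensions both fingerprints share.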
+
+---
+
+## Embedding
+
+Visual fingerprints use a 49-dimensional vector (palette 0–20, spacing 21–30, typography 31–40, surfaces 41–48). Communication fingerprints use a parallel vector:
+
+| Dimensions | Category | What it captures |
+|---|---|---|
+| 0–4 | Tone | formality, directness, warmth, agency, urgency |
+| 5–9 | Structure | sentence length, paragraph density, CTA placement, information order, list style |
+| 10–14 | Register | reading level, jargon tolerance, contractions, pronouns, sentence voice |
+| 15–19 | Boundaries | exclusion count, required count, constraint count, specificity, coverage |
+
+That is 20 dimensions. A combined visual+comms fingerprint has 49 + 20 = 69 dimensions.
+
+---
+
+## Recipes
+
+Two new skill recipes extend the existing set:
+
+- **`comms-review`** — scan generated copy for voice drift against the fingerprint. Same flow as `review.md` but checks tone/structure/register/boundaries instead of palette/spacing/typography/surfaces.
+- **`comms-verify`** — generate → comms-review → iterate loop for copy generation. Same flow as `verify.md`.
+
+The existing `profile`, `compare`, and `discover` recipes are expected to work unchanged, because they read whichever dimensions are present.
+
+---
+
+## What this enables
+
+- Any team can write a `fingerprint.md` for their brand's voice
+- Any agent can author copy against it
+- Drift gets caught at generation time, not after it ships
+- Cross-brand comparison shows where voices diverge (intentionally or not)
+- The same `ack/adopt/diverge` remediation works for voice decisions
+
+Visual is the first domain. Communication is the second. The format and the detection architecture are domain-agnostic — this is the proof.
diff --git a/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md b/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
new file mode 100644
index 0000000..473fc96
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/assets/comms-fingerprint.template.md
@@ -0,0 +1,61 @@
+---
+name: ""
+slug: ""
+schema: 4
+generator: ghost@0.9.0
+generated: ""
+confidence: 0.0
+source: llm
+id: ""
+timestamp: ""
+
+observation:
+  personality: []
+  voiceArchetypes: []
+
+decisions: []
+
+tone:
+  formality: 0.5
+  directness: 0.5
+  warmth: 0.5
+  agency: 0.5
+  urgency: 0.5
+
+structure:
+  sentenceLengthNorm: medium
+  paragraphDensity: normal
+  ctaPlacement: top
+  informationOrder: action-first
+  listStyle: bullets
+
+register:
+  readingLevel: 8
+  jargonTolerance: low
+  contractions: "yes"
+  pronouns: second-person
+  sentenceVoice: active
+
+boundaries:
+  excluded: []
+  required: []
+  constraints: []
+---
+
+# Character
+
+<!-- One paragraph. The opening read: what does this brand sound like when it speaks? Not marketing copy — the actual posture. How does it handle good news? Bad news? Uncertainty? Write it so someone who has never heard this brand can hear it after reading this paragraph. -->
+
+# Signature
+
+<!-- 3–7 bullets. The distinctive traits that make this voice unlike its peers. What would someone notice if they read 20 messages from this brand back-to-back? What would they miss if they read a competitor instead? -->
+
+-
+
+# Decisions
+
+<!-- One `### dimension` block per voice decision. Each block has a prose rationale explaining *why* this choice was made, grounded in evidence. Dimensions should match the `decisions:` entries in the frontmatter. -->
+
+### example-decision
+
+<!-- Why this choice? What evidence? What's the alternative and why was it rejected? -->
diff --git a/packages/ghost-drift/src/skill-bundle/references/comms-review.md b/packages/ghost-drift/src/skill-bundle/references/comms-review.md
new file mode 100644
index 0000000..64f0d1a
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/references/comms-review.md
@@ -0,0 +1,99 @@
+---
+name: comms-review
+description: Flag generated or drafted copy that drifts from the communication voice in fingerprint.md.
+handoffs:
+  - label: Regenerate drifting copy to match the fingerprint
+    skill: comms-verify
+    prompt: Regenerate the drifting copy against fingerprint.md and re-review
+  - label: Accept the drift as aligned reality
+    command: ghost-drift ack
+    prompt: Acknowledge that the current fingerprint.md no longer matches and record the drift
+  - label: Declare a dimension intentionally divergent
+    command: ghost-drift diverge
+    prompt: Record an intentional divergence on a specific dimension so it stops flagging
+---
+
+# Recipe: Review copy for communication voice drift
+
+**Goal:** flag generated or drafted copy that drifts from the communication voice defined in the local `fingerprint.md`.
+
+Ghost has no `ghost-drift comms-review` CLI command. You, the host agent, are the reviewer. The `fingerprint.md` is your rubric.
+
+## Steps
+
+### 1. Read the fingerprint
+
+    cat fingerprint.md
+
+Absorb the communication dimensions: `tone` (formality, directness, warmth, agency, urgency), `structure` (sentence length, paragraph density, CTA placement, information order, list style), `register` (reading level, jargon tolerance, contractions, pronouns, sentence voice), and `boundaries` (excluded phrases, required elements, constraints).
+
+If no communication dimensions are present in the fingerprint, tell the user. The fingerprint may be visual-only. Offer to extend it with communication dimensions via the profile recipe.
+
+### 2. Collect the copy
+
+Read the generated or drafted copy. This may be:
+- A generated notification, email, or in-app message
+- A PR diff that modifies user-facing strings
+- A template file (`.md`, `.txt`, `.html`, `.json` with text fields)
+- Output from an AI generation endpoint
+
+### 3. Scan for drift
+
+For each piece of copy, check against the fingerprint dimensions:
+
+- **Tone drift:**
+  - `formality`: does the copy read more formal or casual than the fingerprint's scale? A brand at 0.3 formality shouldn't say "we regret to inform you."
+  - `directness`: is the CTA buried? Does the copy hedge when the fingerprint says blunt?
+  - `warmth`: is it clinical when it should be personal, or sentimental when it should be matter-of-fact?
+  - `agency`: does the copy tell the reader what to do, or empower them with options? Match the fingerprint.
+  - `urgency`: does the copy create more or less pressure than the fingerprint intends?
+
+- **Structure drift:**
+  - `sentenceLengthNorm`: are sentences significantly longer or shorter than the norm?
+  - `ctaPlacement`: is the call-to-action where the fingerprint says it should be?
+  - `informationOrder`: action-first brands should lead with the action. Context-first brands should set the scene. Empathy-first brands should acknowledge first.
+  - `paragraphDensity`: wall of text when the fingerprint says tight? One-liners when it says spacious?
+
+- **Register drift:**
+  - `readingLevel`: is the copy significantly above or below the target grade level?
+  - `jargonTolerance`: does the copy use technical terms when the fingerprint says `none`?
+  - `contractions`: "we will" when the fingerprint says use contractions? "we'll" when it says don't?
+  - `pronouns`: "the customer" when the fingerprint says second-person ("you")?
+  - `sentenceVoice`: passive constructions when the fingerprint says active?
+
+- **Boundary drift:**
+  - `excluded`: does the copy contain any phrase from the exclusion list?
+  - `required`: is the copy missing any required element for its context?
+  - `constraints`: does the copy violate any constraint (e.g., implying blame when the fingerprint says never)?
+
+### 4. Filter noise
+
+Drop matches that aren't real drift:
+
+- Quoted text from external sources the brand is referencing
+- Legal boilerplate that can't be rewritten (regulatory language)
+- Placeholder copy clearly marked as draft
+- Copy in a context the fingerprint doesn't cover (if the fingerprint is scoped to "enforcement comms" and the copy is marketing, note the gap but don't flag)
+- Intentional divergence: if `.ghost-sync.json` records a dimension as `diverging`, note it but don't flag
+
+### 5. Produce the review
+
+Group findings by dimension. Lead with the most impactful drift. For each finding:
+
+- **Where:** the specific sentence or phrase
+- **What was found:** the drifting language
+- **What the fingerprint says:** the expected voice attribute
+- **Why it matters:** one sentence connecting to brand impact
+- **Suggested fix:** rewrite that matches the fingerprint
+
+Format:
+- **Ad-hoc chat:** markdown with the drifting text quoted
+- **PR review:** inline comments on string changes + summary comment
+- **Generation pipeline:** structured JSON with drift annotations per output
+
+### 6. Record stance if the user accepts the drift
+
+Same remediation as visual drift:
+- `ghost-drift ack` — accept drift across the board
+- `ghost-drift diverge <dimension> --reason "..."` — intentional divergence on one dimension
+- `ghost-drift adopt <parent.md>` — adopt a new voice baseline
diff --git a/packages/ghost-drift/src/skill-bundle/references/comms-verify.md b/packages/ghost-drift/src/skill-bundle/references/comms-verify.md
new file mode 100644
index 0000000..e10aa99
--- /dev/null
+++ b/packages/ghost-drift/src/skill-bundle/references/comms-verify.md
@@ -0,0 +1,60 @@
+---
+name: comms-verify
+description: Confirm generated copy stays within fingerprint.md voice bounds; iterate if not.
+handoffs:
+  - label: Regenerate with feedback from the review
+    skill: comms-verify
+    prompt: Regenerate the copy using the review findings as constraints
+  - label: Update the fingerprint to capture an uncaptured voice decision
+    skill: profile
+    prompt: Add the missing voice decision to fingerprint.md and re-lint
+---
+
+# Recipe: Verify generated copy against the fingerprint
+
+**Goal:** confirm that generated copy (a notification, email, error message, or any user-facing text) stays within the communication voice bounds of the local `fingerprint.md`. This is the "generate → review → iterate" loop for copy.
+
+Ghost has no `ghost-drift comms-verify` CLI command. You drive the loop; the fingerprint is the contract.
+
+## Steps
+
+### 1. Generate
+
+Produce the copy. Work from whatever the user asked for. Respect `tone` (formality, directness, warmth, agency, urgency), `structure` (sentence length, paragraph density, CTA placement, information order, list style), `register` (reading level, jargon tolerance, contractions, pronouns, sentence voice), and `boundaries` (excluded phrases, required elements, constraints).
+
+### 2. Self-review
+
+Apply the [comms-review recipe](comms-review.md) to the generated copy. Scan for drift across all four communication dimension blocks (`tone`, `structure`, `register`, `boundaries`). Group findings by dimension.
+
+### 3. Decide
+
+- **No findings** → pass. The copy is aligned. Report back to the user.
+- **Findings exist** → iterate:
+  - For each finding, identify the fingerprint value the generator should have followed.
+  - Regenerate with explicit guidance: "Use second-person pronouns ('you') instead of third-person ('the customer'). Lead with the action per `structure.informationOrder: action-first`. Drop 'unfortunately' per `boundaries.excluded`."
+  - Re-run the comms-review. Up to 3 iterations.
+  - If still drifting after 3 tries: report to the user. The fingerprint may be missing a voice decision the generator needs, or the generation prompt may be too loose.
+
+### 4. (Optional) Suite verification
+
+If the user is iterating on the fingerprint's communication dimensions:
+
+- Generate against a suite of diverse contexts (welcome email, error notification, enforcement action, support response, transactional receipt, marketing CTA).
+- Run comms-review against each.
+- Classify each dimension as **tight** (no drift), **leaky** (occasional drift), or **uncaptured** (frequent drift).
+- An **uncaptured** dimension signals that the fingerprint is missing a voice decision. Tell the user which decision to add.
+
+### 5. Return with annotations
+
+When the loop completes, return the final copy with:
+- Which dimensions were checked
+- Any remaining drift that couldn't be resolved (with the fingerprint decision that was violated)
+- The iteration count (0 = first-pass clean, 3 = max iterations hit)
+
+This gives the human reviewer a head start — they know exactly where to look.
+
+## Why the loop matters
+
+The fingerprint is a contract. Generation tests the contract. Drift shows where the contract is ambiguous or silent. Use comms-verify results to refine both the generator's prompt and the fingerprint's voice decisions.
+
+A visual fingerprint missing a border-radius decision produces a leaky component. A communication fingerprint missing a tone decision produces a message that doesn't sound like the brand. Both are the same problem. Both are fixed the same way: add the decision, re-verify.