mrviduus · Jun 15, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,18 @@
 
 ## [Unreleased]
 
+### Phase 7 — crew specialist sub-agents + prompts (AI-041) (2026-06-15)
+
+The four generic, **single-call** crew specialists the content crews (AI-042/043) compose via `CrewTasks.Of` + `CrewOrchestrator` (AI-040). Each is exactly ONE `ILlmService` gateway call — no tools, no `AgentLoop`, no iteration — and is domain-agnostic (no SEO/AutoPublish specifics): they operate on a shared `ContentBrief` (length in CHARACTERS, banned phrases, target language, optional style guide). **Why these four, in this order**: a researcher condenses the source into grounded bullet FACTS; a drafter writes the field strictly from those notes; a critic scores the draft 1-5 **against the research notes** (not its own knowledge) — every claim not supported by the notes is a factual-accuracy `blocker` — and an editor rewrites fixing each issue blockers-first. Grounding the critic on the research notes is the crux: it turns "does this sound plausible?" into "is this actually in the source?", which is what catches hallucinations the drafter slipped in.
+
+- **`Application/Agents/CrewAgentContracts.cs`** (new) — the records threaded through a crew: `ContentBrief`, `ResearchInput`/`ResearchNotes`, `DraftInput`/`Draft`, `CritiqueInput`/`EditInput`, `CritiqueResult` (four 1-5 scores + `Issues` + `ParseFailed`), `CritiqueIssue` (severity `blocker|major|minor`). All read the same brief so "write to N chars" and "score length against N" can never drift.
+- **`Application/Agents/SingleCallAgent.cs`** (new, abstract) — base for a one-gateway-call `IAgent<TIn,TOut>`: subclasses own only `FeatureTag` (routing), `MaxOutputTokens`, `BuildPrompt`, `Parse`; the base does the request/timing/step/usage plumbing so each specialist is ~15 lines. Produces the same shape the orchestrator schedules — one `"llm_response"` `AgentStep` + `AgentUsage(Iterations: 1, …)` mapped from the gateway's `LlmUsage`.
+- **`ResearcherAgent` / `DrafterAgent` / `CriticAgent` / `EditorAgent`** (new) — feature tags `crew.researcher` / `crew.drafter` / `crew.critic` / `crew.editor`; token budgets 600/500/700/500. Researcher/drafter/editor parse = trimmed text; critic parses via `CriticOutputParser`.
+- **`Application/Agents/CriticOutputParser.cs`** (new, pure): strips ```` ```json ````/```` ``` ```` fences, extracts the first top-level balanced `{...}` object with a string-aware brace scan (braces inside JSON string literals are ignored), deserializes case-insensitively into a private DTO matching the prompt's schema, clamps each score to [1,5], maps a blank severity to `minor` and a non-empty but unrecognized severity to `major`, and drops issues with no fix. **Fail-closed**: any exception, empty input, missing `{`, or object that never balances yields `CritiqueResult(1,1,1,1, [blocker "unparseable"], ParseFailed: true)` and NEVER throws. An unreadable critic must read as "reject", never as a silent clean pass.
+- **`Application/Agents/Prompts/`** (new dir) — `ResearcherPrompt`/`DrafterPrompt`/`CriticPrompt`/`EditorPrompt` (pure `BuildSystemPrompt`/`BuildUserPrompt`, mirroring `ExplainPrompt`) + an internal `BriefConstraints` so drafter/critic/editor render length + banned phrases IDENTICALLY. `CriticPrompt` inlines the literal JSON schema it shares with the parser and demands a bare JSON object only.
+- **DI** — the four registered as singletons next to `StudyBuddyAgent` in `Api/Program.cs` (stateless, take the singleton `ILlmService`).
+- Tests: `CrewSpecialistsTests` (23, fake `ILlmService`, no key/network). Each agent maps its canned response to the typed output with one `llm_response` step, `Iterations==1`, and usage from `LlmUsage`. The `CriticOutputParser` battery covers well-formed, fenced, and trailing-prose input, score clamping (0→1, 9→5), unrecognized severity → `major`, missing-fix issues dropped, garbage/empty/no-brace input failing closed, and a never-throws sweep. Prompt builders surface the length range, banned phrases, language, and the critic schema tokens (`factual_accuracy`, `severity`, `blocker`). A crew integration smoke test wires all four via `CrewTasks.Of` into a 4-stage `CrewPlan` run through the real `CrewOrchestrator`; it asserts that state threads research→draft→critique→edit and that the transcript has 4 `CrewStepEntry`s in declaration order (proves the AI-040 contract; no stray `ITool` added).
+
 ### Phase 7 — CrewOrchestrator primitive (AI-040) (2026-06-15)
 
 Phase 7 opens with the **generic multi-agent orchestration engine** — the crew-level analogue of `AgentLoop` (AI-034). Engine-only: no concrete crews, no specialist agents, no SEO/AutoPublish wiring, no endpoint (those are AI-041+). Like `AgentLoop` shipped before `StudyBuddyAgent`, the primitive lands first and migrates callers later. **Reuses Phase 6 seams**: no new tables, no new persistence interface — a crew run persists through the same `IAgentRunWriter`/`agent_run` path as a single agent.

diff --git a/backend/src/Api/Program.cs b/backend/src/Api/Program.cs
@@ -89,6 +89,12 @@
 // Agent loop engine (Phase 6, AI-034). Concrete agents (StudyBuddy, AI-035) build on it.
 TextStack.Ai.Agents.ServiceCollectionExtensions.AddAiAgents(builder.Services);
 builder.Services.AddScoped<Application.Agents.StudyBuddyAgent>();
+// Crew specialists (Phase 7, AI-041): single-call IAgent<TIn,TOut> sub-agents the content crews
+// (AI-042/043) compose via CrewTasks.Of. Stateless + ILlmService is a singleton, so singleton is fine.
+builder.Services.AddSingleton<Application.Agents.ResearcherAgent>();
+builder.Services.AddSingleton<Application.Agents.DrafterAgent>();
+builder.Services.AddSingleton<Application.Agents.CriticAgent>();
+builder.Services.AddSingleton<Application.Agents.EditorAgent>();
 builder.Services.AddAuthSettings(builder.Configuration);
 
 var connectionString = builder.Configuration.GetConnectionString("Default")

diff --git a/backend/src/Application/Agents/CrewAgentContracts.cs b/backend/src/Application/Agents/CrewAgentContracts.cs
@@ -0,0 +1,50 @@
+namespace Application.Agents;
+
+/// <summary>
+/// The shared writing assignment threaded through a content crew (Phase 7, AI-041). Every specialist reads
+/// the SAME brief so the constraints (length, banned phrases, language, style) are enforced identically by
+/// drafter, critic and editor — there is no drift between "write to N chars" and "score length against N".
+/// Lengths are in CHARACTERS (SEO fields are character-bounded, not token-bounded).
+/// </summary>
+public record ContentBrief(
+    string EntityType,
+    string FieldName,
+    int MinLength,
+    int MaxLength,
+    IReadOnlyList<string> BannedPhrases,
+    string TargetLanguage,
+    string? StyleGuide);
+
+/// <summary>Input to the researcher: the brief plus the raw source material to condense into neutral facts.</summary>
+public record ResearchInput(ContentBrief Brief, string SourceMaterial);
+
+/// <summary>The researcher's output: bullet FACTS grounded entirely in the source, ready for the drafter.</summary>
+public record ResearchNotes(string Notes);
+
+/// <summary>Input to the drafter: the brief plus the research notes it must write strictly from.</summary>
+public record DraftInput(ContentBrief Brief, ResearchNotes Research);
+
+/// <summary>A produced draft of the requested field.</summary>
+public record Draft(string Text);
+
+/// <summary>Input to the critic: the brief, the draft to score, and the research notes to ground it against.</summary>
+public record CritiqueInput(ContentBrief Brief, Draft Draft, ResearchNotes Research);
+
+/// <summary>Input to the editor: the brief, the draft to revise, and the critique listing what to fix.</summary>
+public record EditInput(ContentBrief Brief, Draft Draft, CritiqueResult Critique);
+
+/// <summary>
+/// The critic's structured verdict: 1-5 scores on each axis plus an issue list. <see cref="ParseFailed"/> is
+/// true when the critic's output could not be parsed (fail-closed all-1s) — callers treat that as "reject,
+/// needs work", never as a clean pass.
+/// </summary>
+public record CritiqueResult(
+    int FactualAccuracy,
+    int Tone,
+    int Length,
+    int BannedPhrases,
+    IReadOnlyList<CritiqueIssue> Issues,
+    bool ParseFailed);
+
+/// <summary>One actionable defect the critic found. <c>Severity</c> is one of "blocker", "major", "minor".</summary>
+public record CritiqueIssue(string Location, string Severity, string Fix);
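Review note: the `CritiqueResult` doc comment says callers must treat `ParseFailed` as "reject, needs work". The diff does not include such a caller, so the following is only a sketch of how one might gate on the verdict. `CritiqueGate`, `IsAcceptable`, and the `minScore` threshold are hypothetical names and values, not part of this PR; the two records are trimmed copies so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the PR's records so the sketch is self-contained.
public record CritiqueIssue(string Location, string Severity, string Fix);

public record CritiqueResult(
    int FactualAccuracy,
    int Tone,
    int Length,
    int BannedPhrases,
    IReadOnlyList<CritiqueIssue> Issues,
    bool ParseFailed);

// Hypothetical helper (not in this PR): one way a crew could decide accept vs. revise.
public static class CritiqueGate
{
    // minScore is an assumed threshold; the PR does not define one.
    public static bool IsAcceptable(CritiqueResult critique, int minScore = 4)
    {
        // An unparseable critic is a rejection, never a clean pass.
        if (critique.ParseFailed)
            return false;

        // Any blocker (for example a claim the notes do not support) rejects the draft outright.
        if (critique.Issues.Any(i => i.Severity == "blocker"))
            return false;

        // Otherwise every axis has to reach the threshold.
        var lowest = new[]
        {
            critique.FactualAccuracy, critique.Tone, critique.Length, critique.BannedPhrases
        }.Min();
        return lowest >= minScore;
    }
}
```

Checking `ParseFailed` first keeps the gate correct even if a future fail-closed verdict changes its scores or issue list.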
diff --git a/backend/src/Application/Agents/CriticAgent.cs b/backend/src/Application/Agents/CriticAgent.cs
@@ -0,0 +1,21 @@
+using Application.Agents.Prompts;
+using TextStack.Ai.Core;
+
+namespace Application.Agents;
+
+/// <summary>
+/// Crew critic (AI-041): one gateway call that scores the draft 1-5 on four axes against the research notes
+/// and lists actionable issues. Grounds factual accuracy on the notes (unsupported claims = blockers) and
+/// fails closed if its JSON can't be parsed — see <see cref="CriticOutputParser"/>.
+/// </summary>
+public sealed class CriticAgent(ILlmService llm) : SingleCallAgent<CritiqueInput, CritiqueResult>(llm)
+{
+    protected override string FeatureTag => "crew.critic";
+    protected override int MaxOutputTokens => 700;
+
+    protected override (string system, string user) BuildPrompt(CritiqueInput input) =>
+        (CriticPrompt.BuildSystemPrompt(input.Brief),
+         CriticPrompt.BuildUserPrompt(input.Draft, input.Research));
+
+    protected override CritiqueResult Parse(string text, CritiqueInput input) => CriticOutputParser.Parse(text);
+}
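Review note: `SingleCallAgent` itself is not shown in this part of the diff, so the shape `CriticAgent` plugs into has to be inferred from the changelog (one gateway call; subclasses own `BuildPrompt` and `Parse`). As a rough mental model only, here is a template-method sketch under stated assumptions. `IToyLlm`, `ToySingleCallAgent`, `ToyDrafter`, `CannedLlm`, and `Run` are all hypothetical; the sketch is synchronous and omits the feature tag, timing, step, and usage plumbing the real base is described as handling.

```csharp
using System;

// Hypothetical stand-in for the gateway; the real ILlmService is not shown in this excerpt.
public interface IToyLlm
{
    string Complete(string system, string user, int maxOutputTokens);
}

// Template-method base: exactly one call, subclasses supply only the prompt and the parse.
public abstract class ToySingleCallAgent<TIn, TOut>
{
    private readonly IToyLlm _llm;

    protected ToySingleCallAgent(IToyLlm llm) => _llm = llm;

    protected abstract int MaxOutputTokens { get; }
    protected abstract (string system, string user) BuildPrompt(TIn input);
    protected abstract TOut Parse(string text, TIn input);

    public TOut Run(TIn input)
    {
        var (system, user) = BuildPrompt(input);
        var text = _llm.Complete(system, user, MaxOutputTokens); // one call, no loop, no tools
        return Parse(text, input);
    }
}

// A toy specialist: notes in, trimmed draft text out.
public sealed class ToyDrafter : ToySingleCallAgent<string, string>
{
    public ToyDrafter(IToyLlm llm) : base(llm) { }

    protected override int MaxOutputTokens => 500;

    protected override (string system, string user) BuildPrompt(string notes) =>
        ("Write strictly from the notes.", $"Research notes:\n{notes}");

    protected override string Parse(string text, string notes) => text.Trim();
}

// Canned fake that returns a fixed response and counts calls, in the spirit of the PR's tests.
public sealed class CannedLlm : IToyLlm
{
    private readonly string _response;

    public CannedLlm(string response) => _response = response;

    public int Calls { get; private set; }

    public string Complete(string system, string user, int maxOutputTokens)
    {
        Calls++;
        return _response;
    }
}
```

With a base like this, each specialist reduces to a token budget plus two small overrides, which matches the size of `CriticAgent` above.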
diff --git a/backend/src/Application/Agents/CriticOutputParser.cs b/backend/src/Application/Agents/CriticOutputParser.cs
@@ -0,0 +1,153 @@
+using System.Text.Json;
+using System.Text.Json.Serialization;
+
+namespace Application.Agents;
+
+/// <summary>
+/// Parses the critic's JSON verdict into a typed <see cref="CritiqueResult"/> (AI-041). LLMs leak fences and
+/// prose around JSON, so we strip code fences and extract the first top-level balanced <c>{...}</c> object via a
+/// string-aware brace scan before deserializing leniently. Scores are clamped to [1,5]; a blank severity stays
+/// "minor" (absent = not asserted) while a non-empty-but-unrecognized severity coerces to "major" (fail-closed
+/// direction — never silently downgrade an unknown severity below the blocker threshold); issues without a fix
+/// are dropped. Critically it is FAIL-CLOSED: any failure (empty, no brace, malformed, exception) yields the
+/// worst-possible verdict with <c>ParseFailed: true</c>, never an exception and never a silent clean pass — an
+/// unparseable critic must read as "reject", so a hallucinating drafter can't sneak past on a critic that
+/// merely failed to format its output.
+/// </summary>
+public static class CriticOutputParser
+{
+    private static readonly JsonSerializerOptions Options = new()
+    {
+        PropertyNameCaseInsensitive = true,
+        NumberHandling = JsonNumberHandling.AllowReadingFromString,
+    };
+
+    private static readonly CritiqueResult FailClosed = new(
+        1, 1, 1, 1,
+        [new CritiqueIssue("output", "blocker", "Critic output was unparseable.")],
+        ParseFailed: true);
+
+    public static CritiqueResult Parse(string llmText)
+    {
+        try
+        {
+            var json = ExtractJson(llmText);
+            if (json is null)
+                return FailClosed;
+
+            var dto = JsonSerializer.Deserialize<CriticDto>(json, Options);
+            if (dto?.Scores is null)
+                return FailClosed;
+
+            var issues = (dto.Issues ?? [])
+                .Where(i => !string.IsNullOrWhiteSpace(i.Fix))
+                .Select(i => new CritiqueIssue(
+                    i.Location ?? "draft",
+                    NormalizeSeverity(i.Severity),
+                    i.Fix!.Trim()))
+                .ToList();
+
+            return new CritiqueResult(
+                Clamp(dto.Scores.FactualAccuracy),
+                Clamp(dto.Scores.Tone),
+                Clamp(dto.Scores.Length),
+                Clamp(dto.Scores.BannedPhrases),
+                issues,
+                ParseFailed: false);
+        }
+        catch
+        {
+            // Never throw: a critic we cannot read must fail closed, not crash the crew.
+            return FailClosed;
+        }
+    }
+
+    /// <summary>
+    /// Strips ```` ``` ```` / ```` ```json ```` fences and returns the first top-level balanced <c>{...}</c>.
+    /// A string-aware brace scan from the first <c>{</c>: braces inside JSON string literals are ignored (an
+    /// in-string flag toggles on each unescaped <c>"</c>, honoring <c>\</c> so <c>\"</c> doesn't toggle), so the
+    /// scan stops at the object's own closing brace. This makes "valid object then arbitrary prose" parse
+    /// cleanly and extracts object-0 from <c>[{...},{...}]</c> without splicing across elements. Returns null if
+    /// there is no <c>{</c> or the object never balances — caller fail-closes.
+    /// </summary>
+    private static string? ExtractJson(string text)
+    {
+        if (string.IsNullOrWhiteSpace(text))
+            return null;
+
+        var cleaned = text.Replace("```json", string.Empty, StringComparison.OrdinalIgnoreCase)
+                          .Replace("```", string.Empty);
+
+        var start = cleaned.IndexOf('{');
+        if (start < 0)
+            return null;
+
+        var depth = 0;
+        var inString = false;
+        var escaped = false;
+
+        for (var i = start; i < cleaned.Length; i++)
+        {
+            var c = cleaned[i];
+
+            if (inString)
+            {
+                if (escaped)
+                    escaped = false;
+                else if (c == '\\')
+                    escaped = true;
+                else if (c == '"')
+                    inString = false;
+                continue;
+            }
+
+            switch (c)
+            {
+                case '"':
+                    inString = true;
+                    break;
+                case '{':
+                    depth++;
+                    break;
+                case '}':
+                    depth--;
+                    if (depth == 0)
+                        return cleaned[start..(i + 1)];
+                    break;
+            }
+        }
+
+        return null; // never balanced → fail closed
+    }
+
+    private static int Clamp(int score) => Math.Clamp(score, 1, 5);
+
+    private static string NormalizeSeverity(string? severity)
+    {
+        var s = severity?.Trim().ToLowerInvariant();
+        if (string.IsNullOrEmpty(s))
+            return "minor"; // absent = not asserted, don't manufacture severity
+        return s is "blocker" or "major" or "minor" ? s : "major"; // unknown-but-stated → fail-closed direction
+    }
+
+    private sealed class CriticDto
+    {
+        [JsonPropertyName("scores")] public ScoresDto? Scores { get; set; }
+        [JsonPropertyName("issues")] public List<IssueDto>? Issues { get; set; }
+    }
+
+    private sealed class ScoresDto
+    {
+        [JsonPropertyName("factual_accuracy")] public int FactualAccuracy { get; set; }
+        [JsonPropertyName("tone")] public int Tone { get; set; }
+        [JsonPropertyName("length")] public int Length { get; set; }
+        [JsonPropertyName("banned_phrases")] public int BannedPhrases { get; set; }
+    }
+
+    private sealed class IssueDto
+    {
+        [JsonPropertyName("location")] public string? Location { get; set; }
+        [JsonPropertyName("severity")] public string? Severity { get; set; }
+        [JsonPropertyName("fix")] public string? Fix { get; set; }
+    }
+}
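Review note: to make the difference between the string-aware scan and a naive "first `{` to last `}`" slice concrete, here is a condensed standalone restatement of the scan in `ExtractJson` with fence stripping left out. `BraceScanDemo`, `FirstObject`, and `NaiveSlice` are hypothetical names used only for this illustration.

```csharp
#nullable enable
using System;

public static class BraceScanDemo
{
    // Same idea as ExtractJson in the diff, minus fence stripping:
    // return the first top-level balanced {...}, ignoring braces inside string literals.
    public static string? FirstObject(string text)
    {
        var start = text.IndexOf('{');
        if (start < 0)
            return null;

        var depth = 0;
        var inString = false;
        var escaped = false;

        for (var i = start; i < text.Length; i++)
        {
            var c = text[i];

            if (inString)
            {
                if (escaped) escaped = false;          // character after a backslash never toggles
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
                continue;
            }

            if (c == '"') inString = true;
            else if (c == '{') depth++;
            else if (c == '}' && --depth == 0) return text[start..(i + 1)];
        }

        return null; // the object never balanced
    }

    // The naive alternative, for contrast: first '{' through last '}'.
    public static string? NaiveSlice(string text)
    {
        var start = text.IndexOf('{');
        var end = text.LastIndexOf('}');
        return start >= 0 && end > start ? text[start..(end + 1)] : null;
    }
}
```

For the input `{"a":1} see {this}`, `FirstObject` stops at the object's own closing brace, while `NaiveSlice` returns the whole string, which is not valid JSON and would push an otherwise readable verdict into the fail-closed path.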
diff --git a/backend/src/Application/Agents/DrafterAgent.cs b/backend/src/Application/Agents/DrafterAgent.cs
@@ -0,0 +1,20 @@
+using Application.Agents.Prompts;
+using TextStack.Ai.Core;
+
+namespace Application.Agents;
+
+/// <summary>
+/// Crew drafter (AI-041): one gateway call that writes the requested field strictly from the research notes,
+/// within the brief's character bounds and language. Stays groundable — the critic scores it against the notes.
+/// </summary>
+public sealed class DrafterAgent(ILlmService llm) : SingleCallAgent<DraftInput, Draft>(llm)
+{
+    protected override string FeatureTag => "crew.drafter";
+    protected override int MaxOutputTokens => 500;
+
+    protected override (string system, string user) BuildPrompt(DraftInput input) =>
+        (DrafterPrompt.BuildSystemPrompt(input.Brief),
+         DrafterPrompt.BuildUserPrompt(input.Research));
+
+    protected override Draft Parse(string text, DraftInput input) => new(text.Trim());
+}
diff --git a/backend/src/Application/Agents/EditorAgent.cs b/backend/src/Application/Agents/EditorAgent.cs
@@ -0,0 +1,20 @@
+using Application.Agents.Prompts;
+using TextStack.Ai.Core;
+
+namespace Application.Agents;
+
+/// <summary>
+/// Crew editor (AI-041): one gateway call that rewrites the draft fixing each critique issue (blockers first),
+/// keeping the supported facts and staying within the brief's character bounds. Last link in the content crew.
+/// </summary>
+public sealed class EditorAgent(ILlmService llm) : SingleCallAgent<EditInput, Draft>(llm)
+{
+    protected override string FeatureTag => "crew.editor";
+    protected override int MaxOutputTokens => 500;
+
+    protected override (string system, string user) BuildPrompt(EditInput input) =>
+        (EditorPrompt.BuildSystemPrompt(input.Brief),
+         EditorPrompt.BuildUserPrompt(input.Draft, input.Critique));
+
+    protected override Draft Parse(string text, EditInput input) => new(text.Trim());
+}
diff --git a/backend/src/Application/Agents/Prompts/BriefConstraints.cs b/backend/src/Application/Agents/Prompts/BriefConstraints.cs
@@ -0,0 +1,18 @@
+namespace Application.Agents.Prompts;
+
+/// <summary>
+/// Shared, single-source rendering of the brief's length + banned-phrase constraints so the drafter, critic
+/// and editor read them IDENTICALLY (AI-041). If the drafter is told "120-300 characters" and the critic
+/// scores length against a different phrasing, length critique drifts — so all three call through here.
+/// </summary>
+internal static class BriefConstraints
+{
+    public static string Length(ContentBrief brief) =>
+        $"{brief.MinLength}-{brief.MaxLength} characters";
+
+    /// <summary>Comma-separated banned phrases, or null when the brief bans none.</summary>
+    public static string? BannedPhrases(ContentBrief brief) =>
+        brief.BannedPhrases.Count == 0
+            ? null
+            : string.Join(", ", brief.BannedPhrases.Select(p => $"\"{p}\""));
+}
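Review note: for reference, this is what the shared rendering produces for a sample brief. The two types below are copies of the ones in this diff (made `public` so the snippet stands alone); `SampleBriefs` and every brief value in it are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Copy of the PR's record so the snippet compiles on its own.
public record ContentBrief(
    string EntityType,
    string FieldName,
    int MinLength,
    int MaxLength,
    IReadOnlyList<string> BannedPhrases,
    string TargetLanguage,
    string? StyleGuide);

// Copy of BriefConstraints from the diff (internal there, public here for the demo).
public static class BriefConstraints
{
    public static string Length(ContentBrief brief) =>
        $"{brief.MinLength}-{brief.MaxLength} characters";

    public static string? BannedPhrases(ContentBrief brief) =>
        brief.BannedPhrases.Count == 0
            ? null
            : string.Join(", ", brief.BannedPhrases.Select(p => $"\"{p}\""));
}

// Hypothetical sample data, not taken from the PR.
public static class SampleBriefs
{
    public static ContentBrief WithBans() => new(
        "book", "meta description", 120, 300,
        new[] { "click here", "best ever" }, "English", null);

    public static ContentBrief WithoutBans() => new(
        "book", "meta description", 120, 300,
        Array.Empty<string>(), "English", null);
}
```

`BriefConstraints.Length(SampleBriefs.WithBans())` renders `120-300 characters`, and `BannedPhrases` renders `"click here", "best ever"`. With an empty ban list `BannedPhrases` returns `null`, which is why `CriticPrompt` only appends the "Banned phrases:" sentence conditionally.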
diff --git a/backend/src/Application/Agents/Prompts/CriticPrompt.cs b/backend/src/Application/Agents/Prompts/CriticPrompt.cs
@@ -0,0 +1,44 @@
+namespace Application.Agents.Prompts;
+
+/// <summary>
+/// The crew critic's prompt (AI-041) — the most load-bearing prompt in the crew. It scores the draft 1-5 on
+/// four axes AGAINST THE RESEARCH NOTES (not the model's own knowledge): any draft claim not supported by the
+/// notes is a factual-accuracy <c>blocker</c>. That research-grounding is the whole point — it turns "does
+/// this sound plausible?" into "is this actually in the source?", which is what makes the critic catch
+/// hallucinations the drafter slipped in. The output MUST be a bare JSON object matching the exact schema
+/// <see cref="CriticOutputParser"/> deserializes; the parser fails closed on anything else. Pure strings.
+/// </summary>
+public static class CriticPrompt
+{
+    /// <summary>The literal schema the critic must emit and <see cref="CriticOutputParser"/> parses — kept in one place.</summary>
+    public const string Schema =
+        "{\"scores\":{\"factual_accuracy\":1-5,\"tone\":1-5,\"length\":1-5,\"banned_phrases\":1-5}," +
+        "\"issues\":[{\"location\":\"…\",\"severity\":\"blocker|major|minor\",\"fix\":\"…\"}]}";
+
+    public static string BuildSystemPrompt(ContentBrief brief)
+    {
+        var prompt =
+            $"You are an editor reviewing a draft {brief.FieldName} for a {brief.EntityType}. " +
+            "Score the draft 1-5 (1 worst, 5 best) on each of four axes:\n" +
+            "- factual_accuracy: every claim in the draft must be supported BY THE RESEARCH NOTES. " +
+            "Flag EVERY claim that is not supported by the notes as a \"blocker\" issue and lower this score.\n" +
+            $"- tone: does it match the requested style{(string.IsNullOrWhiteSpace(brief.StyleGuide) ? string.Empty : $" ({brief.StyleGuide!.Trim()})")} " +
+            $"and read naturally in {brief.TargetLanguage}?\n" +
+            $"- length: is the draft within {BriefConstraints.Length(brief)}?\n" +
+            "- banned_phrases: does the draft avoid the banned phrases?";
+
+        if (BriefConstraints.BannedPhrases(brief) is { } banned)
+            prompt += $" Banned phrases: {banned}.";
+
+        prompt +=
+            "\nList every concrete problem as an issue with a location, a severity " +
+            "(\"blocker\", \"major\", or \"minor\"), and a specific fix.\n" +
+            "Output a bare JSON object ONLY — no markdown, no code fences, no commentary before or after it — " +
+            "with EXACTLY this schema:\n" + Schema;
+
+        return prompt;
+    }
+
+    public static string BuildUserPrompt(Draft draft, ResearchNotes research) =>
+        $"Research notes:\n{research.Notes}\n\nDraft to review:\n{draft.Text}";
+}
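Review note: `CriticPrompt.Schema` reads as a template rather than literal JSON (`1-5` is not a JSON number), so it may help to see a concrete reply of the shape the prompt asks for. The sample below is invented for illustration, and `CriticSchemaExample` is a hypothetical helper that only uses `System.Text.Json` to show which property names the parser's DTO expects.

```csharp
using System;
using System.Text.Json;

public static class CriticSchemaExample
{
    // Hypothetical critic reply; the scores and the issue are made up.
    public const string SampleReply =
        "{\"scores\":{\"factual_accuracy\":2,\"tone\":4,\"length\":5,\"banned_phrases\":5}," +
        "\"issues\":[{\"location\":\"sentence 2\",\"severity\":\"blocker\"," +
        "\"fix\":\"Remove the publication year; the notes do not state it.\"}]}";

    // Reads two fields by their snake_case names to show the expected layout.
    public static (int factualAccuracy, string firstSeverity) Read(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        var factual = root.GetProperty("scores").GetProperty("factual_accuracy").GetInt32();
        var severity = root.GetProperty("issues")[0].GetProperty("severity").GetString() ?? "";

        return (factual, severity);
    }
}
```

A reply like this is the happy path; anything that `CriticOutputParser` cannot extract and deserialize falls back to the fail-closed verdict.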