Host Service: Claude API + TTS → Programme #45

@zeevenn

Parent

#39 — PRD: AI Radio — Host-driven Channel experience with chat timeline

What to build

Implement the Host Service — the AI orchestrator that turns a Channel Style and an optional Intervention into a completed Programme.

Interface:

generateProgramme(
  channelStyle: ChannelStyle,
  candidateTracks: Track[],
  intervention?: Intervention
): Promise<Programme>

The Host Service accepts the CandidateSet (already filtered by the caller) rather than querying the Library itself. This keeps it free of Library access, so it can be tested with in-memory fixtures and stubbed Claude and TTS dependencies.
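A minimal sketch of how that testability could look, assuming the Claude call and the TTS call are both passed in as dependencies. The `HostDeps` seam, `createHostService`, `PlanItem`, and every field other than `trackId`, `script`, and `audioUrl` are hypothetical names chosen for illustration, not part of the PRD.

```typescript
// Hypothetical shapes: only trackId, script and audioUrl appear in the issue text.
interface Track { id: string; title: string; audioUrl: string }
interface ChannelStyle { name: string }
type Intervention =
  | { kind: 'trackRequest'; trackId: string }
  | { kind: 'moodChange'; mood: string };
type PlanItem =
  | { kind: 'track'; trackId: string }
  | { kind: 'interlude'; script: string };
type Segment =
  | { kind: 'track'; trackId: string; audioUrl: string }
  | { kind: 'interlude'; script: string; audioUrl: string };
interface Programme { segments: Segment[] }

// Assumed dependency seams: `plan` wraps the Claude call, `synthesise` wraps TTS.
interface HostDeps {
  plan(style: ChannelStyle, tracks: Track[], intervention?: Intervention): Promise<PlanItem[]>;
  synthesise(script: string): Promise<string>; // resolves to an audio URL
}

function createHostService(deps: HostDeps) {
  return async function generateProgramme(
    channelStyle: ChannelStyle,
    candidateTracks: Track[],
    intervention?: Intervention
  ): Promise<Programme> {
    const byId = new Map<string, Track>(
      candidateTracks.map((t): [string, Track] => [t.id, t])
    );
    const items = await deps.plan(channelStyle, candidateTracks, intervention);
    const segments: Segment[] = [];
    for (const item of items) {
      if (item.kind === 'track') {
        // Reject any Track the model invented or picked from outside the input.
        const track = byId.get(item.trackId);
        if (!track) throw new Error(`Track ${item.trackId} is not in candidateTracks`);
        segments.push({ kind: 'track', trackId: track.id, audioUrl: track.audioUrl });
      } else {
        // Interludes are synthesised one at a time here for simplicity.
        const audioUrl = await deps.synthesise(item.script);
        segments.push({ kind: 'interlude', script: item.script, audioUrl });
      }
    }
    return { segments };
  };
}
```

With this shape, a unit test can supply a canned `plan` and a fake `synthesise` and never touch the network or the Library.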

Steps the Host performs:

  1. Call the Claude API (model: claude-sonnet-4-6) with the CandidateSet track list, the Channel Style, and any Intervention. The prompt asks Claude to select and sequence 5–8 Tracks and to write a short Interlude Script (1–3 sentences, in Chinese) for insertion before the first Track and between groups of 2–3 Tracks.
  2. Parse Claude's response into an ordered list of { kind: 'track', trackId } and { kind: 'interlude', script } items.
  3. For each Interlude Script, call the TTS provider to synthesise audio and obtain a URL.
  4. Assemble and return the full Programme: Segment[] with all audio URLs resolved.
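Step 2 could be sketched as below, assuming the prompt instructs Claude to reply with a bare JSON array of items. That response format, and the `parsePlan` helper itself, are assumptions for illustration; the issue does not specify how the response is encoded.

```typescript
type PlanItem =
  | { kind: 'track'; trackId: string }
  | { kind: 'interlude'; script: string };

// Assumption: responseText is a JSON array such as
// [{"kind":"interlude","script":"..."},{"kind":"track","trackId":"t1"}]
function parsePlan(responseText: string, candidateIds: Set<string>): PlanItem[] {
  const raw: unknown = JSON.parse(responseText);
  if (!Array.isArray(raw)) {
    throw new Error('Expected a JSON array of items');
  }
  return raw.map((entry: unknown, index: number): PlanItem => {
    if (typeof entry !== 'object' || entry === null) {
      throw new Error(`Item ${index} is not an object`);
    }
    const { kind, trackId, script } = entry as {
      kind?: unknown;
      trackId?: unknown;
      script?: unknown;
    };
    if (kind === 'track' && typeof trackId === 'string') {
      // Only Tracks from the input CandidateSet are allowed.
      if (!candidateIds.has(trackId)) {
        throw new Error(`Item ${index} references unknown Track ${trackId}`);
      }
      return { kind: 'track', trackId };
    }
    if (kind === 'interlude' && typeof script === 'string' && script.trim() !== '') {
      return { kind: 'interlude', script };
    }
    throw new Error(`Item ${index} has an unrecognised shape`);
  });
}
```

Validating at the parse boundary means the later TTS and assembly steps only ever see well-formed items, which also covers the acceptance criterion that every Track comes from `candidateTracks`.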

TTS provider: use a cloud TTS API (e.g. OpenAI TTS, Azure TTS, or similar) that returns a streamable audio URL. The specific provider is a configuration choice — inject it as a dependency so it can be swapped. For v1, any provider that returns an MP3/AAC URL is acceptable.
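One way the injection could look is a small provider interface plus a registry keyed by a config value. The `TtsProvider` interface, the `FakeTtsProvider` stand-in, the registry, and the `tts.invalid` URL are all hypothetical; a real adapter for OpenAI TTS or Azure TTS would implement the same interface and upload or expose the audio at a URL.

```typescript
// Hypothetical contract: take an Interlude Script, resolve to a playable audio URL.
interface TtsProvider {
  synthesise(script: string): Promise<string>;
}

// Stand-in provider for tests and local development; it performs no real synthesis.
class FakeTtsProvider implements TtsProvider {
  async synthesise(script: string): Promise<string> {
    return `https://tts.invalid/${encodeURIComponent(script)}.mp3`;
  }
}

type TtsFactory = () => TtsProvider;

// Registry of available providers; real adapters would be registered here too.
const ttsRegistry = new Map<string, TtsFactory>([
  ['fake', () => new FakeTtsProvider()],
]);

// Resolve the provider named in configuration, failing loudly on a typo.
function selectTtsProvider(
  name: string,
  registry: Map<string, TtsFactory> = ttsRegistry
): TtsProvider {
  const factory = registry.get(name);
  if (!factory) {
    throw new Error(`Unknown TTS provider: ${name}`);
  }
  return factory();
}

// Synthesise every Interlude Script concurrently, preserving input order.
async function synthesiseAll(scripts: string[], tts: TtsProvider): Promise<string[]> {
  return Promise.all(scripts.map((script) => tts.synthesise(script)));
}
```

The Host Service would receive the result of `selectTtsProvider(config.ttsProvider)` (the config key is an assumption) and never name a concrete provider itself.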

Prompt caching: use the Anthropic SDK's prompt caching (cache_control) on the system prompt and the CandidateSet track list (these are stable across calls for the same Channel). This reduces latency and cost for Intervention-triggered regenerations.
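A sketch of the request shape this implies, assuming the Channel Style is folded into the system prompt and the Intervention is sent as the final, uncached block. `buildHostRequest`, the local `TextBlock` and `HostRequest` types, the `max_tokens` value, and the fallback instruction text are illustrative assumptions; the `cache_control: { type: 'ephemeral' }` marker is the field the Messages API uses for prompt caching.

```typescript
// Local stand-ins for the Messages API request shape, limited to what is used here.
interface TextBlock {
  type: 'text';
  text: string;
  cache_control?: { type: 'ephemeral' };
}
interface HostRequest {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: { role: 'user'; content: TextBlock[] }[];
}

function buildHostRequest(
  systemPrompt: string,
  trackListText: string,
  interventionText?: string
): HostRequest {
  return {
    model: 'claude-sonnet-4-6',
    max_tokens: 2048, // assumed budget for 5-8 Tracks plus short Interlude Scripts
    // Stable per Channel: mark as a cache breakpoint.
    system: [
      { type: 'text', text: systemPrompt, cache_control: { type: 'ephemeral' } },
    ],
    messages: [
      {
        role: 'user',
        content: [
          // Stable per Channel: second cache breakpoint, covering the track list.
          { type: 'text', text: trackListText, cache_control: { type: 'ephemeral' } },
          // Varies per call, so it goes after the cached prefix and is not marked.
          {
            type: 'text',
            text: interventionText ?? 'No Intervention. Generate the next Programme.',
          },
        ],
      },
    ],
  };
}
```

The resulting object is intended to be passed to `client.messages.create(...)` from `@anthropic-ai/sdk`. Caching matches on an exact prefix, so the track list has to be serialised identically on every call (same order, same formatting) for a regeneration to hit the cache.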

The Host Service does not handle offline fallback — that is a separate module (#51).

Acceptance criteria

  • generateProgramme returns a valid Programme with at least one TrackSegment and at least one InterlSegment
  • Each InterlSegment has a non-empty script and a non-empty audioUrl
  • Each TrackSegment references a Track that was in the input candidateTracks
  • Intervention (if provided) is honoured: Track Request results in the requested Track appearing in the Programme; Mood Change produces a Programme that acknowledges the style shift
  • Prompt caching is applied to the system prompt and CandidateSet sections of the Claude API call
  • TTS provider is injected (not hardcoded) and can be swapped via config
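Several of these criteria are mechanical enough to check in a test helper. The sketch below is one possible shape, with `checkProgramme` and the type definitions being hypothetical. It covers segment presence, non-empty interlude fields, CandidateSet membership, and Track Request; whether a Mood Change is "acknowledged" and whether caching was applied need separate checks.

```typescript
interface Track { id: string }
type Segment =
  | { kind: 'track'; trackId: string; audioUrl: string }
  | { kind: 'interlude'; script: string; audioUrl: string };
interface Programme { segments: Segment[] }

// Returns a list of violated criteria; an empty list means all checks passed.
function checkProgramme(
  programme: Programme,
  candidateTracks: Track[],
  requestedTrackId?: string
): string[] {
  const problems: string[] = [];
  const candidateIds = new Set(candidateTracks.map((t) => t.id));
  let trackCount = 0;
  let interludeCount = 0;
  let requestedSeen = false;

  programme.segments.forEach((segment, index) => {
    if (segment.kind === 'track') {
      trackCount += 1;
      if (!candidateIds.has(segment.trackId)) {
        problems.push(`Segment ${index}: Track ${segment.trackId} not in candidateTracks`);
      }
      if (segment.trackId === requestedTrackId) requestedSeen = true;
    } else {
      interludeCount += 1;
      if (segment.script.trim() === '') problems.push(`Segment ${index}: empty script`);
      if (segment.audioUrl.trim() === '') problems.push(`Segment ${index}: empty audioUrl`);
    }
  });

  if (trackCount === 0) problems.push('No track segment');
  if (interludeCount === 0) problems.push('No interlude segment');
  if (requestedTrackId !== undefined && !requestedSeen) {
    problems.push(`Requested Track ${requestedTrackId} is missing`);
  }
  return problems;
}
```

Returning every violation rather than throwing on the first one makes test failures easier to read when a generated Programme breaks several criteria at once.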

Blocked by
