Status
Deferred. This was Tier 2 item 1 + 2 in the strategic roadmap (`.claude/plans/can-you-look-at-quirky-allen.md`), originally framed as a "daemon" with a 3–4 week effort estimate. After review the conclusion is: overkill for current realistic workloads. Filing for tracking; do not start without a concrete customer use case.
What was proposed
A long-running process that holds the loaded Whisper model + a warm EPUBCheck JVM in memory, accepts conversion jobs over a Unix socket / HTTP / stdio JSON-RPC, and returns results. The strategic intent was to amortise one-time costs (Whisper model load, JVM startup) across hundreds of books in a catalogue, and to enable a "dpub processes a 500-book catalogue in N hours; Pipeline 2 takes 30× longer just on JVM startup" benchmark as a positioning artefact.
Why it's overkill (today)
| Cost |
Per invocation |
Daemon helps? |
| Whisper model load |
~10–15 s with `--transcribe` |
yes, big |
| EPUBCheck JVM |
~3–5 s with `--validate` |
yes, modest |
| ACE / Node |
~1–2 s with `--a11y` |
yes, modest |
| dpub binary cold start |
~100 ms |
irrelevant |
- For a single-book convert, these are amortised over the conversion itself (minutes to hours). User doesn't notice.
- For `dpub batch` (Tier 1, already shipped) over a small catalogue without `--transcribe`, the win is near zero — the binary runs and exits, no expensive load to amortise.
- The only realistic workflow where the daemon is meaningfully faster is large transcribed catalogues (≥20 books with `--transcribe`). Eleven Ways' day-to-day (handfuls of books, manually triggered) doesn't hit that case.
- Pipeline-2-vs-dpub catalogue benchmarks are positioning, not user pain. We can claim them when needed.
If we ever do build it
The 3–4 week estimate from the plan covered a polished daemon (cross-platform Unix socket + Windows named pipe, robust lifecycle, parallel jobs with backpressure, progress streaming, full test surface, docs). A pragmatic minimum viable version is much less:
- `dpub serve --port N`: HTTP + JSON over TCP, sync, single-threaded, one job at a time
- Use `tiny_http` (sync, ~no deps) — no tokio runtime
- Skip Unix-socket-vs-named-pipe juggling; HTTP works everywhere
- Skip parallel jobs — the existing rayon parallelism inside a single convert is plenty
- Skip lifecycle ceremony — start it, Ctrl-C kills it, that's it
Estimated scope: ~3–5 days of focused work, ~200 lines of code plus a small HTTP client. Same headline benefit (one model load amortised across N conversions); a fraction of the surface.
When to revisit
Reopen and start work when any of these is true:
- A customer has a catalogue-scale transcription workflow (≥20 books with `--transcribe`) where the per-book model reload is a real friction point.
- We want to publish a Pipeline-2-vs-dpub catalogue benchmark for marketing.
- M7 (WASM) ships and we want a complementary "server" surface for non-WASM-friendly workloads.
Out of scope for the eventual work
- Watch-folder mode (separate concern)
- Real OS daemon-isation (systemd, launchd, Windows service) — let the user manage the process
- Native-code daemons in C++ etc. — Rust is enough
Related
- Tier 2 in `.claude/plans/can-you-look-at-quirky-allen.md` (preserved for context).
- M7 WASM (next in Tier 2; preferred priority instead).
- M6.5 word-level Media Overlay sync (separate Tier 2 item, also deferred until requested).
Status
Deferred. This was Tier 2 item 1 + 2 in the strategic roadmap (`.claude/plans/can-you-look-at-quirky-allen.md`), originally framed as a "daemon" with a 3–4 week effort estimate. After review the conclusion is: overkill for current realistic workloads. Filing for tracking; do not start without a concrete customer use case.
What was proposed
A long-running process that holds the loaded Whisper model + a warm EPUBCheck JVM in memory, accepts conversion jobs over a Unix socket / HTTP / stdio JSON-RPC, and returns results. The strategic intent was to amortise one-time costs (Whisper model load, JVM startup) across hundreds of books in a catalogue, and to enable a "dpub processes a 500-book catalogue in N hours; Pipeline 2 takes 30× longer just on JVM startup" benchmark as a positioning artefact.
Why it's overkill (today)
If we ever do build it
The 3–4 week estimate from the plan covered a polished daemon (cross-platform Unix socket + Windows named pipe, robust lifecycle, parallel jobs with backpressure, progress streaming, full test surface, docs). A pragmatic minimum viable version is much less:
Estimated scope: ~3–5 days of focused work, ~200 lines of code plus a small HTTP client. Same headline benefit (one model load amortised across N conversions); a fraction of the surface.
When to revisit
Reopen and start work when any of these is true:
Out of scope for the eventual work
Related