Summary
Render an open .docx to PDF.
Why
PDF is the most common "final, shareable" output for documents. It is distinct from the text-family exports: agents and users often want a fixed-layout artifact to send.
Why this is the outlier (heaviest lift)
Unlike Markdown/HTML/text — which are tree-to-tree serializations over the existing document_view model — PDF requires a layout/render engine: line breaking, pagination, font metrics, page geometry, headers/footers, section/column handling. None of that exists in docx-core today, and it's a fundamentally different problem from the parse/serialize layer.
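To make the gap concrete, here is a minimal sketch of the two simplest layout steps (greedy line breaking and pagination) that a serializer never needs. It is illustrative only, not a proposed design: `CHAR_WIDTH_PT`, `LINE_HEIGHT_PT`, `break_lines`, and `paginate` are hypothetical names, and the fixed per-character width stands in for real font metrics that an engine would have to read from the font files.

```python
from dataclasses import dataclass

# Assumed, hypothetical metrics. A real engine needs per-glyph advance
# widths, kerning, and line heights taken from the actual font.
CHAR_WIDTH_PT = 6.0
LINE_HEIGHT_PT = 14.0


@dataclass
class Page:
    lines: list[str]


def break_lines(text: str, max_width_pt: float) -> list[str]:
    """Greedy line breaking: put as many words on each line as will fit."""
    lines: list[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        fits = len(candidate) * CHAR_WIDTH_PT <= max_width_pt
        if fits or not current:
            # An overlong single word is kept on its own line (no hyphenation).
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def paginate(paragraphs: list[str], width_pt: float, height_pt: float) -> list[Page]:
    """Flow paragraphs into lines, then split the lines into fixed-height pages."""
    lines_per_page = max(1, int(height_pt // LINE_HEIGHT_PT))
    all_lines: list[str] = []
    for paragraph in paragraphs:
        all_lines.extend(break_lines(paragraph, width_pt))
    return [
        Page(all_lines[i : i + lines_per_page])
        for i in range(0, len(all_lines), lines_per_page)
    ]
```

Even this toy version depends on measurements (widths, heights, page geometry) that the document tree does not carry. Headers/footers, sections, columns, tables, and widow/orphan control all build on top of it, which is why the work is scoped separately.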
Options to evaluate (separately)
- Render the semantic-HTML output (from the HTML export issue) to PDF via a headless engine — pragmatic, but adds a heavy dependency and reintroduces the "LibreOffice/Chromium-on-Lambda" footprint we'd otherwise avoid.
- A lightweight, dependency-light renderer for a constrained subset of documents.
- Punt entirely and document HTML→PDF as a user-side step.
Scope / sequencing
- Separate, larger initiative. Should not block or be bundled with the structured-export (Markdown/HTML/text) work.
Effort: high. Priority: last; design spike first.