Add document export / serialization to the safe-docx suite: render an open .docx into Markdown, HTML, plain text, and (separately) PDF. This completes the read → edit → compare → export loop.
Why
Agents routinely need a document's content in a portable format — to summarize, diff, feed to another model, or hand off to a human. Today the suite can read, grep, edit, compare, and save .docx, but it cannot emit a rendering in another format.
Architecture: one serializer core, several emitters
The OOXML parse layer already exists (packages/docx-core/src/primitives/), and document_view.ts already produces a structured, semantically tagged model: headings with levels, list metadata (list_level, label_type), and inline tags (<b>, <i>, <u>, <a href>, <font>, <highlight>).
So Markdown, HTML, and plain text are thin emitters over a shared "structured-export" serializer core and should be built together. The inline layer is already HTML-shaped, which makes HTML and Markdown roughly equal first targets. PDF is a different problem — it needs a layout/render engine — and is tracked as a separate initiative.
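The "one core, several emitters" split described above can be sketched as follows. This is a minimal illustration, not the suite's actual API: the `Block` type, the `Emitter` interface, `serialize`, the three emitter objects, the `"bullet" | "number"` values for `label_type`, and the 0-based `list_level` are all assumptions made for the example. Only the field names `list_level` and `label_type` and the inline tag set come from the description above. Block spacing, entity unescaping, and grouping list items into `<ul>`/`<ol>` are deliberately left out.

```typescript
// Hypothetical block model; the real document_view.ts output may differ.
type LabelType = "bullet" | "number"; // assumed values for label_type
type Block =
  | { kind: "heading"; level: number; inline: string }
  | { kind: "list_item"; list_level: number; label_type: LabelType; inline: string }
  | { kind: "paragraph"; inline: string };

// Each output format implements only this small interface.
interface Emitter {
  heading(level: number, inline: string): string;
  listItem(listLevel: number, labelType: LabelType, inline: string): string;
  paragraph(inline: string): string;
}

// Shared core: walks the blocks once and delegates formatting to the emitter.
function serialize(blocks: Block[], emitter: Emitter): string {
  const lines = blocks.map((b): string => {
    if (b.kind === "heading") return emitter.heading(b.level, b.inline);
    if (b.kind === "list_item") return emitter.listItem(b.list_level, b.label_type, b.inline);
    return emitter.paragraph(b.inline);
  });
  return lines.join("\n");
}

// Inline helpers: the inline layer is assumed to be an HTML-shaped string.
const stripTags = (s: string): string => s.replace(/<[^>]+>/g, "");

function inlineToMarkdown(s: string): string {
  const converted = s
    .replace(/<a href="([^"]*)">(.*?)<\/a>/g, "[$2]($1)") // links first
    .replace(/<\/?b>/g, "**")
    .replace(/<\/?i>/g, "*");
  return stripTags(converted); // drop tags with no Markdown equivalent (<u>, <font>, ...)
}

const marker = (t: LabelType): string => (t === "number" ? "1. " : "- ");

const markdownEmitter: Emitter = {
  heading: (level, inline) => "#".repeat(level) + " " + inlineToMarkdown(inline),
  listItem: (lvl, t, inline) => "  ".repeat(lvl) + marker(t) + inlineToMarkdown(inline),
  paragraph: (inline) => inlineToMarkdown(inline),
};

const htmlEmitter: Emitter = {
  heading: (level, inline) => `<h${level}>${inline}</h${level}>`,
  // A fuller emitter would group consecutive items into <ul>/<ol>.
  listItem: (lvl, _t, inline) => `<li class="level-${lvl}">${inline}</li>`,
  paragraph: (inline) => `<p>${inline}</p>`,
};

const textEmitter: Emitter = {
  heading: (_level, inline) => stripTags(inline),
  listItem: (lvl, t, inline) => "  ".repeat(lvl) + marker(t) + stripTags(inline),
  paragraph: (inline) => stripTags(inline),
};

// Sample input, made up for illustration.
const blocks: Block[] = [
  { kind: "heading", level: 2, inline: "Terms" },
  { kind: "paragraph", inline: 'See <b>Section 1</b> and <a href="https://example.com">the <i>guide</i></a>.' },
  { kind: "list_item", list_level: 0, label_type: "bullet", inline: "<u>First</u> item" },
];

console.log(serialize(blocks, markdownEmitter));
```

Because the inline strings are already HTML-shaped, the HTML emitter passes them through unchanged, while the Markdown and plain-text emitters differ only in how they rewrite or strip the tags. Adding a format under this design means writing one more `Emitter`, with no change to `serialize`.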
Sequencing
Sub-issues