Merged
125 changes: 125 additions & 0 deletions docs/pipeline/indesign-output-generator.md
@@ -0,0 +1,125 @@
# InDesign output generator: guide

The output generator is stage 4 — the final stage — of the
InDesign-to-WordPress pipeline. It takes the
[intermediate representation](../../packages/pipeline/src/indesign/ir.js)
produced by the [IDML parser](../../packages/pipeline/README.md) (#62) or the
[PDF fallback parser](indesign-pdf-fidelity.md) (#63), runs it through the
[token mapper](indesign-token-mapper.md) (#64), and emits a complete,
installable Flavian-compatible FSE theme directory.

The generated theme directory contains:

| Artifact | What it is |
| --- | --- |
| `theme.json` | The token mapper's merged base + partial — one schema-valid file. |
| `patterns/spread-N.php` | One FSE block pattern per spread, filed under the **InDesign Imports** category. |
| `templates/` | `index.html` (stitches the spread patterns between header/footer), plus `page.html` and `404.html`. |
| `parts/` | `header.html` and `footer.html`, derived from master-spread chrome where present. |
| `style.css`, `functions.php` | Theme header + bootstrap (registers the pattern category, enqueues styles). |
| `bin/import-media.sh` | WP-CLI script that imports staged assets into the media library. |
| `bin/seed-content.sh` | Optional (`--seed-content`): creates one draft page per spread, populated with its pattern. |
| `indesign-pipeline-report.md` | Produced files, unmapped IR nodes, and manual follow-ups. |

## Usage

```js
import { parseIdml, generateTheme } from '@flavian/pipeline';

const ir = await parseIdml('./brochure.idml');
const { files, assets, themeJson, report } = generateTheme(ir, {
  // slug, name,          // theme slug / display name (default: from the doc name)
  // seedContent: true,   // also emit bin/seed-content.sh
  // base, fontMap, namespace, tolerance, tolerancePx, gridPx, fluid, // token-mapper options
  // tokens,              // a precomputed mapTokens() result (skips re-mapping)
});

// `files` is [{ path, contents, mode? }] — pure data; the CLI does the fs writes.
for (const f of files) console.log(f.path);
console.log(report.markdown);
```

`generateTheme` is **pure and deterministic**: no filesystem, clock, or
randomness. The same IR yields byte-identical files every run, so reruns never
produce diff churn in unrelated artifacts.
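
A determinism check along these lines can be sketched as follows. It is a minimal illustration, not the package's own test: `diffFileLists` is a hypothetical helper, it assumes `contents` is a string, and the two file lists are stand-in data where real code would call `generateTheme` twice on the same IR.

```js
// Hypothetical helper: compare two `files` arrays, assuming the documented
// shape [{ path, contents, mode? }] with string contents.
function diffFileLists(a, b) {
  const index = (files) => new Map(files.map((f) => [f.path, f]));
  const left = index(a);
  const right = index(b);
  const paths = [...new Set([...left.keys(), ...right.keys()])].sort();
  const changed = [];
  for (const path of paths) {
    const l = left.get(path);
    const r = right.get(path);
    // A path counts as changed if it is missing on one side or differs.
    if (!l || !r || l.contents !== r.contents || l.mode !== r.mode) {
      changed.push(path);
    }
  }
  return changed; // an empty array means the two runs match
}

// Stand-in data; in practice both lists would come from generateTheme(ir, opts).
const runA = [{ path: 'style.css', contents: '/* Theme Name: Brochure */' }];
const runB = [{ path: 'style.css', contents: '/* Theme Name: Brochure */' }];
console.log(diffFileLists(runA, runB)); // []
```

The sketch ignores the order of entries; a stricter check could also compare the two path sequences.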

From the command line (composes with the parser CLIs, or parses directly):

```bash
# Compose with a parser CLI…
node packages/pipeline/bin/parse-idml.mjs brochure.idml \
  | node packages/pipeline/bin/generate-theme.mjs - --out-dir themes/brochure

# …or parse + generate in one step (accepts .idml or .pdf directly).
node packages/pipeline/bin/generate-theme.mjs brochure.idml \
  --out-dir themes/brochure --asset-dir ./extracted-images --seed-content
```

### Options

| Option | CLI flag | Default | Effect |
| --- | --- | --- | --- |
| `slug` | `--slug <str>` | slug of the document name | Theme directory slug. |
| `name` | `--name <str>` | the document name | Theme display name. |
| — | `--out-dir <dir>` | _(required)_ | Where the theme directory is written. |
| — | `--asset-dir <dir>` | — | Source of image bytes to copy into `assets/` (matched by basename). |
| `seedContent` | `--seed-content` | off | Emit `bin/seed-content.sh`. |
| `tokens` | — | computed | A precomputed `mapTokens()` result. |

All [token-mapper options](indesign-token-mapper.md#options) (`--base`,
`--font-map`, `--namespace`, `--grid`, `--tolerance`, `--type-tolerance`,
`--fluid`, `--dpi`) pass straight through.
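
To illustrate the `slug` default ("slug of the document name"), here is one plausible way to derive a slug. The function name and the exact normalisation rules are assumptions made for this example; the generator's own rules live in `slugs.js` and may differ.

```js
// Hypothetical helper: derive a theme slug from a document name.
// The normalisation steps below are illustrative assumptions.
function slugFromDocumentName(name) {
  return name
    .normalize('NFKD')                // split accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/\.(idml|pdf)$/, '')     // drop a trailing source extension
    .replace(/[^a-z0-9]+/g, '-')      // collapse everything else to hyphens
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

console.log(slugFromDocumentName('Spring Brochure 2024.idml')); // spring-brochure-2024
```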

## How frames become blocks

Each spread is laid out top-to-bottom in reading order, then mapped to core
blocks:

- **Text frames** → `core/heading` / `core/paragraph`, grouped in a
`core/group`. A run's paragraph style decides the role: `Heading N` →
`core/heading` at level N; `Body`/`Caption`/etc. → `core/paragraph`. The
font-size, font-family, and text-color come from the **design tokens**
(preset slugs), never inline values, wherever the mapper produced one.
- **Image frames** → `core/image`, or `core/cover` when one or more text frames
sit on top of the image (a background with overlaid copy). Image URLs resolve
through `get_theme_file_uri()` so the theme stays relocatable (the
pattern-first rule — no broken `src=""` in markup).
- **Side-by-side frames** (overlapping vertical bands) → `core/columns`, one
`core/column` per frame, left to right.
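
The mapping rules above can be sketched as two small functions. This is an illustration under stated assumptions, not the generator's code: the frame shape `{ id, x, y, width, height }` and both function names are hypothetical (the real IR shape is defined in `ir.js`), and cover detection is left out.

```js
// Paragraph style name -> block role, following the `Heading N` rule above.
function blockForStyle(styleName) {
  const m = /^Heading\s*([1-6])$/i.exec(styleName.trim());
  return m
    ? { block: 'core/heading', level: Number(m[1]) }
    : { block: 'core/paragraph' };
}

// Group frames into rows: frames whose vertical bands overlap share a row.
// Hypothetical frame shape: { id, x, y, width, height }.
function groupIntoRows(frames) {
  const sorted = [...frames].sort((a, b) => a.y - b.y || a.x - b.x);
  const rows = [];
  for (const frame of sorted) {
    const row = rows[rows.length - 1];
    if (row && frame.y < row.bottom) {
      // Starts above the bottom of the current row: same vertical band.
      row.frames.push(frame);
      row.bottom = Math.max(row.bottom, frame.y + frame.height);
    } else {
      rows.push({ bottom: frame.y + frame.height, frames: [frame] });
    }
  }
  return rows.map((row) => {
    const ordered = [...row.frames].sort((a, b) => a.x - b.x); // left to right
    return {
      block: ordered.length > 1 ? 'core/columns' : 'single',
      frames: ordered.map((f) => f.id),
    };
  });
}

const rows = groupIntoRows([
  { id: 'right', x: 320, y: 110, width: 280, height: 200 },
  { id: 'title', x: 20, y: 20, width: 580, height: 60 },
  { id: 'left', x: 20, y: 100, width: 280, height: 200 },
]);
console.log(rows);
console.log(blockForStyle('Heading 2'), blockForStyle('Caption'));
```

With the sample frames, `title` forms its own row and `left`/`right` share a `core/columns` row. A real implementation would also need to tell side-by-side frames apart from text frames stacked on an image, which is the `core/cover` case.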

### Template parts from masters

A master spread's repeating chrome is split by vertical position: text in the
top band becomes `parts/header.html`, and text in the bottom band becomes
`parts/footer.html`, where print chrome such as running heads and page numbers
is reduced to a web footer line. When no usable master exists, the generator
emits FSE defaults instead (site title + navigation for the header, a copyright
line for the footer) and flags the substitution in the report.
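
A rough sketch of the band split, under assumptions: the frame shape `{ text, y, height }`, the function name, and the 15% band size are all hypothetical choices for this example; the source does not state how the bands are sized.

```js
// Hypothetical: split master-spread text frames into header/footer chrome
// by vertical position. `band` (a fraction of page height) is an assumption.
function splitMasterChrome(textFrames, pageHeight, band = 0.15) {
  const header = [];
  const footer = [];
  for (const frame of textFrames) {
    const centre = frame.y + frame.height / 2;
    if (centre <= pageHeight * band) {
      header.push(frame.text);
    } else if (centre >= pageHeight * (1 - band)) {
      footer.push(frame.text);
    }
    // Frames in the middle of the page are not treated as chrome.
  }
  // With nothing found, a caller would fall back to the FSE defaults.
  const usedDefaults = header.length === 0 && footer.length === 0;
  return { header, footer, usedDefaults };
}

const chrome = splitMasterChrome(
  [
    { text: 'Acme Quarterly', y: 20, height: 20 },
    { text: 'Body placeholder', y: 300, height: 100 },
    { text: 'Page 3', y: 770, height: 16 },
  ],
  800,
);
console.log(chrome);
```

On an 800-unit page this puts `Acme Quarterly` in the header band and `Page 3` in the footer band, and ignores the mid-page frame.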

### Assets

The generator works from the IR, which carries image *references*, not bytes.
So every image frame gets a deterministic staged filename
(`assets/spread-N-image-K.ext`), and `bin/import-media.sh` imports whatever
lands in `assets/`. Pass `--asset-dir` to have the CLI copy the real bytes in
(matched by basename); otherwise the report lists the expected filenames as a
follow-up.
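
The staged naming scheme can be sketched as below. The function name, the 1-based numbering of images, and the `bin` fallback extension are assumptions for illustration; only the `assets/spread-N-image-K.ext` pattern comes from the text above.

```js
// Hypothetical helper: build the staged path for an image frame.
// `sourceRef` is the image reference carried by the IR (a path or link name).
function stagedAssetPath(spreadNumber, imageNumber, sourceRef) {
  const base = sourceRef.split(/[\\/]/).pop(); // basename, either slash style
  const dot = base.lastIndexOf('.');
  // Assumption: fall back to "bin" when the reference has no extension.
  const ext = dot > 0 ? base.slice(dot + 1).toLowerCase() : 'bin';
  return `assets/spread-${spreadNumber}-image-${imageNumber}.${ext}`;
}

console.log(stagedAssetPath(2, 1, 'Links/Hero Shot.JPG')); // assets/spread-2-image-1.jpg
```

Because the name depends only on the spread number, the image's position, and the source extension, the same IR always stages the same filenames.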

## Acceptance criteria

| Criterion (#65) | How it's met |
| --- | --- |
| End-to-end on a fixture `.idml` produces a theme that loads in the Site Editor without PHP errors. | `generate-theme.mjs` writes a full theme dir; PHP files are a standard header + bootstrap and block markup with only `esc_url( get_theme_file_uri() )` interpolation. |
| ≥1 pattern per spread appears under an "InDesign Imports" category. | One `patterns/spread-N.php` per spread, each `Categories: indesign-imports`; `functions.php` registers the category with the label **InDesign Imports**. |
| `theme.json` round-trips through validation. | It is the token mapper's `merged` output, already validated against the WordPress schema (ajv + zod). |
| Report enumerates produced files and unmapped IR nodes. | `indesign-pipeline-report.md` — see the **Produced artifacts** and **Unmapped IR nodes** sections. |
| Snapshot tests cover two fixture spreads (text-heavy, image-heavy). | `tests/indesign/generate.test.mjs` snapshots `patterns/spread-1.php` (text) and `spread-2.php` (image), stored in `tests/indesign/__snapshots__/`. |
| Patterns are deterministic given the same IR. | `generateTheme` is pure; a determinism test asserts two runs are byte-identical. |
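
For orientation, the header of a generated pattern file might be assembled as in the sketch below. WordPress reads `Title`, `Slug`, and `Categories` from a pattern file's leading comment; the `Categories: indesign-imports` value comes from the table above, while the function name and the exact `Title` and `Slug` values are assumptions.

```js
// Hypothetical helper: build the PHP header comment for patterns/spread-N.php.
// Title and Slug formats are illustrative; only the category is from the doc.
function patternHeader({ themeSlug, spreadNumber }) {
  return [
    '<?php',
    '/**',
    ` * Title: Spread ${spreadNumber}`,
    ` * Slug: ${themeSlug}/spread-${spreadNumber}`,
    ' * Categories: indesign-imports',
    ' */',
    '?>',
  ].join('\n');
}

const header = patternHeader({ themeSlug: 'brochure', spreadNumber: 1 });
console.log(header);
```

The block markup for the spread would follow this header in the same file.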

## Testing

```bash
pnpm --filter @flavian/pipeline test

# Re-record the markup snapshots after an intentional change:
UPDATE_SNAPSHOTS=1 node --test packages/pipeline/tests/indesign/generate.test.mjs
```
75 changes: 62 additions & 13 deletions packages/pipeline/README.md
@@ -4,7 +4,7 @@ Conversion pipeline for InDesign (and future) sources into WordPress FSE themes.

## Status

This package ships the **IDML parser** (sub-issue #62), the **PDF fallback parser** (sub-issue #63), and the **style + token mapper** (sub-issue #64) of the InDesign-to-WordPress epic. The two parsers emit the same intermediate representation; the mapper turns that IR into WordPress design tokens (a `theme.json` partial, DTCG `design-tokens.json`, and a report). The output generator (#65) will land as a separate PR.
This package ships the **IDML parser** (sub-issue #62), the **PDF fallback parser** (sub-issue #63), the **style + token mapper** (sub-issue #64), and the **output generator** (sub-issue #65) of the InDesign-to-WordPress epic. The two parsers emit the same intermediate representation; the mapper turns that IR into WordPress design tokens (a `theme.json` partial, DTCG `design-tokens.json`, and a report); the output generator turns the IR plus those tokens into a complete, installable FSE theme (patterns, templates, parts, merged `theme.json`, asset scripts, and a generation report).

IDML is the primary path (full access to stories, frames, styles, swatches, masters). PDF is a lossy fallback for when only the exported PDF is available, or as a verification source against IDML output — see [`docs/pipeline/indesign-pdf-fidelity.md`](../../docs/pipeline/indesign-pdf-fidelity.md).

@@ -15,7 +15,8 @@ packages/pipeline/
├── bin/
│ ├── parse-idml.mjs CLI: IDML → validated IR JSON on stdout
│ ├── parse-pdf.mjs CLI: PDF → reconstructed IR JSON on stdout
│ └── map-tokens.mjs CLI: IR (or .idml/.pdf) → theme.json + design tokens
│ ├── map-tokens.mjs CLI: IR (or .idml/.pdf) → theme.json + design tokens
│ └── generate-theme.mjs CLI: IR (or .idml/.pdf) → complete FSE theme directory
├── config/
│ └── font-map.json InDesign family → web/Google font fallback table
└── src/
@@ -41,17 +42,29 @@ packages/pipeline/
│ ├── color.js Re-exports the shared color helpers for the PDF path
│ ├── png.js Decoded pixels → PNG via node:zlib (pure)
│ └── assets.js Write extracted images to the asset cache
└── map/ IR → WordPress design tokens (token mapper)
├── index.js mapTokens orchestrator → { partial, designTokens, merged, report }
├── colors.js Swatches → color palette (convert, dedupe, reuse base)
├── typography.js Paragraph styles → font-size scale + element/block presets
├── spacing.js Geometry + paragraph spacing → quantized spacing scale
├── fonts.js Fonts → font families via config/font-map.json
├── theme-json.js Assemble partial, deep-merge with base, validate
├── design-tokens.js DTCG / Style Dictionary emitter
├── report.js Warnings + provenance aggregation
├── slug.js Namespaced slug helpers
└── schema/ Vendored WP theme.json schema + zod subset
├── map/ IR → WordPress design tokens (token mapper)
│ ├── index.js mapTokens orchestrator → { partial, designTokens, merged, report }
│ ├── colors.js Swatches → color palette (convert, dedupe, reuse base)
│ ├── typography.js Paragraph styles → font-size scale + element/block presets
│ ├── spacing.js Geometry + paragraph spacing → quantized spacing scale
│ ├── fonts.js Fonts → font families via config/font-map.json
│ ├── theme-json.js Assemble partial, deep-merge with base, validate
│ ├── design-tokens.js DTCG / Style Dictionary emitter
│ ├── report.js Warnings + provenance aggregation
│ ├── slug.js Namespaced slug helpers
│ └── schema/ Vendored WP theme.json schema + zod subset
└── generate/ IR + tokens → installable FSE theme (output generator)
├── index.js generateTheme orchestrator → { files, assets, themeJson, report }
├── layout.js Spread frames → reading-order rows + column/cover detection
├── blocks.js Frames → core block markup (heading/paragraph/image/cover)
├── patterns.js Spread → pattern PHP file (one per spread)
├── parts.js Master spreads → header/footer template parts
├── templates.js index.html / page.html / 404.html
├── theme-files.js style.css + functions.php
├── media.js Asset staging plan + import-media.sh / seed-content.sh
├── report.js indesign-pipeline-report.md (markdown)
├── escape.js HTML/PHP escaping + get_theme_file_uri() helpers
└── slugs.js Deterministic theme/pattern/asset naming
```

## Quick start
@@ -130,6 +143,40 @@ node packages/pipeline/bin/map-tokens.mjs brochure.idml --out-dir ./tokens

See [`docs/pipeline/indesign-token-mapper.md`](../../docs/pipeline/indesign-token-mapper.md) for the conversion math (CMYK/LAB → sRGB), the font-map format, the warning codes, merge semantics, and how the acceptance criteria are met.

## Output generator

Turn a parsed IR (plus the mapped tokens) into a complete, installable FSE
theme: one block pattern per spread under an **InDesign Imports** category,
templates and header/footer parts, the merged `theme.json`, asset import
scripts, and a generation report. `generateTheme` is pure and deterministic —
the same IR yields byte-identical files, so reruns never churn.

```js
import { parseIdml, generateTheme } from '@flavian/pipeline';

const ir = await parseIdml('./brochure.idml');
const { files, assets, report } = generateTheme(ir, {
  // slug, name,          // theme slug / display name (default: from the doc name)
  // seedContent: true,   // also emit bin/seed-content.sh
  // tokens,              // a precomputed mapTokens() result (skips re-mapping)
});

for (const f of files) console.log(f.path); // [{ path, contents, mode? }]
```

From the command line (composes with the parser CLIs, or parses directly):

```bash
node packages/pipeline/bin/parse-idml.mjs brochure.idml \
  | node packages/pipeline/bin/generate-theme.mjs - --out-dir themes/brochure

# or parse + generate in one step, staging image bytes from a source dir
node packages/pipeline/bin/generate-theme.mjs brochure.idml \
  --out-dir themes/brochure --asset-dir ./images --seed-content
```

See [`docs/pipeline/indesign-output-generator.md`](../../docs/pipeline/indesign-output-generator.md) for the frame → block mapping, master → parts derivation, asset staging, and how the acceptance criteria are met.

## IR shape

The intermediate representation is described in [`src/indesign/ir.js`](src/indesign/ir.js). At the top level:
@@ -170,6 +217,8 @@ Tests build minimal fixtures programmatically — no binary fixtures in git. `te

The PDF heuristics (clustering, classification, color, PNG encoding) are split into pure modules under `src/indesign/pdf/` and unit-tested without a PDF engine; only `extract.js` and the orchestrator touch pdfjs.

The output generator's markup is covered by snapshot tests in `tests/indesign/generate.test.mjs`; the snapshots live in `tests/indesign/__snapshots__/`. Re-record them after an intentional change with `UPDATE_SNAPSHOTS=1 node --test tests/indesign/generate.test.mjs`.

## Adding a new input format

When sub-issues for Figma / Canva migrations land, mirror the InDesign layout: a sibling directory under `src/`, its own IR schema, and a `parsers/` subdir for any input-format-specific decoders. The top-level `src/index.js` re-exports each surface so consumers `import { parseIdml, parseFigma } from '@flavian/pipeline'`.