Summary
The current Content_Parser interface and atmosphere_content_parser filter assume that a single parser produces a single content record per post. The content field of site.standard.document is a union that accepts multiple lexicon types. Long-term we likely want to support more than just at.markpub.markdown (e.g. HTML, plain text, or whatever standard.site adds next), and possibly emit more than one representation per post.
This issue tracks design-and-discuss before a second parser implementation lands and forces a breaking interface change.
Current Shape
includes/content-parser/interface-content-parser.php — parse() returns a single ?array shaped for one lexicon type. get_type() returns one NSID.
includes/class-atmosphere.php:50 — registers a single Markpub parser via atmosphere_content_parser filter.
includes/transformer/class-document.php:172 — fetches one parser, calls parse(), sets content to the single result.
Constraints this imposes:
- Only one parser can win the filter (last-wins). Two third-party parsers can't coexist.
- Document can't pick a parser based on the post (e.g. classic content → HTML parser; block content → Markpub).
- No path to emit multiple representations even though the union allows it.
Discussion Points
Sketching a few directions — not picking one yet:
A. Registry of parsers by NSID
Replace the single-parser filter with a registry. Parsers register against their get_type(). Document picks the preferred one per post (config or filter hook), or iterates and emits all that produce non-null output.
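To make direction A concrete, here is a minimal sketch of a registry keyed by NSID. It assumes a simplified Content_Parser whose parse() takes a post ID; the real signature in interface-content-parser.php may differ. Parser_Registry and parse_all() are hypothetical names used only for illustration.

```php
<?php
// Simplified stand-in for the plugin's interface (assumption: the real
// parse() signature may take a WP_Post or other arguments instead).
interface Content_Parser {
    public function get_type(): string;            // NSID, e.g. "at.markpub.markdown"
    public function parse( int $post_id ): ?array; // null means "nothing to emit"
}

// Hypothetical registry: one parser per NSID, so parsers for different
// NSIDs coexist; registering the same NSID twice overwrites the earlier entry.
final class Parser_Registry {
    /** @var array<string, Content_Parser> */
    private array $parsers = array();

    public function register( Content_Parser $parser ): void {
        $this->parsers[ $parser->get_type() ] = $parser;
    }

    public function get( string $nsid ): ?Content_Parser {
        return $this->parsers[ $nsid ] ?? null;
    }

    /**
     * Run every registered parser and keep the non-null results, keyed by NSID.
     *
     * @return array<string, array>
     */
    public function parse_all( int $post_id ): array {
        $results = array();
        foreach ( $this->parsers as $nsid => $parser ) {
            $record = $parser->parse( $post_id );
            if ( null !== $record ) {
                $results[ $nsid ] = $record;
            }
        }
        return $results;
    }
}
```

With this shape, Document could either call get() with a preferred NSID chosen per post, or call parse_all() and decide what to do with several results. Which of the two applies depends on the open question about whether the union accepts more than one representation.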
B. Per-NSID filter pattern
Keep the filter shape but namespace it: atmosphere_content_parser_at_markpub_markdown, atmosphere_content_parser_at_html (or similar). Document iterates across known NSIDs.
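For direction B, the hook name has to be derived from the NSID somehow. The sketch below shows one possible mapping; atmosphere_parser_hook_name() is a hypothetical helper, not an existing function in the plugin.

```php
<?php
// Hypothetical helper: derive a per-NSID hook name from an NSID.
function atmosphere_parser_hook_name( string $nsid ): string {
    // Lowercase, then replace every run of characters that is not a
    // lowercase letter, digit, or underscore with a single "_".
    $suffix = preg_replace( '/[^a-z0-9_]+/', '_', strtolower( $nsid ) );

    return 'atmosphere_content_parser_' . $suffix;
}

// Prints "atmosphere_content_parser_at_markpub_markdown".
echo atmosphere_parser_hook_name( 'at.markpub.markdown' ), "\n";
```

Document would then call apply_filters() once per known NSID with the derived name. One thing to weigh: this mapping is lossy, since dots and hyphens both collapse to underscores, so two distinct NSIDs could in principle map to the same hook.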
C. Parser produces canonical IR; Document projects
Parser returns a structured intermediate (e.g. block tree) and a list of supported output types. Document negotiates and projects. Most flexible, biggest refactor.
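A rough sketch of direction C, assuming a deliberately tiny IR (a flat list of blocks rather than a real block tree). Projector, Markdown_Projector, atmosphere_negotiate(), and the 'text' field in the output are all hypothetical; in particular the output array is a placeholder and not the actual at.markpub.markdown schema.

```php
<?php
// Hypothetical projector: turns the shared IR into one lexicon type.
interface Projector {
    public function get_type(): string;              // NSID this projector emits
    public function project( array $blocks ): array; // IR in, content record out
}

final class Markdown_Projector implements Projector {
    public function get_type(): string {
        return 'at.markpub.markdown';
    }

    public function project( array $blocks ): array {
        $lines = array();
        foreach ( $blocks as $block ) {
            // Headings get a "# " prefix; every other kind is emitted as plain text.
            $lines[] = 'heading' === $block['kind'] ? '# ' . $block['text'] : $block['text'];
        }

        // Placeholder record shape, not the real lexicon schema.
        return array(
            '$type' => $this->get_type(),
            'text'  => implode( "\n\n", $lines ),
        );
    }
}

// Negotiation: return the first projector matching the preferred NSID order.
function atmosphere_negotiate( array $projectors, array $preferred ): ?Projector {
    foreach ( $preferred as $nsid ) {
        foreach ( $projectors as $projector ) {
            if ( $projector->get_type() === $nsid ) {
                return $projector;
            }
        }
    }
    return null;
}
```

Here the parser's only job is to produce the IR; adding a new output format means adding a projector, without touching the parser. The cost is that every existing parser has to be rewritten to emit the IR first.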
Open Questions
- Does standard.site already define more than one content format today, or is this purely future-proofing?
- Should the post author be able to override which format publishes (per-post meta), or is this a site-wide config?
- Does the union field accept an array (multiple representations) or strictly one of N?
- Where does language/locale or accessibility metadata live — per-parser or shared?
Out of Scope
- Implementing a second parser. This issue is interface-design only; the next parser PR consumes whatever shape lands here.
- Changing the Markpub parser's behavior. Markpub keeps producing at.markpub.markdown regardless of the chosen registry shape.
Context
Raised by @pfefferle in #9 (comment) ("we should maybe find a more generic way, to also transform into the other standard.site content formats").
Markpub (#9) is shipping with the current single-parser interface; this issue captures the follow-up so the design is settled before a second parser implementation forces our hand.