70 changes: 70 additions & 0 deletions docs/modules/ROOT/pages/generators.adoc
@@ -200,6 +200,76 @@ Unknown top-level keys are silently ignored so future schema additions stay non-
If the same id appears under more than one addon root, the first one wins: that root's manifest sets the format's escape rules.
Later roots can still contribute layered partials and helpers under the same id through the existing template-loading path, so a project can supplement a shared format without redefining it.

To add a generator that builds its output structure imperatively, rather than rendering one page per symbol from templates, see <<script-driven-generators,Script-driven generators>>.

[#script-driven-generators]
== Script-driven generators

A data-driven generator renders one page per symbol from templates. Some outputs need a different structure: one file per namespace, or a single artifact aggregated across every symbol, such as a search index. A template generator cannot express these, because the host fixes the page-per-symbol shape. A script-driven generator hands the whole emit to a Lua or JavaScript script, which traverses the corpus and writes whatever files it wants. No C++ and no templates are involved.

A generator directory is script-driven when its `mrdocs-generator.yml` names an entry script:

[source,yaml]
----
script: generator.lua
----

The `script` key holds a path to a Lua (`.lua`) or JavaScript (`.js`) file, relative to the generator directory. Naming a script is what distinguishes the two flavors: a manifest with a `script` key is script-driven, otherwise the directory is a data-driven (template) generator. As with template generators, the directory name is the generator id you select with `--generator`.

=== The `generate` entry point

The script defines a single entry point:

[source]
----
generate(corpus, output)
----

`corpus.symbols` is the array of every symbol. Each symbol carries the same fields the template and helper layers see, plus a flat `_id` string suitable as a stable per-symbol URL fragment.

`output.write(relativePath, contents)` writes one file. The path is resolved under the output directory and may not escape it; an absolute path or one that climbs above the output directory is rejected. Parent directories are created as needed.
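
The containment rule can be pictured with a small JavaScript sketch. This is only an illustration of the rule as described, not the host's actual check: the function name `staysInsideOutputDir` is hypothetical, and whether the host accepts a path such as `a/../b` (which dips and returns without leaving the output directory) is an assumption made here.

```javascript
// Illustration only: mirrors the documented rule, not the host's real check.
// `staysInsideOutputDir` is a hypothetical name.
function staysInsideOutputDir(relativePath) {
  // Absolute paths: a leading separator or a drive prefix such as "C:".
  if (/^[\\\/]/.test(relativePath) || /^[A-Za-z]:/.test(relativePath)) {
    return false;
  }
  var depth = 0;
  var parts = relativePath.split(/[\\\/]+/);
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i];
    if (part === "" || part === ".") {
      continue; // no movement
    }
    if (part === "..") {
      depth -= 1;
      if (depth < 0) {
        return false; // climbed above the output directory
      }
    } else {
      depth += 1;
    }
  }
  return true;
}
```

Under these assumptions, `search-index.json` and `ns/detail/index.html` pass, while `/etc/passwd` and `../outside.txt` do not.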

Because the script owns the output, it also owns what a per-page generator would otherwise do for it: the URLs it emits, and any escaping of the content it writes. The host does not apply an escape map to a script-driven generator's output.
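
As a rough sketch of what owning the escaping means, a script that writes HTML might carry its own escaper. The helper names `escapeHtml` and `renderListItem` are hypothetical; only the `name` and `_id` symbol fields come from this section.

```javascript
// Minimal HTML escaper. The host applies no escape map to script output,
// so the script escapes what it writes. Hypothetical helper name.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one list item for a symbol, escaping both the anchor and the label.
function renderListItem(sym) {
  return '<li id="' + escapeHtml(sym._id) + '">' +
    escapeHtml(sym.name) + "</li>";
}
```

The ampersand is replaced first so that the entities produced by the later replacements are not escaped a second time.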

In Lua, `generate` may be the value the script returns or a global function; in JavaScript it is a global function:

[source,lua]
----
return function(corpus, output)
    -- ...
end
----
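
A JavaScript counterpart might look like the following sketch. It assumes `corpus.symbols` is exposed to JavaScript as an ordinary array; the output file name `summary.txt` is hypothetical.

```javascript
// Declared at the top level of the script, so it is a global function.
function generate(corpus, output) {
  // Assumes corpus.symbols behaves like a regular JavaScript array.
  // "summary.txt" is a made-up file name for illustration.
  output.write("summary.txt", corpus.symbols.length + " symbols\n");
}
```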

Unlike a corpus-transform extension, whose hook is optional, a generator must define a `generate` function: selecting the generator is a request for output, so a missing entry point is an error.

=== Example: a search index

This generator emits a single `search-index.json` aggregating every symbol, an artifact no per-page generator can produce:

[source,lua]
----
-- Quote a string as a JSON value.
local function json_string(s)
    s = s:gsub('\\', '\\\\'):gsub('"', '\\"')
    -- JSON forbids raw control characters inside strings.
    s = s:gsub('%c', function(c)
        return string.format('\\u%04x', c:byte())
    end)
    return '"' .. s .. '"'
end

return function(corpus, output)
    local entries = {}
    for _, sym in ipairs(corpus.symbols) do
        local name = sym.name or ""
        if name ~= "" then
            entries[#entries + 1] =
                '{"name":' .. json_string(name) ..
                ',"url":' .. json_string(sym._id .. ".html") .. "}"
        end
    end
    output.write(
        "search-index.json",
        "[" .. table.concat(entries, ",") .. "]")
end
----
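
The same index could be sketched in JavaScript. This version assumes that the embedded JavaScript engine provides the standard `JSON` object and that `corpus.symbols` is an ordinary array; neither is confirmed here.

```javascript
// Sketch of the search-index generator in JavaScript.
// Assumes the host's engine exposes the standard JSON object.
function generate(corpus, output) {
  var entries = [];
  for (var i = 0; i < corpus.symbols.length; i++) {
    var sym = corpus.symbols[i];
    var name = sym.name || "";
    if (name !== "") {
      entries.push({ name: name, url: sym._id + ".html" });
    }
  }
  // JSON.stringify takes care of quoting and escaping.
  output.write("search-index.json", JSON.stringify(entries));
}
```

Building plain objects and serializing once at the end keeps the quoting logic out of the script.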

== Stylesheet Options

The HTML and AsciiDoc generators ship a bundled stylesheet that is inlined by default. You can replace or layer styles with the following options (available in config files and on the CLI):
2 changes: 1 addition & 1 deletion docs/mrdocs.schema.json
@@ -252,7 +252,7 @@
 },
 "generator": {
 "default": "adoc",
-"description": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, and `xml`; data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/.",
+"description": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, and `xml`; data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/; script-driven generators instead ship a Lua or JavaScript script that produces the output.",
"title": "Generator used to create the documentation",
"type": "string"
},
2 changes: 1 addition & 1 deletion src/lib/ConfigOptions.json
@@ -397,7 +397,7 @@
 {
 "name": "generator",
 "brief": "Generator used to create the documentation",
-"details": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, and `xml`; data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/.",
+"details": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, and `xml`; data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/; script-driven generators instead ship a Lua or JavaScript script that produces the output.",
"type": "string",
"default": "adoc"
},
195 changes: 195 additions & 0 deletions src/lib/Gen/GeneratorManifest.cpp
@@ -0,0 +1,195 @@
//
// Licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// Copyright (c) 2026 Gennaro Prota (gennaro.prota@gmail.com)
//
// Official repository: https://github.com/cppalliance/mrdocs
//

#include "GeneratorManifest.hpp"
#include <mrdocs/Support/Path.hpp>
#include <llvm/ADT/SmallString.h>
#include <llvm/Support/Casting.h>
#include <llvm/Support/SourceMgr.h>
#include <llvm/Support/YAMLParser.h>
#include <filesystem>

namespace mrdocs {

namespace {

// Read a scalar node into an owned string.
std::string
scalarText(llvm::yaml::ScalarNode& node)
{
    llvm::SmallString<32> buf;
    llvm::StringRef const text = node.getValue(buf);
    return std::string(text.data(), text.size());
}

// Parse a YAML mapping whose entries are non-empty byte-sequence keys
// mapped to replacement strings. An empty key is a hard error.
Expected<void>
parseEscape(
    llvm::yaml::MappingNode& node,
    GeneratorManifest& manifest,
    std::string_view yamlPath)
{
    for (llvm::yaml::KeyValueNode& entry : node)
    {
        llvm::yaml::ScalarNode* const keyNode =
            llvm::dyn_cast_or_null<llvm::yaml::ScalarNode>(entry.getKey());
        llvm::yaml::ScalarNode* const valNode =
            llvm::dyn_cast_or_null<llvm::yaml::ScalarNode>(entry.getValue());
        if (!keyNode || !valNode)
        {
            return Unexpected(formatError(
                "{}: each 'escape' entry must be a scalar->scalar mapping",
                yamlPath));
        }
        std::string key = scalarText(*keyNode);
        if (key.empty())
        {
            return Unexpected(formatError(
                "{}: escape key must not be empty", yamlPath));
        }
        manifest.escape.emplace_back(
            std::move(key), scalarText(*valNode));
    }
    return {};
}

// Dispatch a single top-level manifest key to its handler. Unknown keys
// are ignored so future schema additions stay non-breaking.
Expected<void>
parseTopLevelEntry(
    llvm::yaml::KeyValueNode& pair,
    GeneratorManifest& manifest,
    std::string_view yamlPath)
{
    llvm::yaml::ScalarNode* const keyNode =
        llvm::dyn_cast_or_null<llvm::yaml::ScalarNode>(pair.getKey());
    if (!keyNode)
    {
        return {};
    }
    llvm::SmallString<16> keyBuf;
    llvm::StringRef const key = keyNode->getValue(keyBuf);
    if (key == "escape")
    {
        llvm::yaml::MappingNode* const escNode =
            llvm::dyn_cast_or_null<llvm::yaml::MappingNode>(pair.getValue());
        if (!escNode)
        {
            return Unexpected(formatError(
                "{}: 'escape' must be a mapping", yamlPath));
        }
        return parseEscape(*escNode, manifest, yamlPath);
    }
    if (key == "script")
    {
        llvm::yaml::ScalarNode* const valNode =
            llvm::dyn_cast_or_null<llvm::yaml::ScalarNode>(pair.getValue());
        if (!valNode)
        {
            return Unexpected(formatError(
                "{}: 'script' must be a scalar", yamlPath));
        }
        manifest.script = scalarText(*valNode);
    }
    return {};
}

} // (anon)

Expected<GeneratorManifest>
loadGeneratorManifest(std::string_view yamlPath)
{
    MRDOCS_TRY(std::string text, files::getFileText(yamlPath));
    llvm::SourceMgr sm;
    llvm::yaml::Stream stream(text, sm);

    GeneratorManifest manifest;
    llvm::yaml::document_iterator docIt = stream.begin();
    if (docIt == stream.end())
    {
        return manifest;
    }
    llvm::yaml::Node* const rootNode = docIt->getRoot();
    if (rootNode == nullptr ||
        llvm::isa<llvm::yaml::NullNode>(rootNode))
    {
        // Empty document: a file with no content, only comments, or a
        // literal `null`. All of these mean "no rules".
        return manifest;
    }
    llvm::yaml::MappingNode* const root =
        llvm::dyn_cast<llvm::yaml::MappingNode>(rootNode);
    if (!root)
    {
        return Unexpected(formatError(
            "{}: top-level YAML node must be a mapping", yamlPath));
    }
    for (llvm::yaml::KeyValueNode& pair : *root)
    {
        MRDOCS_TRY(parseTopLevelEntry(pair, manifest, yamlPath));
    }
    return manifest;
}

namespace {

constexpr std::string_view metadataFileName = "mrdocs-generator.yml";

// Append every manifested subdirectory of `generatorDir` to `out`.
Expected<void>
scanGeneratorDir(
    std::string_view generatorDir,
    std::vector<DiscoveredManifest>& out)
{
    namespace fs = std::filesystem;
    std::error_code iterEc;
    fs::directory_iterator const end{};
    for (fs::directory_iterator it(generatorDir, iterEc);
         !iterEc && it != end;
         it.increment(iterEc))
    {
        std::error_code typeEc;
        if (!it->is_directory(typeEc))
        {
            continue;
        }
        std::string const dir = it->path().string();
        std::string const yamlPath = files::appendPath(
            dir, std::string(metadataFileName));
        if (!files::exists(yamlPath))
        {
            continue;
        }
        MRDOCS_TRY(GeneratorManifest manifest, loadGeneratorManifest(yamlPath));
        out.push_back(DiscoveredManifest{ dir, std::move(manifest) });
    }
    return {};
}

} // (anon)

Expected<std::vector<DiscoveredManifest>>
discoverGeneratorManifests(std::vector<std::string> const& roots)
{
    std::vector<DiscoveredManifest> out;
    for (std::string const& root : roots)
    {
        std::string const dir = files::appendPath(root, "generator");
        if (!files::exists(dir))
        {
            continue;
        }
        MRDOCS_TRY(scanGeneratorDir(dir, out));
    }
    return out;
}

} // mrdocs
104 changes: 104 additions & 0 deletions src/lib/Gen/GeneratorManifest.hpp
@@ -0,0 +1,104 @@
//
// Licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// Copyright (c) 2026 Gennaro Prota (gennaro.prota@gmail.com)
//
// Official repository: https://github.com/cppalliance/mrdocs
//

#ifndef MRDOCS_LIB_GEN_GENERATORMANIFEST_HPP
#define MRDOCS_LIB_GEN_GENERATORMANIFEST_HPP

#include <mrdocs/Support/Error.hpp>
#include <mrdocs/Support/Expected.hpp>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace mrdocs {

/** The parsed contents of a generator manifest.

    A manifest is the `mrdocs-generator.yml` that an addon directory
    under <root>/generator/<name>/ ships to declare a generator. The two
    generator flavors read disjoint fields of the same file:

    @li A data-driven (Handlebars) generator reads the escape rules.

    @li A script-driven generator reads the entry-file path.

    The presence of the `script` entry is what distinguishes the two: a
    manifest that names a `script` is a script-driven generator,
    otherwise it is data-driven.
*/
struct GeneratorManifest
{
    /** The entry file of a script-driven generator.

        Holds the value of the manifest's optional `script` key, a path
        relative to the generator directory. Disengaged (`std::nullopt`)
        when the manifest declares no `script`, in which case the
        directory is a data-driven generator.
    */
    std::optional<std::string> script;

    /** The escape rules of a data-driven generator.

        Each pair maps a byte-sequence source to its replacement string,
        in manifest order. Empty when no escape rules are declared.
    */
    std::vector<std::pair<std::string, std::string>> escape;
};

/** Parse a generator manifest into plain data.

    Read the file at `yamlPath` and return its contents. The file is
    expected to contain a top-level mapping. The optional `escape` key
    holds a sub-mapping from byte-sequence keys to replacement strings;
    keys may be one or more bytes long, and an empty key is a hard error.
    The optional `script` key holds the entry-file path as a scalar.
    Unknown top-level keys are ignored so future schema additions are
    non-breaking.

    An empty document (an empty file, comments only, or a literal `null`)
    yields an empty manifest.
*/
Expected<GeneratorManifest>
loadGeneratorManifest(std::string_view yamlPath);

/** A generator directory paired with its parsed manifest.
*/
struct DiscoveredManifest
{
    /** The generator directory, of the form <root>/generator/<name>.
    */
    std::string dir;

    /** The parsed contents of the directory's manifest.
    */
    GeneratorManifest manifest;
};

/** Find every addon generator directory that ships a manifest.

    For each addon root, walk the immediate subdirectories of
    <root>/generator/. A subdirectory is reported when it ships an
    `mrdocs-generator.yml`; the manifest is parsed and returned alongside
    its directory. Directories without a manifest (the built-in shared
    common/ is the canonical example) are skipped.

    The presence of a `script` entry distinguishes the two generator
    flavors, so a caller installs the flavor it owns and ignores the
    other. Roots are searched in order, so the result preserves addon
    precedence.
*/
Expected<std::vector<DiscoveredManifest>>
discoverGeneratorManifests(std::vector<std::string> const& roots);

} // mrdocs

#endif