trace-schemas: add namespace-driven schema generation and validation tooling #6527

Open
jutaro wants to merge 11 commits into master from jutaro/namespace_generation

jutaro commented Apr 13, 2026

Description

Summary

This PR introduces a namespace-driven trace schema workflow and the tooling around it.

It adds support for generating schema files from traced namespaces, validates generated schemas and trace logs, documents how overrides work, and wires the schema scripts into a standalone Cabal package so they can be run consistently from the dev environment. It also updates trace documentation generation to work with optional namespace lists and refreshes the generated trace-schema artifacts.

On the tracing side, it extends the source discovery and inference logic used by schema generation, improves handling of external trace-dispatcher definitions, and tightens warning reporting so schema generation only reports real unresolved cases instead of false positives.

Why this is hard

This cannot be solved by deriving JSON Schema directly from Haskell types, because the trace JSON is not determined by the types alone. The emitted payloads are shaped by forMachine implementations, helper functions, pattern matches, namespace mapping, verbosity/detail levels, and hand-written rendering logic across multiple packages. In practice, fields may be renamed, omitted, flattened, synthesized, or rendered as strings, so the runtime JSON contract often differs from the source type definition.
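
To illustrate the point, here is a minimal sketch of how a hand-written renderer can diverge from the record it renders. All names (`Detail`, `PeerEvent`, `Field`, `renderPeerEvent`) are hypothetical stand-ins that use only `base`; they are not the real trace-dispatcher types or any `forMachine` instance from this PR.

```haskell
import Data.Word (Word64)

-- Hypothetical detail level; a stand-in, not the trace-dispatcher type.
data Detail = Minimal | Detailed
  deriving (Eq, Show)

-- Hypothetical traced message.
data PeerEvent = PeerEvent
  { peerAddress :: String
  , peerSlot    :: Word64
  , peerRetries :: Int
  } deriving Show

-- Simplified JSON-like value so the sketch needs no extra packages.
data Field = FString String | FNumber Integer
  deriving (Eq, Show)

-- Hand-written rendering: one key is synthesized, one is renamed, one
-- number is rendered as a string, and one field only appears at the
-- higher detail level. None of this is visible in the record definition.
renderPeerEvent :: Detail -> PeerEvent -> [(String, Field)]
renderPeerEvent detail ev =
  [ ("kind", FString "PeerEvent")           -- synthesized
  , ("peer", FString (peerAddress ev))      -- renamed from peerAddress
  , ("slot", FString (show (peerSlot ev)))  -- number rendered as string
  ] ++
  [ ("retries", FNumber (toInteger (peerRetries ev)))
  | detail == Detailed                      -- omitted at Minimal
  ]
```

A schema derived from `PeerEvent` alone would predict the keys `peerAddress`, `peerSlot`, and `peerRetries`, none of which match what this renderer emits at `Minimal`.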

Main changes

  • Add bench/trace-schemas as the home for generated schemas, metadata, documentation, presentations, and helper docs.
  • Add schema-generation tooling:
    • GhciSchemaGen.hs
    • ValidateTraceSchemas.hs
    • ValidateTraceLog.hs
    • ApplySchemaOverrides.hs
    • CheckOverrideCoverage.hs
    • RegenerateTraceSchemas.sh
  • Add trace-schema-gen.cabal and wire it into cabal.project.
  • Add override documentation and examples for human-maintained schema patches.
  • Update cardano-node trace documentation generation to support optional namespace selection.
  • Improve namespace/source resolution and schema inference across cardano-node, trace-dispatcher, and related tracing modules.
  • Regenerate newNamespaces.txt, trace-documentation.md, and schema artifacts from the new pipeline.
  • Guard destructive override application by default.
  • Tighten schema-generation warnings so only genuinely empty/unresolved final schemas are reported.

Testing

  • Ran schema generation through nix develop.
  • Ran override application and check scripts.
  • Verified the updated generator completes with Schema generation problems: 0.

jutaro self-assigned this Apr 13, 2026
jutaro force-pushed the jutaro/namespace_generation branch from 9b37275 to 8913d02 April 20, 2026 11:32
jutaro marked this pull request as ready for review April 21, 2026 09:36
jutaro requested review from a team as code owners April 21, 2026 09:36
jutaro requested review from Icelandjack and removed request for a team April 21, 2026 09:36
mgmeier left a comment

Please condense and clean up the commit history for this PR.

Comment thread cabal.project Outdated
Comment thread cardano-node/src/Cardano/Node/Tracing/Tracers/Diffusion.hs Outdated
Comment thread .codex Outdated
Comment thread bench/trace-schemas/trace-documentation.md Outdated
Comment thread bench/trace-schemas/trace-schemas-presentation.odp Outdated
Comment thread bench/trace-schemas/trace-schemas-presentation.pptx Outdated
Comment thread bench/trace-schemas/trace-schemas-speaker-notes.md Outdated
Comment thread trace-dispatcher/src/Cardano/Logging/DocuGenerator.hs Outdated
mgmeier left a comment

Testing feedback: You've already declared the various scripts as executables in bench/trace-schemas/scripts/schema-gen/trace-schema-gen.cabal, which means they're part of the nix flake.

So instead of building a full dev shell with all dependencies, and running the script via bash / runghc, you should switch the new Make targets to something like nix run .#schema-gen.

jutaro force-pushed the jutaro/namespace_generation branch from 6dd06b1 to 93fa1f5 April 22, 2026 10:18
readJsonFile :: FilePath -> IO A.Value
readJsonFile fp = do
  bs <- BL.readFile fp
  case A.eitherDecode bs of
Contributor

Please use A.eitherDecodeFileStrict'

Contributor Author

Done
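
For reference, a minimal sketch of what the suggested change could look like. The excerpt above is cut off before the `case` alternatives, so the error handling here (exiting via `die` with the file path prefixed) is an assumption, not the code that landed in the PR.

```haskell
import qualified Data.Aeson as A
import System.Exit (die)

-- Read and decode a JSON file in one step. On a parse failure, exit
-- with the decoder's message prefixed by the offending path.
-- (The failure behaviour is assumed for illustration.)
readJsonFile :: FilePath -> IO A.Value
readJsonFile fp = do
  result <- A.eitherDecodeFileStrict' fp
  case result of
    Left err -> die (fp ++ ": " ++ err)
    Right v  -> pure v
```

`A.eitherDecodeFileStrict'` reads the file strictly and returns `Either String a`, so the separate `BL.readFile` step is no longer needed.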


parseArgs :: Config -> [String] -> IO Config
parseArgs = go
  where
Contributor

I'm aware it's a standalone script... but it still could use optparse-applicative for CLI parsing.
Only a suggestion - the approach taken here is sufficient for the scope.

Contributor Author

Done
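
A rough sketch of the optparse-applicative approach, for readers unfamiliar with it. The script's real `Config` fields are not visible in this excerpt, so the record, the option names (`--schema-dir`, `--verbose`), and the defaults below are purely hypothetical.

```haskell
import Options.Applicative

-- Hypothetical configuration; the real Config in the PR is not shown.
data Config = Config
  { cfgSchemaDir :: FilePath
  , cfgVerbose   :: Bool
  } deriving (Eq, Show)

-- One parser per field, combined applicatively.
configParser :: Parser Config
configParser = Config
  <$> strOption
        ( long "schema-dir"
       <> metavar "DIR"
       <> value "schemas"
       <> showDefault
       <> help "Directory containing the schema files" )
  <*> switch
        ( long "verbose"
       <> help "Print extra diagnostics" )

-- Adds --help and a short program description.
cliInfo :: ParserInfo Config
cliInfo = info (configParser <**> helper)
  (fullDesc <> progDesc "Validate trace schemas (illustrative CLI)")

-- Parse the process arguments, printing usage and exiting on error.
parseConfig :: IO Config
parseConfig = execParser cliInfo
```

Compared with a hand-rolled `go` loop over `[String]`, this gives `--help` output and error messages for free, at the cost of one extra dependency.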


splitPath :: FilePath -> [FilePath]
splitPath path = filter (not . null) (go path)
  where
Contributor

Please use splitPath or splitDirectories from System.FilePath.

Contributor Author

Done
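
A short sketch of the difference between the two library functions. The hand-rolled version above drops separators and empty components, which appears closer to `splitDirectories` than to the library's `splitPath` (which keeps trailing separators). `pathComponents` and `inDirectory` are hypothetical helper names used only for illustration.

```haskell
import System.FilePath (splitDirectories)

-- Components without separators, suitable for comparing directory names.
pathComponents :: FilePath -> [FilePath]
pathComponents = splitDirectories

-- Hypothetical helper: does the path pass through a directory (or end
-- in a component) with exactly this name?
inDirectory :: String -> FilePath -> Bool
inDirectory dir path = dir `elem` splitDirectories path
```

On a POSIX system, `splitDirectories "bench/trace-schemas/scripts"` gives `["bench","trace-schemas","scripts"]`, whereas `System.FilePath.splitPath` on the same input gives `["bench/","trace-schemas/","scripts"]`.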

(a, []) -> [a]
(a, _:rest) -> a : go rest

joinPath :: [FilePath] -> FilePath
Contributor

joinPath is defined in System.FilePath.

Contributor Author

Done

joinPath [] = ""
joinPath (x:xs) = foldl (</>) x xs

replaceFileName :: FilePath -> FilePath -> FilePath
Contributor

replaceFileName is defined in System.FilePath.

Contributor Author

Done
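
As a small illustration of the library versions of `joinPath` and `replaceFileName`, here is a sketch with two hypothetical helpers (`dropLeading` and `siblingFile` are invented names, not functions from the PR).

```haskell
import System.FilePath (joinPath, replaceFileName, splitDirectories)

-- Hypothetical helper: drop the first n directory components of a path.
dropLeading :: Int -> FilePath -> FilePath
dropLeading n = joinPath . drop n . splitDirectories

-- Hypothetical helper: name a file that sits next to an existing one.
siblingFile :: FilePath -> FilePath -> FilePath
siblingFile = replaceFileName
```

For example, on POSIX `dropLeading 1 "bench/trace-schemas/meta.json"` yields `"trace-schemas/meta.json"`, and `siblingFile "schemas/a.json" "b.json"` yields `"schemas/b.json"`.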


-- Read file for quick text parsing; tolerate missing files.
readFileSafe :: FilePath -> String
readFileSafe fp = unsafePerformIO (readFile fp `catch` (\(_e :: IOException) -> pure ""))
Contributor

readFileSafe is only ever called inside IO.
There is no need for it to be a pure function using unsafePerformIO - and then naming it "safe" ;)

Contributor Author

Done
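
A minimal sketch of an `IO` version without `unsafePerformIO`. The name `readFileOrEmpty` is a suggestion only; the PR may have kept the original name.

```haskell
import Control.Exception (IOException, catch)

-- Read a file for quick text parsing; if it cannot be opened
-- (for example because it does not exist), return the empty string.
-- Note: readFile is lazy, so only errors raised while opening the
-- file are caught here.
readFileOrEmpty :: FilePath -> IO String
readFileOrEmpty fp = readFile fp `catch` handler
  where
    handler :: IOException -> IO String
    handler _ = pure ""
```

Call sites that previously wrote `let s = readFileSafe fp` would instead bind the result with `s <- readFileOrEmpty fp`.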

collectTargets :: [FilePath] -> IO [FilePath]
collectTargets roots = do
  files <- concat <$> mapM listHsFiles roots
  catMaybes <$> mapM (\fp -> do
Contributor

Can you use filterM?

Contributor Author

Done
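
A sketch of the `filterM` shape being suggested. The body of the lambda is cut off in the excerpt, so the predicate below (`isTarget`, which looks for a line starting with `instance LogFormatting`) is a made-up placeholder for whatever check the PR actually performs.

```haskell
import Control.Monad (filterM)
import Data.List (isPrefixOf)

-- Hypothetical predicate: treat a file as a target if any line starts
-- with "instance LogFormatting". The real check is not shown above.
isTarget :: FilePath -> IO Bool
isTarget fp = do
  contents <- readFile fp
  pure (any ("instance LogFormatting" `isPrefixOf`) (lines contents))

-- Keep only the files for which the monadic predicate holds.
selectTargets :: [FilePath] -> IO [FilePath]
selectTargets = filterM isTarget
```

This replaces the `catMaybes <$> mapM (\fp -> ... Just fp / Nothing)` pattern with a single `filterM` over an `IO Bool` predicate.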

(x:_) -> x

isInfix :: String -> String -> Bool
isInfix needle hay = T.isInfixOf (T.pack needle) (T.pack hay)
Contributor

Why not just use Data.List.isInfixOf?

Contributor Author

Done
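
For illustration, `Data.List.isInfixOf` works directly on `String`, so the round trip through `T.pack` is unnecessary. `linesMentioning` is a hypothetical helper, not code from the PR.

```haskell
import Data.List (isInfixOf)

-- Hypothetical helper: keep the lines of a text that contain the needle.
linesMentioning :: String -> String -> [String]
linesMentioning needle = filter (needle `isInfixOf`) . lines
```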

    then pure v
    else do
      bs <- BL.readFile histOut
      case A.decode bs of
Contributor

-> A.decodeFile'

Contributor Author

I first tried A.decodeFile' to match the suggestion, but this repo’s Aeson version does not export it, so A.decodeFileStrict' is the compatible file-based replacement here.
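
A minimal sketch of the file-based variant mentioned in the reply. The excerpt does not show what happens when decoding fails, so falling back to a caller-supplied default is an assumption; `loadOrDefault` is a hypothetical name.

```haskell
import qualified Data.Aeson as A
import Data.Maybe (fromMaybe)

-- Decode a JSON file strictly; if the contents are not valid JSON,
-- return the supplied default instead. A missing file still raises an
-- IOException, since only decoding failures are mapped to Nothing.
loadOrDefault :: A.Value -> FilePath -> IO A.Value
loadOrDefault def fp = fromMaybe def <$> A.decodeFileStrict' fp
```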

[] -> s''

stripPrefix :: String -> String -> Maybe String
stripPrefix pre s = if pre `isPrefixOf` s then Just (drop (length pre) s) else Nothing
Contributor

Data.List.stripPrefix?

Contributor Author

Done
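
As a quick illustration, `Data.List.stripPrefix` has the same type as the local definition and composes well with `mapMaybe`. `underPrefix` is a hypothetical helper, and the namespace strings in the usage note are made up.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Hypothetical helper: keep the entries that start with the given
-- prefix and return what follows the prefix.
underPrefix :: String -> [String] -> [String]
underPrefix pre = mapMaybe (stripPrefix pre)
```

For example, `underPrefix "Net." ["Net.Peers", "ChainDB.Add", "Net.Mux"]` evaluates to `["Peers", "Mux"]`.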
