Skip to content

feat: add rulebound generate and validate commands#3

Open
armstrongl wants to merge 6 commits intomainfrom
feat/generate-validate-commands
Open

feat: add rulebound generate and validate commands#3
armstrongl wants to merge 6 commits intomainfrom
feat/generate-validate-commands

Conversation

@armstrongl
Copy link
Copy Markdown
Owner

@armstrongl armstrongl commented Apr 11, 2026

Summary

  • rulebound generate <file.md> converts structured Markdown (YAML frontmatter + vale-* fenced code blocks) into Vale-compatible YAML rule files
  • rulebound validate <file.yml> checks Vale YAML rule files for structural errors without requiring Hugo
  • Supports substitution, existence, occurrence, and capitalization rule types (v1 scope)
  • New internal/mdgen/ package with order-preserving YAML emission via yaml.Node
  • New validation functions in internal/parser/ with comprehensive error collection

Key decisions

  • yaml.Node tree for deterministic field and swap-entry ordering (R8)
  • map[string]interface{} for validation to avoid Go zero-value ambiguity on int fields
  • Shared ExtractFrontmatter and IsKnownField exports from parser package (no duplication)
  • exitError moved to cmd/root.go for all commands to share

Test plan

  • 20 parser tests (happy paths, edge cases, error paths for all 4 rule types)
  • 15 emitter tests (semantic, golden-file, round-trip, swap order verification)
  • 7 generate command tests (default output, explicit path, stdout, force, errors)
  • 6 validate command tests (valid, invalid, mixed, nonexistent, generate-then-validate integration)
  • 20 validation logic tests (all rule types, multi-error, zero-value edge cases, real Microsoft rules)
  • go test ./... all packages pass
  • go vet ./... clean
  • End-to-end: rulebound generate testdata/substitution.md | rulebound validate - round-trip verified

Post-Deploy Monitoring & Validation

No additional operational monitoring required — these are new CLI commands with no runtime/server component. Validation is that the commands appear in rulebound --help and produce correct output on the testdata files.

Both generate and validate commands need exitError. Moving it next to
the exit code constants and Execute() that consumes it.
ExtractFrontmatter provides shared frontmatter extraction for the new
mdgen package. IsKnownField exposes the Vale field whitelist so mdgen
can warn on unknown frontmatter fields without duplicating the map.
New internal/mdgen package converts structured Markdown files (YAML
frontmatter + vale-* fenced code blocks) into Vale-compatible YAML.

- ParseMarkdown extracts frontmatter, vale-swap/tokens/exceptions blocks
- EmitYAML builds yaml.Node tree for deterministic field and swap ordering
- Supports substitution, existence, occurrence, capitalization rule types
- Unknown frontmatter fields warn and pass through; meta: block stripped
- Comprehensive test coverage with golden files for all four rule types
Validates Vale rule files without Hugo: checks extends, message, level,
and type-specific fields (swap, tokens, max/min/token, match). Parses
into map[string]interface{} for reliable field-presence detection.
Collects all errors rather than failing on first.
rulebound generate converts structured Markdown to Vale YAML with
--output and --force flags. rulebound validate checks YAML rule files
for structural errors without requiring Hugo.

Includes generate-then-validate integration test proving the two
commands work together end-to-end.
@semanticdiff-com
Copy link
Copy Markdown

semanticdiff-com Bot commented Apr 11, 2026

Review changes with  SemanticDiff

Changed Files
File Status
  cmd/build.go  0% smaller
  cmd/generate.go  0% smaller
  cmd/generate_test.go  0% smaller
  cmd/root.go  0% smaller
  cmd/validate.go  0% smaller
  cmd/validate_test.go  0% smaller
  internal/mdgen/emit.go  0% smaller
  internal/mdgen/emit_test.go  0% smaller
  internal/mdgen/parse.go  0% smaller
  internal/mdgen/parse_test.go  0% smaller
  internal/mdgen/testdata/capitalization-missing-match.md Unsupported file format
  internal/mdgen/testdata/capitalization.md Unsupported file format
  internal/mdgen/testdata/crlf.md Unsupported file format
  internal/mdgen/testdata/duplicate-swap-blocks.md Unsupported file format
  internal/mdgen/testdata/empty-message.md Unsupported file format
  internal/mdgen/testdata/empty-swap.md Unsupported file format
  internal/mdgen/testdata/existence.md Unsupported file format
  internal/mdgen/testdata/expected/capitalization.yml  0% smaller
  internal/mdgen/testdata/expected/existence.yml  0% smaller
  internal/mdgen/testdata/expected/occurrence.yml  0% smaller
  internal/mdgen/testdata/expected/substitution.yml  0% smaller
  internal/mdgen/testdata/meta-block.md Unsupported file format
  internal/mdgen/testdata/missing-extends.md Unsupported file format
  internal/mdgen/testdata/missing-message.md Unsupported file format
  internal/mdgen/testdata/missing-swap.md Unsupported file format
  internal/mdgen/testdata/missing-tokens.md Unsupported file format
  internal/mdgen/testdata/no-frontmatter.md Unsupported file format
  internal/mdgen/testdata/no-level.md Unsupported file format
  internal/mdgen/testdata/occurrence-missing-token.md Unsupported file format
  internal/mdgen/testdata/occurrence.md Unsupported file format
  internal/mdgen/testdata/quoted-swap-keys.md Unsupported file format
  internal/mdgen/testdata/substitution.md Unsupported file format
  internal/mdgen/testdata/unknown-field.md Unsupported file format
  internal/mdgen/testdata/unsupported-extends.md Unsupported file format
  internal/parser/frontmatter.go  0% smaller
  internal/parser/parser.go  0% smaller
  internal/parser/testdata/validate/invalid-extends.yml  0% smaller
  internal/parser/testdata/validate/invalid-level.yml  0% smaller
  internal/parser/testdata/validate/missing-message.yml  0% smaller
  internal/parser/testdata/validate/missing-swap.yml  0% smaller
  internal/parser/testdata/validate/multiple-errors.yml  0% smaller
  internal/parser/testdata/validate/occurrence-max-only.yml  0% smaller
  internal/parser/testdata/validate/occurrence-min-zero.yml  0% smaller
  internal/parser/testdata/validate/valid-capitalization.yml  0% smaller
  internal/parser/testdata/validate/valid-existence.yml  0% smaller
  internal/parser/testdata/validate/valid-occurrence.yml  0% smaller
  internal/parser/testdata/validate/valid-substitution.yml  0% smaller
  internal/parser/validate.go  0% smaller
  internal/parser/validate_test.go  0% smaller

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds new CLI capabilities to rulebound for (1) generating Vale rule YAML from structured Markdown and (2) validating Vale rule YAML structurally without Hugo, supported by new parsing/emitting/validation utilities and extensive tests.

Changes:

  • Introduces rulebound generate <file.md> backed by internal/mdgen (frontmatter + vale-* fenced blocks → deterministic YAML emission).
  • Introduces rulebound validate <file.yml> backed by internal/parser/validate.go (structural checks + aggregated errors).
  • Exports shared parser helpers (ExtractFrontmatter, IsKnownField) and centralizes exitError for consistent CLI exits.

Reviewed changes

Copilot reviewed 49 out of 49 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
internal/parser/validate.go New structural validator for Vale YAML rules (v1 supported extends).
internal/parser/validate_test.go Unit tests for YAML rule validation, including multi-error cases and real rules.
internal/parser/testdata/validate/valid-substitution.yml Validation fixture: valid substitution rule.
internal/parser/testdata/validate/valid-occurrence.yml Validation fixture: valid occurrence rule.
internal/parser/testdata/validate/valid-existence.yml Validation fixture: valid existence rule.
internal/parser/testdata/validate/valid-capitalization.yml Validation fixture: valid capitalization rule.
internal/parser/testdata/validate/occurrence-min-zero.yml Validation fixture: occurrence with min: 0.
internal/parser/testdata/validate/occurrence-max-only.yml Validation fixture: occurrence with only max.
internal/parser/testdata/validate/multiple-errors.yml Validation fixture: triggers multiple validation errors.
internal/parser/testdata/validate/missing-swap.yml Validation fixture: missing substitution swap.
internal/parser/testdata/validate/missing-message.yml Validation fixture: missing message.
internal/parser/testdata/validate/invalid-level.yml Validation fixture: invalid level.
internal/parser/testdata/validate/invalid-extends.yml Validation fixture: unsupported extends.
internal/parser/parser.go Exposes IsKnownField for reuse by mdgen.
internal/parser/frontmatter.go Adds shared Markdown frontmatter extraction helper.
internal/mdgen/testdata/unsupported-extends.md Markdown fixture: unsupported extends type.
internal/mdgen/testdata/unknown-field.md Markdown fixture: unknown frontmatter key passthrough.
internal/mdgen/testdata/substitution.md Markdown fixture: substitution rule with vale-swap.
internal/mdgen/testdata/quoted-swap-keys.md Markdown fixture: quoted swap keys.
internal/mdgen/testdata/occurrence.md Markdown fixture: occurrence rule frontmatter.
internal/mdgen/testdata/occurrence-missing-token.md Markdown fixture: missing token for occurrence.
internal/mdgen/testdata/no-level.md Markdown fixture: default-level behavior.
internal/mdgen/testdata/no-frontmatter.md Markdown fixture: missing frontmatter error case.
internal/mdgen/testdata/missing-tokens.md Markdown fixture: missing vale-tokens block.
internal/mdgen/testdata/missing-swap.md Markdown fixture: missing vale-swap block.
internal/mdgen/testdata/missing-message.md Markdown fixture: missing message.
internal/mdgen/testdata/missing-extends.md Markdown fixture: missing extends.
internal/mdgen/testdata/meta-block.md Markdown fixture: meta block stripping.
internal/mdgen/testdata/expected/substitution.yml Golden output for substitution emission.
internal/mdgen/testdata/expected/occurrence.yml Golden output for occurrence emission.
internal/mdgen/testdata/expected/existence.yml Golden output for existence emission.
internal/mdgen/testdata/expected/capitalization.yml Golden output for capitalization emission.
internal/mdgen/testdata/existence.md Markdown fixture: existence rule with tokens.
internal/mdgen/testdata/empty-swap.md Markdown fixture: empty swap block edge case.
internal/mdgen/testdata/empty-message.md Markdown fixture: empty message error case.
internal/mdgen/testdata/duplicate-swap-blocks.md Markdown fixture: duplicate fenced blocks warning.
internal/mdgen/testdata/crlf.md Markdown fixture: CRLF normalization.
internal/mdgen/testdata/capitalization.md Markdown fixture: capitalization + exceptions.
internal/mdgen/testdata/capitalization-missing-match.md Markdown fixture: missing match error case.
internal/mdgen/parse.go New Markdown → RuleSource parser (frontmatter + fenced blocks).
internal/mdgen/parse_test.go Tests for Markdown parsing, warnings, and error paths.
internal/mdgen/emit.go New deterministic YAML emitter using yaml.Node.
internal/mdgen/emit_test.go Semantic, golden, and ordering tests for YAML emission.
cmd/validate.go New rulebound validate command wiring to parser validator.
cmd/validate_test.go Tests for validate command behavior and integration with generate.
cmd/root.go Moves exitError to shared location and registers new commands.
cmd/generate.go New rulebound generate command wiring to mdgen parse/emit.
cmd/generate_test.go Tests for generate command output modes and error cases.
cmd/build.go Removes duplicated exitError (now in root).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cmd/validate.go
Comment on lines +12 to +20
var validateCmd = &cobra.Command{
Use: "validate <file.yml> [file2.yml ...]",
Short: "Validate Vale YAML rule files for structural errors",
Long: `validate reads one or more Vale rule YAML files and reports structural
errors: missing required fields, invalid extends values, and rule-type-specific
field errors — without invoking Hugo.

Exit code 0 if all files are valid, 1 if any file has errors.`,
Args: cobra.MinimumNArgs(1),
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/test plan mentions supporting rulebound validate - (stdin), but this implementation always calls parser.ValidateRule(path) which reads from disk. Passing - will currently try to open a file literally named "-" and fail, breaking the documented generate→validate pipeline. Consider adding stdin support (detect path == "-" and read from os.Stdin) and/or exposing a ValidateRuleBytes/ValidateRuleReader helper in the parser package; also update the Use string to advertise - if supported.

Copilot uses AI. Check for mistakes.
Comment thread internal/mdgen/parse.go
Comment on lines +136 to +154
case parser.ExtendsSubstitution:
swapBlock, ok := blocks["vale-swap"]
if !ok {
return nil, nil, fmt.Errorf("substitution rule requires a ```vale-swap``` fenced block")
}
pairs, swapWarnings, err := parseSwapBlock(swapBlock)
if err != nil {
return nil, nil, fmt.Errorf("parsing vale-swap block: %w", err)
}
warnings = append(warnings, swapWarnings...)
src.Swap = pairs

case parser.ExtendsExistence:
tokenBlock, ok := blocks["vale-tokens"]
if !ok {
return nil, nil, fmt.Errorf("existence rule requires a ```vale-tokens``` fenced block")
}
src.Tokens = parseLineBlock(tokenBlock)

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseMarkdown allows empty vale-swap/vale-tokens blocks (e.g., empty swap returns pairs == nil, empty tokens returns empty slice). EmitYAML then omits swap/tokens when the slice is empty, which can produce YAML that ValidateRule will flag as invalid for substitution/existence. Consider treating empty required blocks as an error during parsing (or emitting an explicit empty mapping/sequence and aligning validator behavior) so generate can reliably produce valid Vale rule files.

Copilot uses AI. Check for mistakes.
Comment on lines +146 to +169
switch v := swapVal.(type) {
case map[string]interface{}:
if len(v) == 0 {
errs = append(errs, ValidationError{
Field: "swap",
Message: "substitution rule requires a non-empty 'swap' map",
Severity: "error",
})
}
case []interface{}:
if len(v) == 0 {
errs = append(errs, ValidationError{
Field: "swap",
Message: "substitution rule requires a non-empty 'swap' map",
Severity: "error",
})
}
default:
errs = append(errs, ValidationError{
Field: "swap",
Message: fmt.Sprintf("'swap' must be a mapping, got %T", swapVal),
Severity: "error",
})
}
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateSubstitution accepts swap as []interface{} but only checks that the sequence is non-empty. This means structurally invalid values like swap: [1, 2] (or a sequence of non-mappings / mappings with non-string values) will pass validation. Consider validating the sequence contents (each item should be a mapping with string keys/values) and, for mapping form, verifying all values are strings (and possibly non-empty) to match Vale’s expected swap semantics.

Copilot uses AI. Check for mistakes.
Comment thread internal/parser/validate.go Outdated
Comment on lines +212 to +228
_, hasMax := m["max"]
_, hasMin := m["min"]
if !hasMax && !hasMin {
errs = append(errs, ValidationError{
Field: "max/min",
Message: "occurrence rule requires at least one of 'max' or 'min'",
Severity: "error",
})
}

if _, hasToken := m["token"]; !hasToken {
errs = append(errs, ValidationError{
Field: "token",
Message: "occurrence rule requires a 'token' field",
Severity: "error",
})
}
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateOccurrence only checks presence of max/min and token, but doesn’t validate their types/values. As written, max: foo or token: 123 would be treated as valid. Consider validating that token is a non-empty string and that max/min are integers (and optionally enforce min <= max when both are present) to catch common structural errors.

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +30
rest := content[len(fence):]
if len(rest) == 0 || rest[0] != '\n' {
return nil, nil, fmt.Errorf("no frontmatter found")
}
rest = rest[1:] // consume newline after opening ---

idx := strings.Index(rest, "\n"+fence)
if idx == -1 {
return nil, nil, fmt.Errorf("no closing frontmatter fence")
}
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtractFrontmatter searches for the closing fence using strings.Index(rest, "\n---"). This fails for an empty frontmatter (---\n---\n...) because the closing fence begins at the start of rest, resulting in a misleading "no closing frontmatter fence" error. Consider handling the rest-prefix case (closing fence immediately) or adjusting the delimiter search to recognize fences at the start of rest and/or require fences on their own line.

Copilot uses AI. Check for mistakes.
Comment thread internal/mdgen/emit_test.go Outdated
Comment on lines +304 to +338
// ── Swap order verification ───────────────────────────────────────────────��

func TestEmitYAML_SwapOrder(t *testing.T) {
src := &mdgen.RuleSource{
Extends: parser.ExtendsSubstitution,
Message: "test",
Level: "warning",
Fields: map[string]interface{}{},
Swap: []mdgen.SwapPair{
{Key: "zebra", Value: "z"},
{Key: "apple", Value: "a"},
{Key: "mango", Value: "m"},
},
}

out, _, err := mdgen.EmitYAML(src)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}

// Verify insertion order preserved (zebra before apple before mango).
outStr := string(out)
zebraIdx := strings.Index(outStr, "zebra")
appleIdx := strings.Index(outStr, "apple")
mangoIdx := strings.Index(outStr, "mango")

if zebraIdx == -1 || appleIdx == -1 || mangoIdx == -1 {
t.Fatalf("missing swap keys in output:\n%s", outStr)
}
if zebraIdx >= appleIdx || appleIdx >= mangoIdx {
t.Errorf("swap order not preserved: zebra@%d, apple@%d, mango@%d", zebraIdx, appleIdx, mangoIdx)
}
}

// ── Helpers ──────────────────────────────────��─────────────────────────────
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are stray replacement characters in these comment separators (likely an encoding artifact). Please remove them to keep the file clean and avoid issues with editors/linters that flag unexpected Unicode characters.

Copilot uses AI. Check for mistakes.
- Handle empty frontmatter edge case in ExtractFrontmatter
- Add stdin support to validate command (validate -)
- Validate swap sequence item types and string values
- Validate occurrence token/max/min field types
- Error on empty vale-swap/vale-tokens blocks during parsing
- Export ValidateRuleBytes for non-file validation
- Fix stray Unicode replacement characters in emit_test.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants