
bloblang(v2): Add V2 language runtime and migration path#436

Open
Jeffail wants to merge 20 commits into main from bloblang-v2



@Jeffail
Collaborator

Jeffail commented May 19, 2026

Summary

Introduces Bloblang V2 — a redesigned mapping language — as a parallel runtime alongside the existing V1 implementation, together with a V1 → V2 migration path for both individual mappings and full Benthos YAML configs.

V2 is reached through opt-in surfaces (bloblang_v2 / bloblang_v2_file processors, a separate V2 plugin registry, dedicated lint rules). V1 remains untouched: no V1 API is removed, no V1 processor is renamed, and no schema-level changes affect V1-only pipelines.

The branch is organised into 20 topical commits — reviewers can navigate by subsystem rather than by chronological development.

What's included

  • Language specification — a 13-chapter design document under internal/bloblang2/spec/, the source of truth for both runtimes.
  • Go runtime — scanner, parser, AST, optimizer, name resolver, pretty-printer, tree-walking interpreter, and standard library under internal/bloblang2/go/pratt/.
  • TypeScript runtime — full port (scanner → interpreter → stdlib) under internal/bloblang2/ts/, passing the same spec conformance corpus as the Go runtime.
  • Spec conformance corpus — 132 YAML test files covering access, control flow, error handling, imports, lambdas, maps, operators, the standard library, and edge cases. Both runtimes are gated on the full corpus.
  • Editor support — an LSP server with diagnostics and completions, a Neovim plugin, a tree-sitter grammar, and a Go-served demo web playground with browser-side execution via the TypeScript runtime.
  • V1 reference + migrator — a V1 reference specification, a V1 parser, a V1 → V2 translator that surfaces semantic divergences as flagged SemanticChange entries rather than silently rewriting, and a corpus-wide migration benchmark.
  • Public API — public/bloblangv2 (plugin registration, parsing, execution) and public/service/migrator (config-level migration with diamond-import handling, mixed V1/V2 configs, and partial-failure reporting).
  • V1 stdlib parity ports — V2 implementations of the V1 standard library, registered against the global V2 environment under internal/impl/{pure,io}/bloblangv2_*.go.
  • Service-framework wiring — V2 environment threaded through internal/manager, internal/bundle, the CLI, studio sync, the config schema, and the public/service surface (Environment, StreamBuilder, ResourceBuilder, lint, schema).

Coexistence with V1

V2 and V1 maintain separate plugin registries; plugins registered against one are not visible to the other. Host components select a language per field — a bloblang field uses V1 and is linted via the V1 path; a bloblang_v2 field uses V2 and is linted via LintBloblangV2Mapping. The two processor types operate side by side in the same pipeline.
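To make the isolation concrete, here is a minimal, self-contained sketch of two separate registries with per-field language selection. It does not use the real public/bloblangv2 API: `Registry`, `registryFor`, and the `shout` method are hypothetical names invented for illustration, and only the field names `bloblang` and `bloblang_v2` come from the description above.

```go
package main

import "fmt"

// Registry is a hypothetical stand-in for a per-language plugin registry.
// It is not the real V1 or V2 environment type.
type Registry struct {
	name    string
	methods map[string]func(string) string
}

func NewRegistry(name string) *Registry {
	return &Registry{name: name, methods: map[string]func(string) string{}}
}

// Register adds a method to this registry only.
func (r *Registry) Register(method string, fn func(string) string) {
	r.methods[method] = fn
}

// Lookup reports whether the method is visible in this registry.
func (r *Registry) Lookup(method string) (func(string) string, bool) {
	fn, ok := r.methods[method]
	return fn, ok
}

// registryFor picks a registry from the config field name, mirroring the
// per-field language selection described above.
func registryFor(field string, v1, v2 *Registry) (*Registry, error) {
	switch field {
	case "bloblang":
		return v1, nil
	case "bloblang_v2":
		return v2, nil
	}
	return nil, fmt.Errorf("field %q does not hold a mapping", field)
}

func main() {
	v1, v2 := NewRegistry("v1"), NewRegistry("v2")
	// "shout" is an invented plugin registered against V2 only.
	v2.Register("shout", func(s string) string { return s + "!" })

	for _, field := range []string{"bloblang", "bloblang_v2"} {
		reg, err := registryFor(field, v1, v2)
		if err != nil {
			panic(err)
		}
		_, ok := reg.Lookup("shout")
		fmt.Printf("%s -> registry %s, shout visible: %v\n", field, reg.name, ok)
	}
}
```

Because nothing is shared between the two maps, a plugin has to be registered twice to be usable from both languages, which is the gap a V1 ↔ V2 plugin bridge would close.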

One known gap: interpolated string fields (the ${! ... } form) still dispatch through the V1 environment only. This and other deferred items (custom V2 lint surface, V1 ↔ V2 plugin bridge) are tracked in internal/bloblang2/REMAINING.md.

For documentation authors

The full V2 language specification lives at internal/bloblang2/spec/, organised into 13 chapters covering lexical structure, type system, expressions, control flow, maps, imports, the execution model, error handling, special features, formal grammar, common patterns, an implementation guide, and the standard library reference. The spec README provides a guided table of contents and is the recommended entry point.

Per-method V1 → V2 status, including any semantic shifts (e.g. variadic arguments folded into arrays, the new error-object shape, deferred batch-3 items), is tracked in internal/bloblang2/PARITY.md.

Jeffail added 20 commits May 19, 2026 09:49
Adds the design specification for Bloblang V2 under
internal/bloblang2/spec/, split across thirteen numbered chapters
(overview, type system, expressions, control flow, maps, imports,
execution model, error handling, special features, grammar, common
patterns, implementation guide, standard library) plus a top-level
PROPOSAL.md and README.md.

The spec is the source of truth for the Go and TypeScript runtimes
and the V1 -> V2 migrator that follow.
Adds the syntax half of the Go runtime under
internal/bloblang2/go/pratt/syntax/: a hand-written scanner, a Pratt
parser producing an AST, a name resolver, a post-parse optimizer
pass, and a V2 pretty-printer used by the migrator's output stage.

Includes parser and scanner unit tests plus go-fuzz harnesses
(FuzzParse, FuzzScan) with a seed corpus.
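
For readers unfamiliar with the technique, the following is a generic, minimal Pratt parser for integer arithmetic. It is not taken from internal/bloblang2/go/pratt/syntax/: the token set, the binding powers, and the S-expression output are all assumptions chosen to keep the sketch short, and error handling is omitted.

```go
package main

import (
	"fmt"
	"unicode"
)

// tokenize splits ASCII input into integer, operator, and parenthesis tokens.
func tokenize(src string) []string {
	var toks []string
	for i := 0; i < len(src); {
		c := rune(src[i])
		switch {
		case unicode.IsSpace(c):
			i++
		case unicode.IsDigit(c):
			j := i
			for j < len(src) && unicode.IsDigit(rune(src[j])) {
				j++
			}
			toks = append(toks, src[i:j])
			i = j
		default:
			toks = append(toks, string(c))
			i++
		}
	}
	return toks
}

// bindingPower maps each infix operator to its precedence (assumed values).
var bindingPower = map[string]int{"+": 10, "-": 10, "*": 20, "/": 20}

type parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// next returns the current token and advances.
func (p *parser) next() string {
	t := p.peek()
	p.pos++
	return t
}

// parseExpr parses an expression made of operators binding tighter than minBP
// and returns it as an S-expression string.
func (p *parser) parseExpr(minBP int) string {
	// Prefix position: a number, a parenthesised group, or unary minus.
	left := p.next()
	switch left {
	case "(":
		left = p.parseExpr(0)
		p.next() // consume ")"
	case "-":
		left = "(neg " + p.parseExpr(30) + ")"
	}
	// Infix loop: keep folding operators that bind tighter than minBP.
	for {
		op := p.peek()
		bp, ok := bindingPower[op]
		if !ok || bp <= minBP {
			return left
		}
		p.next()
		// Passing the operator's own power on the right gives left associativity.
		right := p.parseExpr(bp)
		left = "(" + op + " " + left + " " + right + ")"
	}
}

func parse(src string) string {
	p := &parser{toks: tokenize(src)}
	return p.parseExpr(0)
}

func main() {
	fmt.Println(parse("1 + 2 * 3"))
	fmt.Println(parse("1 - 2 - 3"))
}
```

The single `parseExpr` loop replaces one grammar function per precedence level, which is why adding an operator to a Pratt parser is usually a one-line table change.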
Adds the runtime half of the Go implementation under
internal/bloblang2/go/pratt/eval/: a tree-walking interpreter with a
small opcode dispatch path for methods and functions, a variable
stack with slot allocation handled by the resolver, message-context
support, and the V2 standard library covering arithmetic,
collections, string handling, lambdas, strftime, and message-coupled
operations.

Tested against the eval-side unit suite (interp, stdlib, strftime,
argument folding). Spec conformance is exercised separately once the
spectest runner and corpus land.
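
As a rough illustration of resolver-driven slot allocation, the sketch below assigns each variable name a fixed index ahead of execution so that the run-time store is a flat slice rather than a map. `Resolver`, `Frame`, and their methods are hypothetical names, and the real eval package uses a variable stack rather than the single frame shown here.

```go
package main

import "fmt"

// Resolver assigns each distinct variable name a fixed slot index ahead of
// execution, so the interpreter never looks names up by string at run time.
// This is an illustrative type, not the one in the PR.
type Resolver struct {
	slots map[string]int
}

func NewResolver() *Resolver { return &Resolver{slots: map[string]int{}} }

// Slot returns the index for name, allocating the next free slot on first use.
func (r *Resolver) Slot(name string) int {
	if idx, ok := r.slots[name]; ok {
		return idx
	}
	idx := len(r.slots)
	r.slots[name] = idx
	return idx
}

// Count is the number of slots a frame needs.
func (r *Resolver) Count() int { return len(r.slots) }

// Frame is the run-time variable storage: a flat slice indexed by slot.
type Frame struct{ vars []any }

func NewFrame(n int) *Frame { return &Frame{vars: make([]any, n)} }

func (f *Frame) Set(slot int, v any) { f.vars[slot] = v }

func (f *Frame) Get(slot int) any { return f.vars[slot] }

func main() {
	r := NewResolver()
	total, count := r.Slot("total"), r.Slot("count")
	again := r.Slot("total") // a second reference reuses the same slot

	f := NewFrame(r.Count())
	f.Set(total, 42)
	f.Set(count, 2)
	fmt.Println(total, count, again, f.Get(again))
}
```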
Adds internal/bloblang2/go/spectest/, a runner that reads the YAML
spec test corpus, executes each case against a configurable
interpreter, and produces structured pass/fail reports.

Provides the schema for spec tests, typed-value support so YAML can
carry V2-typed inputs/outputs, and a compare layer that distinguishes
exact-equality assertions from error-shape assertions. The runner is
shared by both the Go interpreter tests and (via the TypeScript port)
the TS interpreter tests.
Adds internal/bloblang2/spec/tests/, the YAML corpus of conformance
cases that anchors both runtimes to the V2 spec. Tests are organised
by topic: access, case_studies, control_flow, edge_cases,
error_handling, imports, input_output, lambdas, maps, operators,
optimizations, stdlib, types, variables.

Each case is executed via the spectest runner; both the Go and TS
runtimes are required to pass the full corpus.
Adds the top-level internal/bloblang2 entry point: bloblang2.go
exposes the runtime to other internal packages, benchmark_test.go
provides a corpus-wide V1 vs V2 benchmark harness, and the Taskfile
ties together Go, TypeScript, and tree-sitter tests behind a
unified action-first surface (build, test, demo, clean).

Also includes the PARITY.md V1 method tracking table, the package
README, and REMAINING.md.
Adds an LSP server under internal/bloblang2/go/lsp/ (with a small
bloblang2-lsp binary) that exposes the V2 parser/resolver as
diagnostics and completions over the standard JSON-RPC protocol.

Ships a matching Neovim plugin under internal/bloblang2/plugins/nvim/
that wires .blobl2 filetype detection, syntax highlighting via the
tree-sitter grammar, and the LSP client.
Adds internal/bloblang2/tree-sitter/, a tree-sitter grammar for V2
syntax with corpus tests, plus the syntax highlighting query used by
the Neovim plugin.

Also adds internal/bloblang2/demo/, a small Go-served web playground
with a Monaco-style editor, a case-study dropdown of real-world
mappings, an engine selector for switching between server-side and
browser-side execution (the latter via the TypeScript runtime), and
syntax highlighting backed by tree-sitter.
Adds internal/bloblang2/ts/, a full TypeScript port of the V2
runtime: scanner, parser, resolver, optimizer, tree-walking
interpreter, value system, and standard library. Bundles to a
browser-loadable script that powers the demo's client-side engine.

The TypeScript runtime passes the same spec conformance corpus
(internal/bloblang2/spec/tests/) as the Go runtime, exercised via a
small spectest harness layered over the same YAML schema.
Adds internal/bloblang2/migrator/bloblang_v1_spec.md, a reference
specification for Bloblang V1 derived from the existing V1
implementation and tightened via adversarial review and a
test-driven verification pass. This is the source-of-truth document
the migrator's translator rules are written against.

Adds internal/bloblang2/migrator/v1spec/, a V1 conformance test
corpus organised by topic (access, case_studies, control_flow,
edge_cases, error_handling, imports, input_output, lambdas, maps,
operators, optimizations, stdlib, types, variables) together with a
runner that exercises the V1 corpus against internal/impl/pure for
typed-numeric coverage.
Adds internal/bloblang2/migrator/v1ast/, a hand-written scanner,
parser, and AST for Bloblang V1, plus a printer used to round-trip
V1 source through the migrator. The package preserves comments and
blank-line trivia so the V1 -> V2 translation pipeline can emit V2
source that retains the V1 author's formatting intent.

This is the V1 front-end consumed by the translator package that
follows.
Adds internal/bloblang2/migrator/translator/, the core V1 -> V2
translation pipeline. The translator consumes V1 AST produced by
v1ast, walks expressions and statements applying targeted rewrite
rules (methods, imports, control flow, lambdas, mapping mode), and
emits V2 source via the syntax printer while preserving trivia.

The package layers its rules so behaviour-equivalent rewrites are
applied directly and known V1 -> V2 semantic divergences are flagged
as SemanticChange entries on the change report rather than silently
rewritten. A corpus regression test plus per-rule, property, and
spec-coverage tests pin behaviour.
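
The split between direct rewrites and flagged divergences might be modelled along the lines of the sketch below. Only the name `SemanticChange` comes from this PR; the `Change` struct, the rule table, and the method names (`legacy_trim`, `lossy_round`, `plain`) are hypothetical and stand in for the real translator rules.

```go
package main

import "fmt"

// ChangeKind separates safe rewrites from rewrites that need human review.
type ChangeKind int

const (
	Rewrite        ChangeKind = iota // behaviour-equivalent rewrite
	SemanticChange                   // behaviour may differ between V1 and V2
)

func (k ChangeKind) String() string {
	if k == SemanticChange {
		return "SemanticChange"
	}
	return "Rewrite"
}

// Change is one entry on the (hypothetical) change report.
type Change struct {
	Kind   ChangeKind
	Method string
	Note   string
}

// rule describes how one V1 method name is handled.
type rule struct {
	v2Name    string
	divergent bool
	note      string
}

// rules is an invented table; the method names are not real Bloblang methods.
var rules = map[string]rule{
	"legacy_trim": {v2Name: "trim_v2"},
	"lossy_round": {v2Name: "lossy_round", divergent: true, note: "rounding mode differs"},
}

// Translate maps V1 method names to V2 names and reports every change made.
// Divergent rules are still applied, but flagged rather than applied silently.
func Translate(methods []string) ([]string, []Change) {
	out := make([]string, 0, len(methods))
	var report []Change
	for _, m := range methods {
		r, ok := rules[m]
		if !ok {
			out = append(out, m) // no rule: carried over unchanged
			continue
		}
		out = append(out, r.v2Name)
		switch {
		case r.divergent:
			report = append(report, Change{Kind: SemanticChange, Method: m, Note: r.note})
		case r.v2Name != m:
			report = append(report, Change{Kind: Rewrite, Method: m, Note: "renamed to " + r.v2Name})
		}
	}
	return out, report
}

func main() {
	out, report := Translate([]string{"legacy_trim", "plain", "lossy_round"})
	fmt.Println(out)
	for _, c := range report {
		fmt.Printf("%s: %s (%s)\n", c.Kind, c.Method, c.Note)
	}
}
```

Keeping the divergence in the report rather than in a log line lets a caller fail a migration, or ask for review, whenever any `SemanticChange` entry is present.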
Adds internal/bloblang2/migrator/benchmark/, a corpus-wide V1 -> V2
migration benchmark suite with a coverage probe and migration smoke
test that quantifies translator coverage and runtime parity against
a curated V1 corpus.

Adds internal/bloblang2/migrator/demo/, a small Go-served web
playground that wires the migrator behind a UI with real V1 case
studies. Useful for eyeballing translator output and for showing the
behaviour of flagged SemanticChange entries.
Adds public/bloblangv2/, the exposed Go surface for V2: Environment,
Executor, MessageContext, plugin registration for methods and
functions with parse-time argument folding, ParseError plumbing,
parameter and spec types, and a View layer over the schema for
external consumers.

The package is the integration point used by public/service (and any
downstream Benthos plugin) to register V2 plugins, parse mappings,
and execute them against message contexts.
Adds public/bloblangv2/migrator/, the public-facing wrapper around
the internal translator. It exposes a stable API for migrating a
single Bloblang mapping (or a set of mappings sharing an import
graph) from V1 to V2, surfaces structured Change entries (rewrites
and flagged SemanticChange divergences), and lets callers register
extra rules via the Options surface for plugin-specific rewrites.

This is the layer consumed by the upcoming public/service config
migrator, and by external tools that just need mapping-level
translation without pulling in internal/.
Threads public/bloblangv2 through the Benthos framework so V2
mappings can be parsed, linted, and executed alongside V1 in
existing pipelines:

  - internal/bundle, internal/manager: NewManagement now carries a
    BloblV2Environment alongside the V1 BloblEnvironment, with a
    matching OptSetBloblV2Environment option.
  - internal/docs: LintBloblangV2Mapping lints a field as V2 without
    side effects, using the configured V2 environment.
  - internal/cli, internal/cli/studio, internal/cli/test,
    internal/config/schema, internal/stream/manager: V2 envs are
    plumbed through CLI, studio sync, config schema enumeration,
    and the stream manager.
  - public/service: a bloblangv2 batch processor, a
    config_bloblangv2 field type, schema and linter integration,
    plus the corresponding additions to Environment, StreamBuilder,
    ResourceBuilder, StreamConfigLinter, and ComponentConfigLinter.

Adds config/test/bloblang/ YAML fixtures covering the V2 batch
processor's golden path, filter behaviour, and metadata reset
semantics.
Adds internal/impl/pure/processor_bloblang_v2.go and
processor_bloblang_v2_file.go, the runtime processors that execute
V2 mappings inside a Benthos pipeline. The _v2 processor takes a
mapping inline; the _file processor reads it from disk, which is
how the public/service config migrator emits long mappings.

Both register against the public/service registry and parse via the
configured public/bloblangv2 environment, so they pick up plugin
methods and functions registered elsewhere in the binary (including
the V1 stdlib parity ports).
Adds V2 implementations of V1 stdlib methods and functions,
registered against the global public/bloblangv2 environment via
init() side-effects. Pure (deterministic) helpers live under
internal/impl/pure/bloblangv2_*.go; impure (random, time-based)
helpers live under internal/impl/io/bloblangv2_*.go.

Coverage:

  - pure: arrays, crypto (AES, hashes), encoding (JSON, YAML, CSV,
    base64, JSON schema, URLs), numbers (bitwise, log family, min,
    max), objects, parsing (parse_json, format_json, escapes),
    regex (replace, replace_many, find/find_all variants), string
    (case, trim, replace, hash, filepath, uuid_v5), time
    (deterministic timestamp formatting and arithmetic).
  - io: ids (uuid_v7, ksuid, nanoid), time (timestamp_unix variants
    that touch the wall clock).

Each port mirrors the V1 method or function signature so the V1 ->
V2 translator can rewrite call sites with no semantic change. The
internal/bloblang2/PARITY.md table tracks remaining gaps.
Adds public/service/migrator/, a config-level migrator that walks
parsed Benthos YAML configurations, locates bloblang fields and the
bloblang processor (including instances nested inside switch,
branch, processor_resources, and cache_resources), and applies the
public/bloblangv2 mapping migrator to each.

Includes a lazy import resolver so transitively-imported V1 mapping
files are translated once and rewritten in place, a from-only rule
that guards against silently broken rewrites, support for emitting
the bloblang_v2_file processor for file-backed mappings, a
structured report API surfacing per-field outcomes, and pluggable
rules for plugin-specific rewrites. Integration tests exercise
multi-config migration, diamond imports, mixed V1/V2 configs, and
partial-failure behaviour.
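
The "translated once" property for diamond imports can be sketched with a memoised depth-first walk over the import graph. This is an assumption about the general approach, not the actual resolver in public/service/migrator/; the `Resolver` type and the `.blobl` file names are invented for the example.

```go
package main

import "fmt"

// Resolver translates imported files on demand and remembers which files it
// has already handled. It is an illustrative type, not the PR's resolver.
type Resolver struct {
	imports map[string][]string // file -> files it imports
	done    map[string]bool
	Order   []string // files in the order they were translated
}

func NewResolver(imports map[string][]string) *Resolver {
	return &Resolver{imports: imports, done: map[string]bool{}}
}

// Translate handles file and everything it transitively imports, visiting
// each file at most once.
func (r *Resolver) Translate(file string) {
	if r.done[file] {
		return
	}
	r.done[file] = true // mark before recursing so import cycles terminate
	for _, dep := range r.imports[file] {
		r.Translate(dep)
	}
	// Stand-in for the real V1 -> V2 translation of this file.
	r.Order = append(r.Order, file)
}

func main() {
	r := NewResolver(map[string][]string{
		"a.blobl":      {"b.blobl", "c.blobl"},
		"b.blobl":      {"d.blobl"},
		"c.blobl":      {"d.blobl"}, // diamond: d is reachable via b and c
		"unused.blobl": {"d.blobl"}, // never requested, so never translated
	})
	r.Translate("a.blobl")
	fmt.Println(r.Order)
}
```

Here `d.blobl` appears once in `Order` even though two files import it, and `unused.blobl` is skipped because nothing reachable from the requested file imports it.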
Adds the google/uuid dependency required by the V2 standard library
(uuid_v5).

Restructures taskfiles/test.yml so the default unit and unit-race
tasks pass -short and skip long-running corpus and benchmark tests,
keeping the per-PR loop under a minute. Adds a new unit-full task
(alias ut-full) with a longer timeout that runs the full suite,
including the migrator corpus, fuzz seeds, and benchmark smoke
tests.
@mihaitodor
Contributor

That's cool! Curious to play with the language server.

Will this also have a mutation processor if users want to avoid copying the message? Also, will there be an easy way to rename some keys in objects inside an array? I can't recall a good example now, but that has popped up from time to time.

And, a small pet peeve: do you think you could shoehorn bitwise math operators in there? :)

@Jeffail
Collaborator Author

Jeffail commented May 19, 2026

> That's cool! Curious to play with the language server.
>
> Will this also have a mutation processor if users want to avoid copying the message? Also, will there be an easy way to rename some keys in objects inside an array? I can't recall a good example now, but that has popped up from time to time.
>
> And, a small pet peeve: do you think you could shoehorn bitwise math operators in there? :)

I think instead of mutation I'm just going to work on a follow-up suite of optimizations so that the user doesn't need to choose between the two modes. Most of the performance gain of mutation can be achieved with better copy-on-write (COW) mechanisms, which we're already playing with under the hood in v2. The other stuff can also come later. I'm not 100000% sure on the bitwise stuff, but because we no longer have the pipe operator for coalescing it's possible.
